Ali Hashemi

Machine Learning Researcher | PhD | Optimizing Foundation Models & LLMs 🚀 | Efficient ML for Healthcare & Generative AI 🧬

Germany

500+ connections

Updated 6 months ago


11+ years of experience

About

AI researcher specialized in optimizing foundation models and LLMs for efficiency, interpretability, and scalable deployment. Experienced in model distillation, representation learning, and generative modeling for biomedical and scientific domains.

Highlights:
- Extensive expertise in model distillation, sparse autoencoders, quantization, pruning, and parameter-efficient fine-tuning (PEFT).
- Developed innovative distillation methods reducing model size and latency by up to 80% without accuracy loss.
- Strong theoretical foundation in generative models (diffusion, consistency), Bayesian statistics, and geometric/non-convex optimization, with 10+ top-tier publications (including NeurIPS).
- Applied representation learning to computational histopathology and quantum chemistry, enhancing accuracy and boosting efficiency.
- Proficient software engineer skilled in Python, PyTorch, Docker, Git, AWS, GCP, CI/CD, distributed training, and model serving (TensorRT, ONNX, vLLM).
- Successfully led international projects delivering impactful AI solutions in biomedical and multimodal domains.

My research focuses on optimizing foundation models and LLMs using advanced representation learning methods, emphasizing efficiency, interpretability, and scalable inference. Techniques include model distillation, sparse autoencoders, sparsity-induced priors, quantization, pruning, and PEFT. My theoretical expertise covers generative modeling (diffusion, consistency), Bayesian statistics, graphical models, and geometric/non-convex optimization, demonstrated through multiple publications at prestigious conferences.

Currently, I apply these methods to optimize foundation models via white-box distillation, significantly reducing model size and latency. This work directly impacts fields like computational histopathology and quantum chemistry.

Previously, I developed probabilistic graphical models and inference algorithms for neuroimaging, employing sparse Bayesian learning and geometric optimization. These innovations improved analysis precision for brain mapping, brain-computer interfaces (BCIs), and E/MEG reconstruction.

As an experienced software engineer, I specialize in Python, PyTorch, Docker, Git, AWS (SageMaker, EC2, S3), GCP, CI/CD pipelines, distributed training, and serving models using TensorRT, ONNX, and vLLM. My engineering has significantly improved efficiency in healthcare, scientific computing, and multimodal learning.

Throughout my career, I have successfully led international projects, bridging theoretical innovation with practical, scalable AI deployment.

Experience (7 roles)

Machine Learning Research Scientist

Current

BIFOLD - Berlin Institute for the Foundations of Learning and Data · Full-time

Oct 2022 - Present · 3 yrs · Berlin, Germany · On-site

- Developed advanced techniques for optimizing foundation models, including parameter-efficient fine-tuning, model distillation, and continual learning, to enhance adaptability and performance.
- Developed a subspace-based knowledge distillation approach to mitigate overfitting and spurious correlat...

Postdoctoral Research Fellow

Current

Full-time · 9 yrs 8 mos

Oct 2022 - Present · 3 yrs · On-site

ML Applied Scientist | QAI Labs | Uncertainty, Inverse Modeling and Machine Learning (UNIML)

QAI Labs · Full-time

Jun 2021 - Sep 2022 · 1 yr 4 mos · Berlin, Germany · On-site

- Created a versatile open-source software package for 3D brain source imaging, featuring standardized comparisons via custom simulations.
- Developed an efficient optimization algorithm using Riemannian geometry to update source and full-structure noise covariance along the manifold of PD matrices,...

Education (2)

Technische Universität Berlin

Grade: Summa cum laude (with Distinction)
Thesis title: “Advances in Hierarchical Bayesian Learning with Applications to Neuroimaging”
Supervisor: Prof. Dr. Klaus-Robert Müller
Co-supervisor: Prof. Dr. Stefan Haufe

Sharif University of Technology

Thesis title: “Compressed Spectrum Sensing in Cognitive Radio Networks”
Supervisor: Prof. Masoumeh Nasiri-Kenari
Co-supervisor: Prof. Massoud Babaie-Zadeh

Skills (31)

Industry Research, vLLM, SageMaker, EC2, GitHub, S3, Transformer Models, Event Management, Unsupervised Learning, Test-Driven Development, GitHub, Distributed Training, Slurm, Neuroscience, Generative Adversarial Networks (GANs), Gradient Boosting, Peer Reviews, Integration Testing, Coaching, Presentations, Research Skills, Supervised Learning, Biomedical Applications, Linear Regression, Logistic Regression, PEFT, Python, Docker, Git, AWS, ONNX

Certifications (2)

Data Visualization in R with ggplot2

Git Essential Training

Publications (2)

Resources for my Publications

Dec 11, 2020

Joint Hierarchical Bayesian Learning of Full-structure Noise for Brain Source Imaging

Medical Imaging meets NeurIPS (Med-NeurIPS) 2020 Workshop · Dec 10, 2020

Languages (2)

German (Professional working proficiency)
English (Full professional proficiency)

Honors & Awards (2)

Awarded a 4-year scholarship, Machine Learning Group, Technische Universität Berlin

Jan 2018

Associated with Technische Universität Berlin

Full Fellowship – Awarded for 3 Years, Berlin International Graduate School in Model and Simulation-based Research (BIMoS)

Issued by Technische Universität Berlin · Jan 2015

Full Fellowship for a joint and multidisciplinary Ph.D. in Computer Science and Mathematics, Technische Universität Berlin, Germany. https://www.bimos.tu-berlin.de/menue/bimos_people/phd_fellows/

Frequently Asked Questions

What is Ali Hashemi's current role?

Ali Hashemi currently works full-time as a Machine Learning Research Scientist at BIFOLD (Berlin Institute for the Foundations of Learning and Data).

Where did Ali Hashemi study?

Ali Hashemi studied for a Doctor of Philosophy (PhD) in Electrical Engineering and Computer Science (Machine Learning) and Mathematics at Technische Universität Berlin. They have 2 education entries on their profile.

What skills does Ali Hashemi have?

Ali Hashemi's top skills include Industry Research, vLLM, SageMaker, EC2, GitHub. They have 31 skills listed on their profile.

Where is Ali Hashemi based?

Ali Hashemi is based in Germany.