ML Research Scientist I/II, Multimodal Data Extraction
full-time, Cambridge, $176k - $304k

Summary

Location: Cambridge
Salary: $176k - $304k
Type: full-time


About this role

Your Impact at Lila


As an ML Research Scientist - Multimodal Data Extraction, you will advance Lila’s vision of scientific superintelligence by developing foundation models that autonomously read, interpret, and structure scientific knowledge across text, images, and experimental data in the physical sciences. Your research will help unify the world’s scientific information into machine-understandable form, powering reasoning, prediction, and autonomous discovery across materials science and chemistry.


What You'll Be Building



  • Research and develop AI systems that extract and structure knowledge from diverse scientific sources.

  • Design and fine-tune large language, multimodal, and specialized models for factual, interpretable data extraction.

  • Build scalable pipelines for unstructured and heterogeneous scientific data, integrating text, tables, and visuals.

  • Collaborate with domain experts to align extracted data with real-world discovery workflows.

  • Publish research that advances the state of the art in multimodal understanding and AI-driven knowledge extraction.


What You’ll Need to Succeed



  • PhD (or equivalent research experience) in Computer Science, Chemistry, Materials Science, or a related field.

  • Expertise in machine learning, NLP, and vision–language modeling using PyTorch and Hugging Face Transformers.

  • Proven ability to train, fine-tune, and evaluate LLMs and multimodal models for scientific data extraction.

  • Strong understanding of data structures and representations used in the physical sciences.

  • Demonstrated research impact through publications, preprints, or open-source work (e.g., NeurIPS, ICLR, ICML, ACL, EMNLP, scientific journals).


Bonus Points For



  • Experience with multimodal fusion architectures and document-level understanding.

  • Knowledge of scientific document parsing (OCR, table extraction, figure-caption linking).

  • Familiarity with knowledge graph construction or reasoning systems for science.

  • Experience with noisy or heterogeneous real-world scientific data.

  • Collaborative mindset and passion for advancing AI in the physical sciences.


About Lila


Lila Sciences is the world’s first scientific superintelligence platform and autonomous lab for life, chemistry, and materials science. We are pioneering a new age of boundless discovery by building the capabilities to apply AI to every aspect of the scientific method. We are introducing scientific superintelligence to solve humankind's greatest challenges, enabling scientists to bring forth solutions in human health, climate, and sustainability at a pace and scale never experienced before. Learn more about this mission at www.lila.ai.


If this sounds like an environment you’d love to work in, even if you only have some of the experience listed above, we encourage you to apply.


Compensation


We expect the base salary for this role to fall between $176,000 and $304,000 USD per year, along with bonus potential and generous early equity. The final offer will reflect your unique background, expertise, and impact.


We’re All In


Lila Sciences is committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, or veteran status.


Information you provide during your application process will be handled in accordance with our Candidate Privacy Policy.


A Note to Agencies


Lila Sciences does not accept unsolicited resumes from any source other than candidates. The submission of unsolicited resumes by recruitment or staffing agencies to Lila Sciences or its employees is strictly prohibited unless the agency has been contacted directly by Lila Sciences' internal Talent Acquisition team. Any resume submitted by an agency in the absence of a signed agreement will automatically become the property of Lila Sciences, and Lila Sciences will not owe any referral or other fees with respect thereto.

Other facts

Tech stack
Machine Learning, NLP, Vision-Language Modeling, PyTorch, Hugging Face Transformers, Data Structures, Scientific Document Parsing, Knowledge Graph Construction, Reasoning Systems, Collaborative Mindset, AI, Multimodal Understanding, Data Extraction, Research Impact, Publication

About Lila Sciences

Lila Sciences is the world’s first scientific superintelligence platform and autonomous lab for life, chemistry, and materials science. We are building the foundation to apply AI to every aspect of the scientific method, enabling scientists to bring forth solutions in human health and sustainability at a pace and scale never experienced before.

Team size: 201-500 employees
Industry: Technology, Information and Internet

Frequently Asked Questions

What does Lila Sciences pay for an ML Research Scientist I/II, Multimodal Data Extraction?

Lila Sciences offers a competitive compensation package for the ML Research Scientist I/II, Multimodal Data Extraction role. The salary range is USD 176k - 304k per year. Apply through Clera to learn more about the full compensation details.

What does an ML Research Scientist I/II, Multimodal Data Extraction do at Lila Sciences?

As an ML Research Scientist I/II, Multimodal Data Extraction at Lila Sciences, you will research and develop AI systems that extract and structure knowledge from diverse scientific sources, and collaborate with domain experts to align extracted data with real-world discovery workflows.

Why join Lila Sciences as an ML Research Scientist I/II, Multimodal Data Extraction?

Lila Sciences is a leading Technology, Information and Internet company. The ML Research Scientist I/II, Multimodal Data Extraction role offers competitive compensation.

Is the ML Research Scientist I/II, Multimodal Data Extraction position at Lila Sciences remote?

The ML Research Scientist I/II, Multimodal Data Extraction position at Lila Sciences is based in Cambridge, Massachusetts, United States. Contact the company through Clera for specific work arrangement details.

How do I apply for the ML Research Scientist I/II, Multimodal Data Extraction position at Lila Sciences?

You can apply for the ML Research Scientist I/II, Multimodal Data Extraction position at Lila Sciences directly through Clera. Clera's AI-powered platform will help match your profile with this opportunity and guide you through the application process. You can also learn more about Lila Sciences on its website, www.lila.ai.