Data Science / AI Intern – Literature Mining & Graph Modeling

Summary

Location: Waltham, MA
Salary: $41-$48 per hour
Type: Internship

About this role

AstraZeneca is seeking Master’s and PhD students studying Biology, Computer Science, Chemistry, Physics, Engineering, Biomedical Science, Pharmacology, Data Science, Bioinformatics, or a related discipline for a 10-week internship at our site in Waltham, MA, from June 1, 2026 to August 7, 2026. This internship sits at the intersection of data engineering, biomedical NLP, and translational science, enabling faster insight generation for R&D teams.

Position Description:

  • Build an end-to-end pipeline turning literature (papers, abstracts, patents) into a standardized knowledge graph with contextualized evidence.
  • Handle source selection, inclusion/exclusion criteria, updates, and data snapshots.
  • Develop NLP for entity recognition, relation extraction, assertion detection, and context tagging (drug, indication, resistance, biomarker, outcome).
  • Encode domain relations (e.g., Drug–mechanism→Gene/Pathway; Biomarker–modulates→Outcome; ADC–targets→Antigen).
  • Map entities to controlled vocabularies; manage synonyms, disambiguation, and canonical IDs.
  • Implement edge-level confidence scoring (source quality, claim type, co-occurrence, citations, model certainty) with full evidence provenance.
  • Build graph storage (property graph or RDF) and queryable APIs.
  • Deliver interactive visualization (UI or notebook) with filters, context toggles, and evidence drill-down.
  • Define metrics, run error analyses, and validate with scientific stakeholders.
  • Ensure reproducibility and documentation: version models/data; record architecture, assumptions, benchmarks; provide user guides.
  • Present outcomes to data science, oncology, and translational medicine teams.
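
For candidates unfamiliar with the term, the edge-level confidence scoring with evidence provenance mentioned above can be pictured roughly as follows. This is only an illustrative sketch, not AstraZeneca's method: the weights, the class names (`Evidence`, `Edge`), the noisy-OR combination, and the placeholder entity and document identifiers are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical weights for the signals named in the posting (they sum to 1.0).
WEIGHTS = {
    "source_quality": 0.3,
    "claim_type": 0.2,
    "co_occurrence": 0.1,
    "citations": 0.1,
    "model_certainty": 0.3,
}


@dataclass
class Evidence:
    """One supporting passage, kept so every edge can be traced to its source."""
    doc_id: str    # placeholder identifier of the source document
    sentence: str  # the text span the relation was extracted from
    signals: dict  # signal name -> score in [0, 1]; missing signals count as 0

    def score(self) -> float:
        # Weighted sum of the per-evidence signals.
        return sum(w * self.signals.get(name, 0.0) for name, w in WEIGHTS.items())


@dataclass
class Edge:
    """A typed relation between two canonical entity IDs."""
    subject: str
    relation: str
    obj: str
    evidence: list = field(default_factory=list)

    def confidence(self) -> float:
        # Noisy-OR: the edge is doubted only if every piece of evidence is wrong.
        remaining_doubt = 1.0
        for ev in self.evidence:
            remaining_doubt *= 1.0 - ev.score()
        return 1.0 - remaining_doubt


# Placeholder entities and documents; no real biomedical claim is implied.
edge = Edge("ADC:example-adc", "targets", "Antigen:example-antigen")
edge.evidence.append(
    Evidence("doc-1", "placeholder sentence", {name: 0.5 for name in WEIGHTS})
)
edge.evidence.append(
    Evidence("doc-2", "placeholder sentence", {name: 0.5 for name in WEIGHTS})
)
print(round(edge.confidence(), 3), [ev.doc_id for ev in edge.evidence])
```

Keeping the evidence list on the edge itself is what makes the drill-down from a graph relation to its supporting passages possible; a production system would more likely store this in a property graph or RDF store, as the role description notes.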

Position Requirements:

  • Master’s and PhD students studying Biology, Computer Science, Chemistry, Physics, Engineering, Biomedical Science, Pharmacology, Data Science, Bioinformatics, or a related discipline.
  • Candidates must have an expected graduation date after August 2026.
  • US Work Authorization is required at time of application.
  • This role will not be providing OPT support.
  • NLP and ML: NER, relation extraction, transformers; Python-based workflows.
  • Graph/data modeling: experience with Neo4j, NetworkX, or RDF/SPARQL.
  • Domain knowledge: genes, pathways, biomarkers, therapeutic modalities (incl. ADCs) preferred.
  • Reproducibility: version control, environment management, documentation.
  • Soft skills: problem-solving, communication, collaboration.
  • Tech stack: Python (spaCy, Hugging Face), scikit-learn; PyTorch or TensorFlow.
  • Data & viz: pandas; PySpark or Dask; Plotly/Dash, D3.js, Neo4j Bloom.
  • Dev practices: Git, Conda/Poetry, Docker, experiment tracking.
  • Ability to report onsite to Waltham, MA site 3-5 days per week.
  • This role will not provide relocation assistance.
  • Compensation range: $41-$48 per hour

Date Posted

28-Jan-2026

Closing Date

12-Feb-2026

Our mission is to build an inclusive environment where equal employment opportunities are available to all applicants and employees. In furtherance of that mission, we welcome and consider applications from all qualified candidates, regardless of their protected characteristics. If you have a disability or special need that requires accommodation, please complete the corresponding section in the application form.

Other facts

Tech stack
NLP, Machine Learning, Data Engineering, Graph Modeling, Biomedical NLP, Python, Entity Recognition, Relation Extraction, Context Tagging, Neo4j, Version Control, Documentation, Problem Solving, Communication, Collaboration, Data Visualization

About AstraZeneca

We're transforming the future of healthcare by unlocking the power of what science can do for people, society and the planet. For more information, visit www.astrazeneca.com.

Community Guidelines: bit.ly/2MgAcio

Team size: 10,001+ employees
Industry: Pharmaceutical Manufacturing

What you'll do

  • The intern will build an end-to-end pipeline to turn literature into a standardized knowledge graph and develop NLP for various tasks such as entity recognition and relation extraction. They will also ensure reproducibility and documentation of their work.

Frequently Asked Questions

What does AstraZeneca pay for a Data Science / AI Intern – Literature Mining & Graph Modeling?

The posting lists a compensation range of $41-$48 per hour for the Data Science / AI Intern – Literature Mining & Graph Modeling role. Apply through Clera to learn more about the full compensation details.

What does a Data Science / AI Intern – Literature Mining & Graph Modeling do at AstraZeneca?

As a Data Science / AI Intern – Literature Mining & Graph Modeling at AstraZeneca, you will build an end-to-end pipeline to turn literature into a standardized knowledge graph and develop NLP for tasks such as entity recognition and relation extraction. You will also ensure the reproducibility and documentation of your work.

Why join AstraZeneca as a Data Science / AI Intern – Literature Mining & Graph Modeling?

AstraZeneca is a leading Pharmaceutical Manufacturing company. The Data Science / AI Intern – Literature Mining & Graph Modeling role offers competitive compensation.

Is the Data Science / AI Intern – Literature Mining & Graph Modeling position at AstraZeneca remote?

No. The Data Science / AI Intern – Literature Mining & Graph Modeling position at AstraZeneca is based in Waltham, Massachusetts, United States, and requires reporting onsite 3-5 days per week. Contact the company through Clera for specific work arrangement details.

How do I apply for the Data Science / AI Intern – Literature Mining & Graph Modeling position at AstraZeneca?

You can apply for the Data Science / AI Intern – Literature Mining & Graph Modeling position at AstraZeneca directly through Clera. Click the "Apply Now" button above to start your application. Clera's AI-powered platform will help match your profile with this opportunity and guide you through the application process. You can also learn more about AstraZeneca on their website.