Advisor, Data Scientist - CMC Data Products
Full-time | Indianapolis | $126k - $244k

Summary

Location

Indianapolis

Salary

$126k - $244k

Type

full-time

About this role

At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our employees around the world work to discover and bring life-changing medicines to those who need them, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism. We give our best effort to our work, and we put people first. We’re looking for people who are determined to make life better for people around the world.

Organizational & Position Overview:

The Bioproduct Research & Development (BR&D) organization strives to deliver creative medicines to patients by developing and commercializing insulins, monoclonal antibodies, novel therapeutic proteins, peptides, oligonucleotide therapies, and gene therapy systems. This multidisciplinary group works collaboratively with our discovery and manufacturing colleagues.

We are seeking an exceptional Data Scientist with deep data expertise in the pharmaceutical domain to lead the development and delivery of enterprise-scale data products that power AI-driven insights, process optimization, and regulatory compliance. In this role, you'll bridge pharmaceutical sciences with modern data engineering to transform complex CMC, PAT, and analytical data into strategic assets that accelerate drug development and manufacturing excellence.

Responsibilities:

Data Product Development: Define the roadmap and deliver analysis-ready and AI-ready data products that enable AI/ML applications, PAT systems, near-time analytical testing, and process intelligence across CMC workflows.

Data Archetypes & Modern Data Management: Define pharmaceutical-specific data archetypes (process, analytical, quality, CMC submission) and create reusable data models aligned with industry standards (ISA-88, ISA-95, CDISC, eCTD). 

Modern Data Management for Regulated Environments: Implement data frameworks that ensure 21 CFR Part 11, ALCOA+, and data integrity compliance, while enabling scientific innovation and self-service access. 

AI/ML-ready Data Products: Build training datasets for lab automation, process optimization, and predictive CQA models, and support generative AI applications for knowledge management and regulatory Q&A.

Cross-Functional Leadership: Collaborate with analytical R&D, process development, manufacturing science, quality, and regulatory affairs to standardize data products.

Deliverables include:

  • Scalable data integration platform that automates compilation of technical-review-ready and submission-ready data packages with demonstrable quality assurance
  • Unified CMC data repository supporting current process and analytical method development while enabling future AI/ML applications across R&D and manufacturing
  • Data flow frameworks that enable self-service access while maintaining GxP compliance and audit readiness
  • Comprehensive documentation, standards, and training programs that democratize data access and accelerate product development

Basic Requirements:

  • Master’s degree in Computer Science, Data Science, Machine Learning, AI, or related technical field
  • 8+ years of product management experience focused on data products, data platforms, or scientific data systems and a strong grasp of modern data architecture patterns (data warehouses, data lakes, real-time streaming)
  • Knowledge of modern data stack technologies (Microsoft Fabric, Databricks, Airflow) and cloud platforms (AWS: S3, RDS, Lambda/Glue; Azure)
  • Demonstrated experience designing data products that support AI/ML workflows and advanced analytics in scientific domains
  • Proficiency with SQL, Python, and data visualization tools
  • Experience with analytical instrumentation and data systems (HPLC/UPLC, spectroscopy, particle characterization, process sensors)
  • Knowledge of pharmaceutical manufacturing processes, including batch and continuous manufacturing, unit operations, and process control
  • Expertise in data modeling for time-series, spectroscopic, chromatographic, and hierarchical batch/lot data
  • Experience with laboratory data management systems (LIMS, ELN, SDMS, CDS) and their integration patterns

Additional Preferences:

  • Understanding of Design of Experiments (DoE), Quality by Design (QbD), and process validation strategies
  • Experience implementing data mesh architectures in scientific organizations
  • Knowledge of MLOps practices and model deployment in validated environments
  • Familiarity with regulatory submissions (eCTD, CTD) and how analytical data supports marketing applications
  • Experience with CI/CD pipelines (GitHub Actions, CloudFormation) for scientific applications

Lilly is dedicated to helping individuals with disabilities to actively engage in the workforce, ensuring equal opportunities when vying for positions. If you require accommodation to submit a resume for a position at Lilly, please complete the accommodation request form (https://careers.lilly.com/us/en/workplace-accommodation) for further assistance. Please note this is for individuals to request an accommodation as part of the application process and any other correspondence will not receive a response.

Lilly is proud to be an EEO Employer and does not discriminate on the basis of age, race, color, religion, gender identity, sex, gender expression, sexual orientation, genetic information, ancestry, national origin, protected veteran status, disability, or any other legally protected status.


Our employee resource groups (ERGs) offer strong support networks for their members and are open to all employees. Our current groups include: Africa, Middle East, Central Asia Network, Black Employees at Lilly, Chinese Culture Network, Japanese International Leadership Network (JILN), Lilly India Network, Organization of Latinx at Lilly (OLA), PRIDE (LGBTQ+ Allies), Veterans Leadership Network (VLN), Women’s Initiative for Leading at Lilly (WILL), enAble (for people with disabilities). Learn more about all of our groups.

Actual compensation will depend on a candidate’s education, experience, skills, and geographic location. The anticipated wage for this position is $126,000 - $244,200.

Full-time equivalent employees also will be eligible for a company bonus (depending, in part, on company and individual performance). In addition, Lilly offers a comprehensive benefit program to eligible employees, including eligibility to participate in a company-sponsored 401(k); pension; vacation benefits; eligibility for medical, dental, vision and prescription drug benefits; flexible benefits (e.g., healthcare and/or dependent day care flexible spending accounts); life insurance and death benefits; certain time off and leave of absence benefits; and well-being benefits (e.g., employee assistance program, fitness benefits, and employee clubs and activities). Lilly reserves the right to amend, modify, or terminate its compensation and benefit programs in its sole discretion and Lilly’s compensation practices and guidelines will apply regarding the details of any promotion or transfer of Lilly employees.

#WeAreLilly

Other facts

Tech stack
Data Science, Machine Learning, AI, Data Management, Data Products, SQL, Python, Data Visualization, Analytical Instrumentation, Pharmaceutical Manufacturing, Data Modeling, Laboratory Data Management, Cloud Platforms, Data Architecture, Process Optimization, Regulatory Compliance

About Lilly

We're a medicine company turning science into healing to make life better for people around the world. It all started nearly 150 years ago with a clear vision from founder Colonel Eli Lilly: "Take what you find here and make it better and better." Harnessing the power of biotechnology, chemistry and genetic medicine, our scientists are urgently advancing science to solve some of the world's most significant health challenges.

General Information and Guidelines:
When you engage with us on LinkedIn, you're agreeing to these Community Guidelines: https://e.lilly/guidelines.

If you have questions about a Lilly medicine, contact The Lilly Answers Center at 1-800-Lilly-Rx (1-800-545-5979) Monday through Friday, excluding company holidays.

Team size: 10,001+ employees
Industry: Pharmaceutical Manufacturing

What you'll do

  • The Data Scientist will lead the development of enterprise-scale data products that enable AI-driven insights and process optimization. Responsibilities include defining data archetypes, implementing data frameworks for compliance, and collaborating with cross-functional teams.

Frequently Asked Questions

What does Lilly pay for an Advisor, Data Scientist - CMC Data Products?

Lilly offers a competitive compensation package for the Advisor, Data Scientist - CMC Data Products role. The salary range is USD 126k - 244k per year. Apply through Clera to learn more about the full compensation details.

What does an Advisor, Data Scientist - CMC Data Products do at Lilly?

As an Advisor, Data Scientist - CMC Data Products at Lilly, you will lead the development of enterprise-scale data products that enable AI-driven insights and process optimization. Responsibilities include defining data archetypes, implementing data frameworks for compliance, and collaborating with cross-functional teams.

Why join Lilly as an Advisor, Data Scientist - CMC Data Products?

Lilly is a leading pharmaceutical manufacturing company. The Advisor, Data Scientist - CMC Data Products role offers competitive compensation.

Is the Advisor, Data Scientist - CMC Data Products position at Lilly remote?

The Advisor, Data Scientist - CMC Data Products position at Lilly is based in Indianapolis, Indiana, United States. Contact the company through Clera for specific work arrangement details.

How do I apply for the Advisor, Data Scientist - CMC Data Products position at Lilly?

You can apply for the Advisor, Data Scientist - CMC Data Products position at Lilly directly through Clera. Click the "Apply Now" button above to start your application. Clera's AI-powered platform will help match your profile with this opportunity and guide you through the application process. You can also learn more about Lilly on their website.