Senior Machine Learning Engineer - Foundation Model

full-time•Santa Clara•$174k - $295k

Summary

Location

Santa Clara

Salary

$174k - $295k

Type

full-time

Experience

5-10 years

About this role

XPENG is a leading smart technology company at the forefront of innovation, integrating advanced AI and autonomous driving technologies into its vehicles, including electric vehicles (EVs), electric vertical take-off and landing (eVTOL) aircraft, and robotics. With a strong focus on intelligent mobility, XPENG is dedicated to reshaping the future of transportation through cutting-edge R&D in AI, machine learning, and smart connectivity.

We are looking for a full-time Machine Learning Engineer / Research Scientist to drive the modeling and algorithmic development of XPENG’s next-generation Vision-Language-Action (VLA) Foundation Model, the core brain that powers our end-to-end autonomous driving systems.

You will work closely with world-class researchers, perception and planning engineers, and infrastructure experts to design, train, and deploy large-scale multi-modal models that unify vision, language, and control. Your work will directly shape the intelligence that enables XPENG’s future L3/L4 autonomous driving products.

Key Responsibilities

- Design and implement large-scale multi-modal architectures (e.g., vision-language-action transformers) for end-to-end autonomous driving.
- Develop pretraining and fine-tuning strategies leveraging massive labeled and unlabeled fleet data (images, video, LiDAR, CAN bus, maps, human driving behaviors, etc.).
- Research and integrate cross-modal alignment (e.g., visual grounding, temporal reasoning, policy distillation, imitation and reinforcement learning) to improve model interpretability and action quality.
- Collaborate with infrastructure engineers to scale training across thousands of GPUs using distributed training frameworks (FSDP, DDP, etc.).
- Conduct systematic ablation, evaluation, and visualization of model behavior across perception, reasoning, and planning tasks.
- Contribute to model deployment optimization, including quantization, export, and latency-accuracy trade-offs for onboard execution.

Minimum Qualifications

- Master’s degree or higher in Computer Science, Electrical/Computer Engineering, or related field, with 3+ years of experience in deep learning research or productization.
- Strong proficiency in PyTorch and modern transformer-based model design.
- Experience in large-scale pretraining or multi-modal modeling (vision, language, or planning).
- Deep understanding of representation learning, temporal modeling, and self-supervised or reinforcement learning techniques.
- Familiarity with distributed training (DDP, FSDP) and large-batch optimization.

Preferred Qualifications

- PhD in CS/CE/EE or related field, with 1+ years of relevant industry experience.
- Publication record in top-tier AI conferences (CVPR, ICCV, NeurIPS, ICLR, ICML, ECCV).
- Prior experience building foundation or end-to-end driving models, or LLM/VLM architectures (e.g., ViT, Flamingo, BEVFormer, RT-2, or GRPO-style policies).
- Familiarity with RLHF/DPO/GRPO, trajectory prediction, or policy learning for control tasks.
- Proven ability to collaborate cross-functionally with infra, perception, and planning teams to deliver production-ready models.

What do we provide:

- A collaborative, research-driven environment with access to massive real-world data and industry-scale compute.
- An opportunity to work with top-tier researchers and engineers advancing the frontier of foundation models for autonomous driving.
- Direct impact on the next generation of intelligent mobility systems.
- Opportunity to make significant impact on the transportation revolution by the means of advancing autonomous driving.
- Competitive compensation package.
- Snacks, lunches, dinners, and fun activities.

The base salary range for this full-time position is $174,720 - $295,680, in addition to bonus, equity and benefits. Our salary ranges are determined by role, level, and location. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position across all US locations. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training.

We are an Equal Opportunity Employer. It is our policy to provide equal employment opportunities to all qualified persons without regard to race, age, color, sex, sexual orientation, religion, national origin, disability, veteran status or marital status or any other prescribed category set forth in federal or state regulations.

What you'll do

Design and implement large-scale multi-modal architectures for autonomous driving. Collaborate with engineers to scale training and optimize model deployment.

About XPENG

XPeng is a leading Chinese Smart EV company that designs, develops, manufactures, and markets Smart EVs that appeal to the large and growing base of technology-savvy middle-class consumers. Its mission is to drive Smart EV transformation with technology and data, shaping the mobility experience of the future. To optimize its customers’ mobility experience, XPeng develops in-house its full-stack advanced driver-assistance system technology and in-car intelligent operating system, as well as core vehicle systems including the powertrain and the electrical/electronic architecture. XPeng is headquartered in Guangzhou, China. In 2021, the Company established its European headquarters in Amsterdam, along with other dedicated offices in Copenhagen, Munich, Oslo, and Stockholm. The Company’s Smart EVs are mainly manufactured at its plants in Zhaoqing and Guangzhou, Guangdong province. For more information, please visit https://heyxpeng.com.

Frequently Asked Questions

What does XPENG pay for a Senior Machine Learning Engineer - Foundation Model?

XPENG offers a competitive compensation package for the Senior Machine Learning Engineer - Foundation Model role. The salary range is USD 175k - 296k per year. Apply through Clera to learn more about the full compensation details.

What does a Senior Machine Learning Engineer - Foundation Model do at XPENG?

As a Senior Machine Learning Engineer - Foundation Model at XPENG, you will design and implement large-scale multi-modal architectures for autonomous driving, and collaborate with engineers to scale training and optimize model deployment.

Is the Senior Machine Learning Engineer - Foundation Model position at XPENG remote?

The Senior Machine Learning Engineer - Foundation Model position at XPENG is based in Santa Clara, California, United States. Contact the company through Clera for specific work arrangement details.

How do I apply for the Senior Machine Learning Engineer - Foundation Model position at XPENG?

You can apply for the Senior Machine Learning Engineer - Foundation Model position at XPENG directly through Clera. Click the "Apply Now" button above to start your application. Clera's AI-powered platform will help match your profile with this opportunity and guide you through the application process.
