Clera - Your AI talent agent
JPMorganChase

Lead Site Reliability Engineer, AI/ML Platform

full-time•Jersey City

Summary

Location

Jersey City

Type

full-time

Experience

5-10 years

About this role

Description

 Responsibilities:

  • Design and implement solutions to enhance the reliability and scalability of AI/ML platforms and applications to accommodate fast-growing demand.
  • Partner with product engineering teams to ensure the AI/ML systems are reliable and high performing. 
  • Develop observability, security, automation, and FinOps tools and orchestration.
  • Provide strategic technology leadership by defining and evaluating standards and architecture for reliability, observability and automation frameworks.
  • Build strong cross-functional relationships that foster engagements across the organization and deliver solutions to user problems.
  • Debug and resolve issues in a production environment, identify root causes, and remediate them.
  • Participate in on-call rotations, incident management, and escalation workflows.
  • Take full ownership of problems, develop solutions, and acquire new knowledge to complete the task.
  • Mentor and guide junior engineers.

Required Qualifications:

  • Bachelor’s degree in Computer Science, Information Technology, or an equivalent technical qualification, with 5+ years of professional experience.
  • Expertise in SRE principles and in the reliability, scalability, and performance of applications and infrastructure.
  • Hands-on experience with cloud platforms (AWS, GCP, Azure) and IaC tools (Terraform, Ansible).
  • Extensive experience implementing advanced observability using tools like OpenTelemetry, Dynatrace, Grafana, and/or cloud-native services.
  • Experience in architecting distributed systems and cloud-native architecture in AWS.
  • Systematic problem-solving and troubleshooting skills in a complex system.
  • Excellent communication skills and ability to represent and present business and technical concepts to stakeholders. 
  • Self-managed and self-motivated, with a strong sense of ownership, urgency, and drive.

Good to have:

  • Prior experience working in AI, ML, or data engineering.
  • Prior experience developing AIOps tools or AI agents.
  • Multi-cloud experience (AWS, GCP, Azure).


What you'll do

  • The Lead Site Reliability Engineer will design and implement solutions to enhance the reliability and scalability of AI/ML platforms. They will partner with product engineering teams and provide strategic technology leadership while mentoring junior engineers.

About JPMorganChase

With a history tracing its roots to 1799 in New York City, JPMorganChase is one of the world's oldest, largest, and best-known financial institutions, carrying forth the innovative spirit of our heritage firms in global operations across 100 markets. We serve millions of customers and many of the world’s most prominent corporate, institutional, and government clients daily, managing assets and investments, offering business advice and strategies, and providing innovative banking solutions and services.

Social Media Terms and Conditions: https://bit.ly/JPMCSocialTerms

JPMorgan Chase & Co. is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.

Ready to join JPMorganChase?

Take the next step in your career journey

Frequently Asked Questions

What does a Lead Site Reliability Engineer, AI/ML Platform do at JPMorganChase?

As a Lead Site Reliability Engineer, AI/ML Platform at JPMorganChase, you will design and implement solutions to enhance the reliability and scalability of AI/ML platforms. You will partner with product engineering teams and provide strategic technology leadership while mentoring junior engineers.

Is the Lead Site Reliability Engineer, AI/ML Platform position at JPMorganChase remote?

The Lead Site Reliability Engineer, AI/ML Platform position at JPMorganChase is based in Jersey City, New Jersey, United States. Contact the company through Clera for specific work arrangement details.

How do I apply for the Lead Site Reliability Engineer, AI/ML Platform position at JPMorganChase?

You can apply for the Lead Site Reliability Engineer, AI/ML Platform position at JPMorganChase directly through Clera. Click the "Apply Now" button above to start your application. Clera's AI-powered platform will help match your profile with this opportunity and guide you through the application process.
© 2026 Clera Labs, Inc.

Join Clera's Talent Pool

Get matched with similar opportunities at top startups

This role is hosted on JPMorganChase's careers site.
Join our talent pool first to get notified about similar roles that match your profile.