ML Infrastructure Engineer
About this role
We are looking for an ML Infrastructure Engineer with 3+ years of experience to own and scale the training and inference stack at a fast-growing AI document processing platform. You'll be a strong generalist who understands the mechanics of how ML models work – from serving and monitoring to building robust data pipelines – and can improve inference performance, reliability, and cost efficiency. This is a high-impact IC role where you'll work closely with ML researchers to ensure models are deployed quickly and reliably, and that infrastructure is never a bottleneck for the products being served. The ideal candidate is AI-native, is comfortable with training on one to three nodes and serving on one or two nodes, and thrives in a fast-paced startup environment.
What you will be doing
Building and maintaining model serving infrastructure – improving inference speed, monitoring, and reliability to ensure it's never a bottleneck for customers
Setting up and improving training infrastructure for models ranging from 300M to 30B parameters across 1-to-3 node environments
Developing observability, logging, and monitoring systems across the ML stack
Building internal data pipelines and tooling to help ML researchers move faster from experiment to production
Architecting infrastructure to arbitrate inference between multiple cloud providers while optimizing for accuracy, latency, and cost
Company at a glance
Reducto offers an API-style platform that converts complex documents into inputs for large language models. Headquartered in San Francisco, it serves hundreds of customers, from startups to Fortune 10 companies, and processes tens of millions of pages each month.

