AfterQuery is a research lab investigating the boundaries and capabilities of artificial intelligence through novel datasets and experimentation. Their customers include every major frontier foundation model lab. Backed by Y Combinator, BoxGroup, ex-partners from Lightspeed and Index Ventures, and senior leadership at Google DeepMind and Meta GenAI, AfterQuery was one of the fastest-growing YC companies in their batch.
The founding team brings backgrounds from Jane Street, Meta, Citadel Securities, Google, Goldman Sachs, Morgan Stanley, Silver Lake, Berkeley AI Research (BAIR), and Stanford AI Lab (SAIL).
About the Role
As a Senior Software Engineer, Infrastructure and Platform at AfterQuery, you will design and build the core infrastructure that powers their data generation, evaluation, and agentic systems. You will be responsible for the shared platforms that enable engineers and research teams to run large-scale human-in-the-loop workflows, evaluation harnesses, and automated data pipelines used to train frontier AI models.
This is a highly technical role with broad ownership. You will architect and build foundational infrastructure that many other engineers depend on, ensuring systems are scalable, reliable, and capable of supporting high-throughput workloads. You will work directly with the founding team to define system architecture, establish engineering best practices, and build the infrastructure that supports the next generation of AI development.
Key Responsibilities
Architect and develop the shared infrastructure powering data generation platforms, human-in-the-loop systems, and evaluation pipelines
Build systems capable of processing large-scale datasets and high-throughput workloads with strong reliability guarantees
Create reusable infrastructure and APIs that enable product engineers and researchers to build quickly and reliably on top of core systems
Design systems with strong observability, monitoring, and fault tolerance to support production workloads at scale
Help define long-term system architecture across data pipelines, compute infrastructure, task orchestration, and storage systems
Work closely with engineers and researchers to support new AI experimentation workflows and platform capabilities
Define standards for system design, deployment, reliability, and infrastructure operations
Requirements
Must-Have
5+ years of experience building production distributed systems or platform infrastructure
Proficiency in Python, JavaScript (Node.js/Next.js), or similar backend technologies
Experience designing and operating systems in cloud environments (GCP or AWS)
Experience with message queues and event-driven systems (Kafka, RabbitMQ, Pub/Sub, or equivalent)
Experience working with high-throughput data pipelines and asynchronous processing systems
Strong understanding of system scalability, performance, and reliability
Experience owning systems running in production environments at scale
Background at a top-tier startup or FAANG-equivalent company (a strong signal of fit)
Nice-to-Have
Experience building internal developer platforms or shared infrastructure
Experience supporting large-scale data processing pipelines
Experience with AI infrastructure, LLM evaluation systems, or ML pipelines
Experience working at high-growth startups or scaling early infrastructure
Experience designing human-in-the-loop or workflow orchestration systems
Take the next step in your career journey