Software Engineer - RL Environments

San Francisco +2 · On-site · $180k – $220k · Visa Sponsorship Available

About this role

About AfterQuery

AfterQuery builds the training data and evaluation infrastructure that frontier AI labs use to improve their models. We work with the world's leading labs to design high-signal datasets and run rigorous evaluations that go beyond static benchmarks. We are a small, early team (post-Series A) where individual contributors have a direct impact on how the next generation of models learns and improves.

The Role

As a SWE (Environments), you will design the datasets and evaluation rubrics that directly influence how frontier models learn. You'll work hands-on with research teams at top AI labs, experimenting with data-collection strategies, diagnosing model failure modes, and developing metrics to determine whether a model is actually improving. You'll go from hypothesis to live experiment quickly, and your output will feed directly into model training runs at scale.

Day to day, you will design data slices that expose meaningful failure modes across domains like finance, code, and enterprise workflows. You will build and refine reward signals for RLHF and RLVR pipelines. You will develop quantitative frameworks for measuring dataset quality, diversity, and downstream impact on alignment and capability. You will partner with lab research teams to translate their training objectives into concrete data and evaluation specifications.

Compensation is $200K base plus profit share of roughly 150% of base, bringing expected total cash to around $500K, plus competitive equity.

What You'll Do

Design data slices and explore data shapes that expose meaningful model failure modes across domains like finance, code, and enterprise workflows

Build and refine evaluation rubrics and reward signals for RLHF and RLVR training pipelines

Model annotator behavior and run experiments to improve different model capabilities

Develop quantitative frameworks for measuring dataset quality, diversity, and downstream impact on model alignment and capability

Create and manage both real-world and synthetic data pipelines

Partner with lab research teams to translate their training objectives into concrete data and evaluation specifications

Requirements

Must-Have

1–4 years of software engineering experience with strong technical depth

Genuine obsession with how data structure, selection, and quality drive model behavior

Ability to design lightweight experiments, move fast, and extract actionable insights from messy results

Comfort working across domains: finance, software engineering, policy, and more

Track record of shipping; bias toward building, not theorizing

Nice-to-Have

Prior work or internship at an RL environment company, AI safety org, or benchmarking org (METR, Artificial Analysis, or equivalent)

Former founder or early engineer at an early-stage startup

Experience building data pipelines (real-world + synthetic)

Familiarity with RLHF / RLVR training pipelines

AfterQuery builds datasets and runs experiments to advance frontier LLM and AI-agent workflows, constructing complex data infrastructure for agentic and hard-reasoning tasks. It partners with five AI labs, is YC's infrastructure partner, and is led by teams from top banks and quant firms.

Industry: Data Infrastructure and Analytics

Team Size: 11-50

Workspace: On-site

Stage: Series A

Founded: 2025

Locations

San Francisco, CA, United States · Los Angeles, CA, United States · New York City, NY, United States

Investors

Altos Ventures · BoxGroup · Latitude Capital · Raine Ventures · Y Combinator

Website: afterquery.com

Culture & values

There is a strong emphasis on experimentation and novel datasets to push the frontier of LLMs and AI agents.

The work culture centers on solving complex, frontier-scale technical challenges in data infrastructure for agentic and hard-reasoning workflows.

The company collaborates with all five leading AI labs, indicating a collaborative, cross-organizational culture.

The company is on a hockey-stick growth trajectory, suggesting a fast-paced, high-velocity work environment.

About the Team

Team Distribution

Operations: 39%
Engineering: 35%
Sales: 12%
Other: 9%
Leadership: 3%

Where the Team Studied

1. University of Pennsylvania
2. University of California, Berkeley
3. Georgetown University
4. Harvard University
5. Duke University

Team Worked At

Meta
Google
Goldman Sachs
PwC
Citadel Securities
