
We are looking for members of technical staff specializing in ML. We're particularly interested in self-motivated researchers and engineers who want to contribute meaningfully to training powerful models, whether that means working on low-level GPU optimizations or new optimization theory.
Relevant Skills (not all are necessary):
Work on large language models at an industry or academic lab (e.g. OpenAI, Google, Mistral, Z.ai, Qwen, DeepSeek, Ai2).
Experience with pretraining language models and large-scale AI infrastructure, which can include any of the following:
Contributions to community initiatives such as NanoGPT Speedrun, Marin, etc.
Understanding of different types of model parallelism (e.g. data, tensor, pipeline, expert).
Software/hardware co-design to maximize training throughput.
Experience with monitoring and maintaining large-scale training runs.
Academic work (e.g. papers on optimization, data, etc.).
Experience with post-training language models, which can include any of the following:
Work on reinforcement learning for language models (environments, infrastructure, training).
Academic or personal work on instruction data curation, tool use, or other post-training-related tasks.
Experience with inference/systems optimization, which can include any of the following:
Contributions to vLLM, SGLang, Dynamo, MegaKernels, etc.
Strong systems-level understanding of these frameworks and how to optimize for batching, KV cache pressure, long context, etc.
Experience with low-level kernel design + DSLs, for example:
CUDA, C++, CuTe, Triton, PTX, TileLang, etc.