Jobs at Inferact (Now Hiring) — 10 open
Member of Technical Staff, TPU Performance Engineering
Singapore, Singapore · On-site
$200k–$400k/yr
Senior · Visa sponsorship · $150M raised
Inferact's mission is to grow vLLM as the world's AI inference engine and accelerate AI progress by making inference cheaper and faster. Founded by the creators and core maintainers of vLLM, we sit at the intersection of…
Skills: TPU Performance Engineering, JAX, XLA, Pallas, ML Kernel Optimization
Member of Technical Staff, AMD GPU Performance Engineering
San Francisco, California, United States · Remote OK
$200k–$400k/yr
Senior · Visa sponsorship · $150M raised
Inferact's mission is to grow vLLM as the world's AI inference engine and accelerate AI progress by making inference cheaper and faster. Founded by the creators and core maintainers of vLLM, we sit at the intersection of…
Skills: ROCm, HIP, Triton, CK, AITER
Member of Technical Staff, AMD GPU Performance Engineering
Singapore, Singapore · On-site
$200k–$400k/yr
Senior · Visa sponsorship · $150M raised
Inferact's mission is to grow vLLM as the world's AI inference engine and accelerate AI progress by making inference cheaper and faster. Founded by the creators and core maintainers of vLLM, we sit at the intersection of…
Skills: AMD GPU Optimization, TPU Performance Engineering, ROCm, HIP, Triton
Member of Technical Staff, Kernel Engineering
Singapore, Singapore · On-site
$200k–$400k/yr
Senior · Visa sponsorship · $150M raised
Inferact's mission is to grow vLLM as the world's AI inference engine and accelerate AI progress by making inference cheaper and faster. Founded by the creators and core maintainers of vLLM, we sit at the intersection of…
Skills: CUDA, C++, Python, GPU Architecture, Kernel Optimization
Member of Technical Staff, Inference
Singapore, Singapore · On-site
$200k–$400k/yr
Senior · Visa sponsorship · $150M raised
Inferact's mission is to grow vLLM as the world's AI inference engine and accelerate AI progress by making inference cheaper and faster. Founded by the creators and core maintainers of vLLM, we sit at the intersection of…
Skills: Python, PyTorch, vLLM, TensorRT-LLM, SGLang
Member of Technical Staff, Performance and Scale
Singapore, Singapore · On-site
$200k–$400k/yr
Senior · Visa sponsorship · $150M raised
Inferact's mission is to grow vLLM as the world's AI inference engine and accelerate AI progress by making inference cheaper and faster. Founded by the creators and core maintainers of vLLM, we sit at the intersection of…
Skills: Rust, Go, C++, Distributed Systems, Network Protocols
Member of Technical Staff, Cloud Orchestration
Singapore, Singapore · On-site
$200k–$400k/yr
Senior · Visa sponsorship · $150M raised
Inferact's mission is to grow vLLM as the world's AI inference engine and accelerate AI progress by making inference cheaper and faster. Founded by the creators and core maintainers of vLLM, we sit at the intersection of…
Skills: Kubernetes, Container Orchestration, Kubernetes Operators, Python, Rust
Member of Technical Staff, Inference
San Francisco, California, United States · Remote OK
$200k–$400k/yr
Senior · Visa sponsorship · $150M raised
Inferact's mission is to grow vLLM as the world's AI inference engine and accelerate AI progress by making inference cheaper and faster. Founded by the creators and core maintainers of vLLM, we sit at the intersection of…
Skills: Python, PyTorch, vLLM, TensorRT-LLM, SGLang
Member of Technical Staff, Developer Relations
San Francisco, California, United States · Remote OK
$200k–$400k/yr
Senior · Visa sponsorship · $150M raised
Inferact's mission is to grow vLLM as the world's AI inference engine and accelerate AI progress by making inference cheaper and faster. Founded by the creators and core maintainers of vLLM, we sit at the inters…
Skills: LLM Inference Systems, Developer Relations, GPU Serving, Technical Writing, Model Serving
Member of Technical Staff, TPU Performance Engineering
San Francisco, California, United States · Remote OK
$200k–$400k/yr
Senior · Visa sponsorship · $150M raised
Inferact's mission is to grow vLLM as the world's AI inference engine and accelerate AI progress by making inference cheaper and faster. Founded by the creators and core maintainers of vLLM, we sit at the intersection of…
Skills: AMD GPU Optimization, TPU Performance Engineering, ROCm, HIP, Triton