
Runpod, Inc.
Runpod, Inc., a software development company, is pioneering the future of AI and machine learning. Founded in 2022, Runpod is a rapidly growing, well-funded company with a remote-first organization.

Runpod, Inc. currently has three open positions:

* **Manager, Datacenter Network Engineering:** This remote position involves leading a team of network engineers responsible for designing, deploying, and operating Runpod’s global datacenter and backbone network. The salary range is $150,000 - $240,000 USD.
  * Responsibilities include managing the Datacenter Networking Team, owning Datacenter Network Architecture, overseeing High-Performance GPU Networking, guiding Encapsulation & Overlay Protocols, leading Global WAN & Backbone Connectivity, ensuring Reliability & Operations, fostering Cross-Functional Collaboration, managing Vendor & Partner relationships, and ensuring Security & Segmentation.
  * Requirements include 3+ years of experience managing network or infrastructure engineering teams, 8+ years designing and operating large-scale datacenter networks, strong hands-on experience with VXLAN/EVPN or equivalent encapsulation protocols, proven experience with InfiniBand and/or RoCE, deep familiarity with global WAN technologies, fluency in Linux and Network OS, a strong background in network observability, incident management, capacity forecasting, and change control, and clear written and verbal communication skills.
* **Manager, HPC Storage Engineer:** Also a remote position, this role involves leading a team responsible for Runpod’s distributed storage infrastructure across all regions. The salary range is $150,000 - $240,000 USD.
  * Responsibilities include owning Distributed Storage Architecture, building the Storage Engineering Team, overseeing High-Performance Shared Filesystems, leading Advanced Filesystems & Platforms, driving End-to-End Performance Ownership, evaluating Next-Generation Storage Technologies, ensuring Reliability & Scale, implementing Automation & Observability, fostering Cross-Functional Collaboration, and managing Vendor & Partner relationships.
  * Requirements include 3+ years managing storage, systems, or infrastructure engineering teams in production environments, 8+ years designing and operating large-scale storage systems, hands-on experience deploying, operating, or deeply integrating VAST Data in production environments, experience with Lustre or comparable HPC filesystems, deep understanding of NAND, NVMe, PCIe, storage controllers, and performance characteristics across the stack, proven experience with NFS over RDMA, RDMA-capable transports, or similar technologies, strong Linux internals knowledge, and experience running 24/7 storage platforms with strong incident response, change management, and post-mortem discipline.
* **Forward Deployed Engineer (EMEA):** This remote position, part of the Revenue Team, involves ensuring customers experience seamless operations. The salary range is 100,000 - 135,000 EUR.
  * Responsibilities include participating in sales meetings, troubleshooting technical issues, communicating with customers and internal teams, assisting the support team, contributing to product development, and creating and maintaining technical documentation.
  * Requirements include a Bachelor's degree in a relevant field or equivalent professional experience, 3+ years of professional experience in software development, strong problem-solving skills, familiarity with applied AI use cases, excellent communication skills, and attention to detail.

The technology stacks used at Runpod, Inc. include:

* **Manager, Datacenter Network Engineering:** Team Leadership, Datacenter Network Design, L2/L3 Fabrics, GPU Networking, Global WAN Connectivity, Spine-Leaf Topologies, ECMP Routing, InfiniBand, RoCE, VXLAN, EVPN, Geneve, Multi-tenant Isolation, Operational Excellence, Capacity Planning, Incident Response.
* **Manager, HPC Storage Engineer:** Distributed Storage Architecture, Team Management, SAN, NFS, VAST Data, Lustre, Parallel Filesystems, NAND, NVMe, NFS over RDMA, GPU Direct Storage, Linux Internals, Automation, Observability, Vendor Management, Performance Optimization.
* **Forward Deployed Engineer (EMEA):** Python, JavaScript, Go, Full-Stack Development, AI, Machine Learning, Troubleshooting, Technical Documentation, Communication, Collaboration, Problem-Solving, Cloud Infrastructure, Customer Support, Scripting, SQL, NoSQL.

Runpod, Inc. is a remote-first company with an inclusive, collaborative team. Most roles are remote, and the internal communication platform is Slack.

1. **What is the remote work policy?** All three listed positions are remote.
2. **What is the salary range for the Manager, Datacenter Network Engineering position?** The salary range is $150,000 - $240,000 USD.
3. **What is the salary range for the Manager, HPC Storage Engineer position?** The salary range is $150,000 - $240,000 USD.
4. **What is the salary range for the Forward Deployed Engineer (EMEA) position?** The salary range is 100,000 - 135,000 EUR.
5. **What kind of benefits does Runpod, Inc. offer?** Runpod offers meaningful equity, generous medical, dental & vision plans, and flexible PTO. The Forward Deployed Engineer (EMEA) position also offers a $1,200 Home Office & Equipment Stipend.
About the Company
Runpod provides cost-effective GPU cloud computing services for training, deploying, and scaling AI models. With GPU Cloud, users can spin up an on-demand GPU instance in a few clicks. With Serverless, users can create autoscaling API endpoints for scaling inference on their models in production. Runpod was founded in 2022 and is headquartered in New Jersey.
Quick Facts
Open Positions
3
Runpod is pioneering the future of AI and machine learning, offering cutting-edge cloud infrastructure for full-stack AI applications. Founded in 2022, we are a rapidly growing, well-funded company...
