We are looking for an Application Support Engineer to provide production support for cloud-native media applications running on Google Cloud Platform (GCP) and Kubernetes.
This role focuses on L2/L3 application support, incident response, root cause analysis (RCA), and on-call rotations for services built in Go and deployed using GKE, Cloud Run, and GCP Workflows.
You will support platforms for live streaming, video transcoding, Video-on-Demand (VOD), and Media Asset Management (MAM) in high-availability, time-sensitive environments.
This is a full-time role (40 hours per week) with on-call rotations, including weekend coverage.
Responsibilities
- Provide L2/L3 production support for critical media applications
- Troubleshoot application, platform, and infrastructure issues across GCP and Kubernetes
- Analyze logs, metrics, and traces to identify root causes
- Resolve incidents using runbooks and SOPs, escalating when required
- Partner with Engineering and SRE teams, providing clear technical context
- Support live streaming pipelines and video transcoding workflows
- Diagnose issues related to:
  - Stream ingest failures
  - Transcoder job errors and performance bottlenecks
  - Output quality, latency, and availability
- Participate in on-call rotations, including weekends
- Document incidents, RCAs, and corrective and preventive actions
- Continuously improve monitoring, alerting, runbooks, and SOPs
Required
- Experience in Application Support, Production Support, or Reliability Engineering roles
- Hands-on experience with Go (code reading and debugging)
- Strong experience with Kubernetes and Google Cloud Platform (GCP)
- Proven experience supporting live streaming and video transcoding workflows
- Experience participating in on-call rotations
- Strong analytical and troubleshooting skills
- Ability to perform effectively during high-pressure incidents
Nice to Have
- Experience in media, broadcast, or streaming platforms
- Familiarity with live event operations
- Experience with Grafana, Prometheus, or similar monitoring tools
- Experience with ServiceNow, PagerDuty, or Slack
- Understanding of SRE concepts (SLIs, SLOs, error budgets)
At Devsu, we believe in creating an environment where you can thrive both personally and professionally. By joining our team, you’ll enjoy:
- A stable, long-term contract with opportunities for career growth
- Private health insurance
- A remote-friendly culture that promotes work-life balance
- Continuous training, mentorship, and learning programs to keep you at the forefront of the industry
- Free access to AI training resources and state-of-the-art AI tools to elevate your daily work
- A flexible Paid Time Off (PTO) policy, as well as paid holidays
- Challenging, world-class software projects for clients in the US and LatAm
- Collaboration with some of the most talented software engineers in Latin America and the US, in a diverse work environment
Join Devsu and discover a workplace that values your growth, supports your well-being, and empowers you to make a global impact.