Senior Site Reliability Engineer

full-time • Coimbatore, Hyderabad

Summary

Location

Coimbatore, Hyderabad

Type

full-time

Experience

5-10 years

About this role

Job Overview: Drive reliability and operational maturity for Kubernetes workloads on GKE through safe rollout patterns, high-signal observability, resilient IaC, and effective incident response. Collaborate with developers to harden CI/CD pipelines and address infrastructure concerns within application code.

Key responsibilities:
<ul>
<li>Design and maintain resilient deployment patterns (blue-green, canary, GitOps syncs) across services.</li>
<li>Instrument and optimize logs, metrics, traces, and alerts to reduce noise and improve signal.</li>
<li>Review backend code (e.g., Django, Node.js, Go, Java) with a focus on infra touchpoints like database usage, timeouts, error handling, and memory consumption.</li>
<li>Tune and troubleshoot GKE workloads, HPA configs, network policies, and node pool strategies.</li>
<li>Improve or author Terraform modules for infrastructure resources (e.g., VPC, CloudSQL, Secrets, Pub/Sub).</li>
<li>Diagnose production issues from logs, traces, dashboards, and lead or support incident response.</li>
<li>Reduce config drift across environments and standardize secrets, naming, and resource tagging.</li>
<li>Collaborate with developers to harden delivery pipelines, standardize rollout readiness, and clean up infra smells in code.</li>
</ul>

Key skills:
<ul>
<li>Have 4–6+ years of experience in backend or infra-focused engineering roles (e.g., SRE, platform, DevOps, or fullstack).</li>
<li>Can confidently write or review production-grade code and infra-as-code (Terraform, Helm, GitHub Actions, etc.).</li>
<li>Have deep hands-on experience with Kubernetes in production, ideally on GKE, including workload autoscaling and ingress strategies.</li>
<li>Understand cloud concepts like IAM, VPCs, secret storage, workload identity, and CloudSQL performance characteristics.</li>
<li>Think in systems: you understand cascading failure, timeout boundaries, dependency health, and blast radius.</li>
<li>Regularly contribute to incident mitigation or long-term fixes (not just closing alerts).</li>
<li>Can influence through well-written PRs, documentation, and thoughtful design reviews.</li>
</ul>

Good to have:
<ul>
<li>Exposure to GitOps tooling such as ArgoCD or FluxCD.</li>
<li>Experience developing or integrating Kubernetes operators.</li>
<li>Familiarity with service-level indicators (SLIs), service-level objectives (SLOs), and structured alerting.</li>
</ul>

Tools and Expectations:
<ul>
<li>Datadog - Monitor infrastructure health, capture service-level metrics, reduce alert fatigue through high signal thresholds.</li>
<li>PagerDuty - Own incident management pipeline. Route alerts by severity and align with business SLAs.</li>
<li>GKE / Kubernetes - Improve cluster stability and workload isolation. Define auto-scaling configurations and tune for efficiency.</li>
<li>Helm / GitOps (ArgoCD/Flux) - Validate release consistency across clusters. Monitor sync status and rollout safety.</li>
<li>Terraform Cloud - Support DR planning and detect infrastructure changes through state comparisons.</li>
<li>CloudSQL / Cloudflare - Diagnose DB and networking issues. Monitor latency, enforce access patterns, and validate WAF usage.</li>
<li>Secret Management - Audit access to secrets, apply short-lived credentials, and define alerts for abnormal usage.</li>
</ul>

What you'll do

The Senior Site Reliability Engineer will drive reliability and operational maturity for Kubernetes workloads on GKE, focusing on safe rollout patterns and effective incident response. They will collaborate with developers to enhance CI/CD pipelines and address infrastructure concerns within application code.

About Orion Innovation Naukri

Orion Innovation delivers next-generation solutions in Data, AI, Cloud, and Digital Experience, empowering organizations to innovate, scale, and embrace future technologies. With deep software engineering expertise and a strong understanding of industry-specific challenges, we build data-driven products and solutions that enhance customer experiences, accelerate growth, and drive long-term value. Envision what's next. Build what matters. For more information, visit www.orioninc.com.

Frequently Asked Questions

What does a Senior Site Reliability Engineer do at Orion Innovation Naukri?

As a Senior Site Reliability Engineer at Orion Innovation Naukri, you will drive reliability and operational maturity for Kubernetes workloads on GKE, focusing on safe rollout patterns and effective incident response. You will collaborate with developers to enhance CI/CD pipelines and address infrastructure concerns within application code.

Is the Senior Site Reliability Engineer position at Orion Innovation Naukri remote?

The Senior Site Reliability Engineer position at Orion Innovation Naukri is based in Coimbatore, Tamil Nadu, India and Hyderabad, India. Contact the company through Clera for specific work arrangement details.

How do I apply for the Senior Site Reliability Engineer position at Orion Innovation Naukri?

You can apply for the Senior Site Reliability Engineer position at Orion Innovation Naukri directly through Clera. Click the "Apply Now" button above to start your application. Clera's AI-powered platform will help match your profile with this opportunity and guide you through the application process.
