Clera - Your AI talent agent
Zuora

Senior Site Reliability Engineer

full-time•Chennai

Summary

Location

Chennai

Type

full-time

Experience

10+ years


About this role

Company Overview

At Zuora, we do Modern Business (https://www.zuora.com/modern-business/). We’re helping people subscribe to new ways of doing business that are better for people, companies and ultimately the planet. It’s an approach resulting from the shift to the Subscription Economy that puts customers first by building recurring relationships instead of one-time product sales and focuses on sustainable growth. Through our leading expertise and multi-product suite, we are transforming all industries and working with the world’s most innovative companies to monetize new business models, nurture subscriber relationships and optimize their digital experiences.

The Team & Role

Zuora’s Cloud Engineering organization owns the reliability, scalability, and operational excellence of our global, customer-facing SaaS platforms. Operating across the US, India, Beijing, Costa Rica, and remote locations, we use a follow-the-sun model to deliver 24x7x365 reliability for mission-critical systems. The team partners closely with Engineering, Security, Customer Support, Global Services, and Product to ensure customer trust, platform resilience, and operational efficiency.

We are seeking a Senior Site Reliability Engineer to play a technical leadership role in advancing Zuora’s reliability strategy with a strong focus on AI-driven automation and intelligent operations. This role goes beyond execution and requires ownership of complex systems, definition of new approaches, and influence across teams. The ideal candidate brings deep SRE expertise combined with an AI-centric mindset to design, build, and operationalize intelligent automation at scale.

Our Tech Stack: AWS, Microservices, Kafka, Kubernetes, Terraform, Jenkins, Puppet, Python, Linux

AI & Automation Focus: AI-assisted operations, intelligent alerting, auto-remediation, predictive reliability, workflow automation

What you’ll do

Reliability Architecture & Platform Strategy: Own and evolve the reliability architecture of large-scale, distributed SaaS systems by defining SLOs, SLIs, error budgets, and resilience patterns aligned with business objectives. Drive system-level improvements across services and regions, proactively identifying architectural risks, capacity constraints, and failure modes, while influencing platform and application design to improve long-term reliability and operability.

AI-Driven Automation & Intelligent Operations: Design, build, and operationalize AI-powered automation to reduce operational toil and improve system stability. Apply AI and machine learning techniques to incident detection, anomaly identification, root cause analysis, auto-remediation, and capacity forecasting, enabling proactive and predictive reliability management at scale.

Advanced Cloud & Infrastructure Engineering: Lead the design and operation of complex AWS-based infrastructure and Kubernetes platforms, optimizing for availability, security, and cost efficiency. Define advanced Infrastructure-as-Code patterns using Terraform and configuration management tools to support scalable, repeatable, and policy-driven environments across multiple stages and regions.

Incident Leadership & Operational Excellence: Act as a technical leader during high-severity production incidents, driving structured response, decision-making, and recovery. Establish intelligent incident response mechanisms using automation, AI-assisted diagnostics, and enriched runbooks, while leading deep post-incident analysis focused on systemic improvements rather than short-term fixes.

Technical Leadership & Cross-Functional Influence: Influence reliability outcomes beyond the SRE team by partnering closely with Engineering, Product, and Security stakeholders. Provide technical mentorship to senior and mid-level engineers, guide adoption of best practices, and contribute to the development of new reliability standards, tooling strategies, and operational policies across the organization.

Your experience

Required Qualifications

  • 8+ years of hands-on experience in Site Reliability Engineering, DevOps, or large-scale production operations.
  • Advanced expertise in AWS, including architecture design across services such as EC2, EKS, VPC, IAM, RDS, S3, and CloudWatch.
  • Deep experience with Infrastructure-as-Code using Terraform, including complex modules, state management, and governance.
  • Strong programming and automation skills using Python and Shell; experience building production-grade automation systems.
  • Expert-level Linux systems knowledge, including performance tuning, security hardening, and deep troubleshooting.
  • Proven experience operating distributed systems and data streaming platforms such as Kafka in high-throughput environments.
  • Demonstrated ability to work independently on complex, ambiguous problems with broad organizational impact.
  • Proven technical leadership experience driving large, cross-team reliability or infrastructure initiatives, including setting technical direction, influencing design decisions, and mentoring engineers to deliver measurable outcomes at scale.

AI & Automation Expertise

  • Practical experience designing or implementing AI/ML-driven automation in operations, reliability, or platform engineering.
  • Experience integrating AI capabilities into monitoring, alerting, incident response, or workflow automation systems.
  • Strong understanding of how AI can be safely and effectively applied in production environments.

Nice to haves:

  • Experience with advanced observability platforms (Prometheus, Grafana, ELK, or similar) enhanced with AI-driven insights.
  • Familiarity with predictive analytics, anomaly detection, or AIOps platforms.
  • Experience influencing architectural decisions at a platform or product level.
  • Prior experience operating in a 24/7, global, high-availability SaaS environment.

#ZEOLife at Zuora

As an industry pioneer, our work is constantly evolving and challenging us in new ways that require us to think differently, iterate often and learn constantly. It’s exciting. Our people, whom we refer to as “ZEOs”, are empowered to take on a mindset of ownership and make a bigger impact here. Our teams collaborate deeply, exchange different ideas openly and together we’re making what’s next possible for our customers, community and the world.

As part of our commitment to building an inclusive, high-performance culture where ZEOs feel inspired, connected and valued, we support ZEOs with:

  • Competitive compensation, variable bonus and performance reward opportunities, and retirement programs
  • Medical Insurance
  • Generous, flexible time off
  • Paid holidays, “wellness” days and company-wide end-of-year break
  • 6 months fully paid parental leave
  • Learning & Development stipend
  • Opportunities to volunteer and give back, including charitable donation match
  • Free resources and support for your mental wellbeing

Specific benefits offerings may vary by country and can be viewed in more detail during your interview process.

Location & Work Arrangements

Organizations and teams at Zuora are empowered to design efficient and flexible ways of working, being intentional about scheduling, communication, and collaboration strategies that help us achieve our best results. In our dynamic, globally distributed company, this means balancing flexibility and responsibility: flexibility to live our lives to the fullest, and responsibility to each other, to our customers, and to our shareholders. For most roles, we offer the flexibility to work both remotely and at Zuora offices.

Our Commitment to an Inclusive Workplace

Think, be and do you! At Zuora, different perspectives, experiences and contributions matter. Everyone counts. Zuora is proud to be an Equal Opportunity Employer committed to creating an inclusive environment for all.

Zuora does not discriminate on the basis of, and considers individuals seeking employment with Zuora without regard to, race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics.

We encourage candidates from all backgrounds to apply. Applicants in need of special assistance or accommodation during the interview process or in accessing our website may contact us by sending an email to assistance(at)zuora.com.

What you'll do

  • The Senior Site Reliability Engineer will own and evolve the reliability architecture of large-scale SaaS systems and design AI-powered automation to improve system stability. This role requires technical leadership during high-severity incidents and collaboration with cross-functional teams to influence reliability outcomes.

About Zuora

Zuora was born out of a vision that we could evangelize a fundamentally new way of doing business by shifting the focus of companies to deliver recurring, people-centric services instead of a one-time sale of products. This is how we coined the term, the Subscription Economy®. Today, we see others evangelizing this term, and building entire communities around it. The Subscription Economy isn’t (and never was) just about subscription business models, but about direct, recurring relationships with customers through any business model. Subscriptions were only scratching the surface, and now the market recognizes the Subscription Economy for what it truly is: a relationship-centric economy. Companies have realized that the path to growth going forward is to establish direct, digital relationships with their customers, and to nurture and monetize these relationships through an ever-growing set of digital services. Zuora has been there every step of the way in this evolution. We started with Zuora Billing, and have expanded our award-winning multi-product portfolio to include Zuora Revenue, Zuora Payments and Zuora Platform. More recently, we’ve added Zephr and Togai to our family, further expanding our capabilities to serve as an intelligent hub that monetizes the complete quote-to-cash and revenue recognition process at scale. We call this Monetization.


Frequently Asked Questions

What does a Senior Site Reliability Engineer do at Zuora?

As a Senior Site Reliability Engineer at Zuora, you will own and evolve the reliability architecture of large-scale SaaS systems and design AI-powered automation to improve system stability. The role requires technical leadership during high-severity incidents and collaboration with cross-functional teams to influence reliability outcomes.

Is the Senior Site Reliability Engineer position at Zuora remote?

The Senior Site Reliability Engineer position at Zuora is based in Chennai, Tamil Nadu, India. Contact the company through Clera for specific work arrangement details.

How do I apply for the Senior Site Reliability Engineer position at Zuora?

You can apply for the Senior Site Reliability Engineer position at Zuora directly through Clera. Click the "Apply Now" button above to start your application. Clera's AI-powered platform will help match your profile with this opportunity and guide you through the application process.
© 2026 Clera Labs, Inc.

This role is hosted on Zuora's careers site.