Clera - Your AI talent agent
Datavail

Site Reliability Engineer

full-time • Colorado Springs

Summary

Location

Colorado Springs

Type

full-time

Experience

5-10 years

About this role

You will own reliability for core services across multiple clouds, drive automation, and mentor more junior engineers. You will partner with developer teams to embed resilience into feature delivery.

Responsibilities

- Define and maintain SLIs/SLOs; monitor SLO compliance and error budget usage
- Lead incident response and postmortems, implement corrective measures
- Automate operations tasks via tooling (e.g. auto-remediation, scaling rules)
- Build, improve, and maintain CI/CD pipelines, canary deployments, blue/green strategies
- Lead technical discussions with customers to align on reliability, scalability, and performance requirements
- Drive continuous platform improvements across the service lifecycle, including architecture, monitoring, and operational processes
- Implement and extend observability systems (metrics, tracing, log aggregation)
- Optimize performance and cost by tuning cloud services, autoscaling, resource rightsizing
- Design, deploy, and operate containerized workloads using Docker and Kubernetes in production environments
- Collaborate with dev teams to integrate resilience patterns (circuit breakers, bulkheading)
- Participate in architecture discussions around high availability, disaster recovery
- Mentor mid-level and junior SREs; conduct reliability design reviews

Must-have Qualifications

- 5–8 years of experience in a reliability or operations role
- Cloud-agnostic certification: Terraform Associate, Certified Kubernetes Administrator (CKA), or SRE Foundation
- Cloud provider certification: Professional-level certification in AWS (Solutions Architect), Azure (Solutions Architect Expert), GCP (Professional Cloud Architect), or Oracle Cloud (Architect Professional)
- Solid coding skills (Python, Go, or equivalent)
- Experience with IaC and CI/CD pipelines
- Comfortable with monitoring/observability stacks (Prometheus, Grafana, OpenTelemetry, ELK, Jaeger)
- Experience working with distributed systems and production-scale services

Nice-to-have Skills

- Exposure to multi-cloud data replication or cross-cloud networks
- Experience with chaos engineering or fault injection


Datavail is a leading provider of data management, application development, analytics, and cloud services, with more than 1,000 professionals helping clients build and manage applications and data via a world-class tech-enabled delivery platform and software solutions across all leading technologies. For more than 17 years, Datavail has worked with thousands of companies spanning different industries and sizes, and is an AWS Advanced Tier Consulting Partner, a Microsoft Solutions Partner for Data & AI and Digital & App Innovation (Azure), an Oracle Partner, and a MySQL Partner.

Datavail’s Team of Oracle Experts Can Save You Time and Money

As an Oracle Platinum Partner with 17 specializations, we have extensive experience with everything Oracle. Our experts have an average of 16 years of experience. They’ve overcome every obstacle in helping clients manage everything from databases, BI analytics, reporting, migrations, and upgrades to monitoring and overall data management.

You can free up your IT resources to focus on growing your business rather than fighting fires. Our Oracle experts can guide you through strategic initiatives or support routine database management.


Datavail’s Comprehensive Oracle Database Services

Datavail offers Oracle consulting services that allow you to take advantage of all the features of the Oracle database. We can also assist you in designing, implementing, and managing a wide range of Oracle applications.

Oracle Database Managed Services

Datavail’s business focuses on helping you use your data to drive business results through cost-saving services. The success of your business depends on how well you understand and manage your data. Our Oracle managed cloud services give you the power to unleash your organization’s potential. We provide comprehensive and technically advanced support for Oracle installations to ensure that your databases are safe, secure, and managed with the utmost level of care.

Our delivery performance in data management leads the industry. We offer highly trained Oracle database administrators via a 24×7, always-on, always-available global delivery model. Datavail’s flexible, client-focused services always add value to your organization.

What you'll do

  • The Site Reliability Engineer will own reliability for core services across multiple clouds and drive automation while mentoring junior engineers. They will also partner with developer teams to embed resilience into feature delivery.

About Datavail

Datavail | Data, Cloud & AI: Built for Real Business Outcomes

Datavail is a data, cloud, and AI consultancy that helps organizations turn complex technology environments into clear, measurable business outcomes. We partner with data, technology, and IT leaders to make enterprise data more usable, systems more adaptable, and decisions more informed. Our work sits at the intersection of data management, cloud modernization, enterprise applications, and AI, bringing these disciplines together so they support the business, not slow it down.

In a landscape full of tools, platforms, and transformation promises, Datavail focuses on what actually drives progress:

- Trusted, well-managed data that teams can rely on
- Cloud environments without unnecessary cost or complexity
- Enterprise applications that evolve with the business
- Practical, responsible AI that delivers value, not experiments

We help organizations:

- Improve data quality, accessibility, and governance
- Turn analytics and AI into everyday decision-making tools
- Modernize and optimize cloud and application environments
- Reduce operational risk while increasing agility and performance

Our Core Capabilities:

- Data Management & AI: Data foundations, analytics, AI and machine learning that support real-world decisions
- Cloud Services: Cloud modernization, optimization, SRE services, and license optimization
- Enterprise Applications: Managed services, upgrades & integrations, digital transformation, and implementation services

At Datavail, we believe data only creates value when it’s well managed, well understood, and actively used. Our role is to help organizations move from complexity to clarity, and from data to action.

Frequently Asked Questions

What does a Site Reliability Engineer do at Datavail?

As a Site Reliability Engineer at Datavail, you will own reliability for core services across multiple clouds and drive automation while mentoring junior engineers. You will also partner with developer teams to embed resilience into feature delivery.

Is the Site Reliability Engineer position at Datavail remote?

The Site Reliability Engineer position at Datavail is based in Colorado Springs, Colorado, United States. Contact the company through Clera for specific work arrangement details.

How do I apply for the Site Reliability Engineer position at Datavail?

You can apply for the Site Reliability Engineer position at Datavail directly through Clera. Click the "Apply Now" button above to start your application. Clera's AI-powered platform will help match your profile with this opportunity and guide you through the application process.
© 2026 Clera Labs, Inc.

This role is hosted on Datavail's careers site.