Site Reliability Engineer

Summary

Location: Stoke-on-Trent

Type: Full-time

About this role

Company Description

At bet365, we're one of the world's leading online gambling companies, revolutionising the industry since 2000. Founded by Denise Coates CBE, we now employ over 9,000 people and serve over 100 million customers in 27 languages. Our focus on In-Play betting has solidified our market-leading position, offering an unmatched experience across 96 sports and 700,000 streaming events. With over 750 concurrent sporting fixtures at peak and more live sports streamed than anyone else in Europe, we handle over 6 billion HTTP requests daily and process more than 2 million bets per hour at peak.

We empower our employees to push boundaries and explore new ideas, cultivating a culture that celebrates and rewards creativity. This gives employees a wealth of opportunities to grow and to make a real impact in the world of online gambling. As a forward-thinking company, we’re breaking new ground in software innovation too, redefining what’s possible for our customers worldwide.

Job Description

As a Site Reliability Engineer, you will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices.

You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and availability of critical systems, directly impacting operational efficiency.

Using your engineering expertise, you will implement solutions that enhance reliability, including instrumenting services with tools such as OpenTelemetry, improving logging practices and developing features for maintainability. You will also help engineer tools and automation for effective service management.

Collaboration is key: you will work across multiple functions to integrate reliability and observability best practices into the software development life cycle. By supporting governance standards set by the central teams, you will foster a culture where these principles are integral to development. Your contributions will ensure our systems meet user demands and enhance overall service performance.

This role is eligible for inclusion in the Company’s hybrid working from home policy.

Qualifications

  • Excellent knowledge of Site Reliability Engineering principles, including the creation and management of effective Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for reliability and customer satisfaction.
  • Knowledge of contemporary observability tools, techniques and best practice, including Splunk, New Relic, Grafana and PagerDuty.
  • Excellent knowledge of programming languages including Python, Golang and JavaScript.
  • Knowledge and experience of modern software development techniques and lifecycles.
  • Experience with Infrastructure as Code (IaC) automation and orchestration tools such as Ansible and Terraform.
  • Prior experience working in a large-scale, 24/7 enterprise where system uptime and stability are of paramount importance to the Business.
  • Keen interest in industry trends, particularly Platform Engineering.
  • Proficiency in shell scripting for automation and system management tasks.

Additional Information

  • Writing and contributing to code that enhances the reliability and observability of services, including telemetry, operational APIs and tooling.
  • Developing and maintaining tools that facilitate effective management of our systems, ensuring they are operationally efficient and resilient.
  • Working with automation and orchestration platforms to automate manual activity and reduce toil.
  • Building sophisticated dashboards using a range of telemetry data and dashboarding technologies like Grafana, Splunk and New Relic.
  • Maintaining and administering existing monitoring and analytic toolsets.
  • Mentoring colleagues in the use of new technologies or practices.
  • Actively participating in live incident resolution and post-mortem analysis, providing effective remediation strategies to improve overall system health and prevent future issues.
  • Driving initiatives to enhance system reliability and observability, contributing to a culture of continuous improvement.
  • Collaborating with the central Site Reliability Engineering and Observability teams to establish and uphold standards for reliability and observability, assisting teams in adhering to these practices.
  • Working with IT Operations, providing and supporting the use of critical tooling to enable increasing levels of value to the Business.

By applying to us you are agreeing to share your Personal Data in accordance with our Recruitment Privacy Notice - https://www.bet365careers.com/privacy-policy

At bet365, we're committed to creating an environment where everyone feels welcome, respected and valued, and where all individuals can grow and develop, regardless of their background. We're Never Ordinary, and we're always striving to be better. If you need any adjustments or accommodations to the recruitment process, at either application or interview, please don’t hesitate to reach out.

  • Workplace Type: Hybrid
  • Department: Platform Engineering
  • Full Time/Part Time: Full Time
  • Shift Pattern: Days
  • 2nd Office Location: UK - Manchester
  • Job Type: Standard
  • Other facts

    Tech stack
    Site Reliability Engineering, Service Level Indicators, Service Level Objectives, Observability Tools, Python, Golang, JavaScript, Infrastructure as Code, Ansible, Terraform, Shell Scripting, Automation, Monitoring, Incident Resolution, Continuous Improvement, Collaboration

    Team size: 5,001-10,000 employees
    Industry: Gambling Facilities and Casinos

    What you'll do

    • As a Site Reliability Engineer, you will enhance system reliability, observability, and performance while assisting with incident resolution. You will implement solutions that improve reliability and develop tools for effective service management.

    Frequently Asked Questions

    What does a Site Reliability Engineer do at bet365?

    As a Site Reliability Engineer at bet365, you will enhance system reliability, observability and performance while assisting with incident resolution. You will implement solutions that improve reliability and develop tools for effective service management.

    Why join bet365 as a Site Reliability Engineer?

    bet365 is a leading Gambling Facilities and Casinos company.

    Is the Site Reliability Engineer position at bet365 remote?

    The Site Reliability Engineer position at bet365 is based in Stoke-on-Trent, England, United Kingdom. Contact the company through Clera for specific work arrangement details.

    How do I apply for the Site Reliability Engineer position at bet365?

    You can apply for the Site Reliability Engineer position at bet365 directly through Clera. Click the "Apply Now" button above to start your application. Clera's AI-powered platform will help match your profile with this opportunity and guide you through the application process. You can also learn more about bet365 on their website.