This role is for one of Weekday's clients

Min Experience: 3 years

Location: Bengaluru

Job Type: Full-time

We are looking for a highly driven Technical Lead to work across a multi-product SaaS platform, owning system reliability, scalability, and technical execution. This is a horizontal leadership role spanning multiple products and core systems, ensuring platforms remain fast, secure, and resilient at scale and under peak traffic.

This is a hands-on technical leadership role focused on architecture, reliability, and execution, not people management.

Key Responsibilities

1. System Reliability & Performance (Primary Ownership)

Own and improve reliability metrics across products, including uptime, SLAs, and latency (P95).
Monitor and reduce application errors, bug leakage, and system failures.
Ensure correctness of distributed systems involving synchronous and asynchronous workflows.
Optimize queue processing, worker throughput, and caching layers (e.g., Redis).
Prepare systems for high-traffic events and peak load scenarios.
Lead root cause analysis and drive permanent, systemic fixes.
Act as the technical owner for incident resolution and long-term prevention.

2. Architecture & Scalability

Collaborate with senior technical stakeholders to evolve platform architecture.
Improve API design, data models, and system boundaries.
Design scalable distributed system patterns such as idempotent workflows, retries, batching, and fan-out orchestration.
Build and scale asynchronous pipelines for high-volume workloads.
Plan capacity for traffic spikes and introduce resilience patterns like circuit breakers and fail-safes.

3. Hands-On Engineering Leadership

Lead and review technical designs across teams and products.
Unblock engineers on complex architectural or performance challenges.
Own and drive cross-product refactors and technical debt reduction.
Enforce clean code standards, testing practices, and observability-first development.
Mentor engineers on debugging, system design, and performance optimization.

4. Observability & Monitoring

Define and maintain SLIs and SLOs across critical systems.
Build dashboards, alerts, and monitoring using logs, metrics, and traces.
Ensure issues are detected proactively before impacting users.
Work closely with platform teams to instrument distributed workflows end-to-end.

5. Security & Compliance

Ensure secure coding practices and adherence to compliance requirements (e.g., SOC 2).
Enforce proper secrets management, access controls, and audit logging.
Maintain data integrity, API security, and permission correctness across systems.

6. Cross-Functional Collaboration

Partner with Product teams to translate requirements into technically sound solutions.
Work with Support and Customer Success teams to deeply understand production issues.
Collaborate with Core Systems and Infrastructure teams to improve platform stability.
Align with QA teams to define testing strategies, including load, integration, and failure testing.

Requirements

Must Have

3–4+ years of backend engineering experience (Python preferred).
Strong understanding of distributed systems and backend architecture.
Deep experience with SQL databases, data modeling, and query optimization.
Hands-on expertise with Redis, queues, async jobs, retries, and background processing.
Strong debugging skills across application and infrastructure layers.
Proven ability to lead technical decisions across multiple teams.
Experience improving system reliability and performance at scale.
Excellent communication and collaboration skills.

Nice to Have

Experience with observability tools such as Datadog, Sentry, or Elasticsearch.
Exposure to CRM integrations or large enterprise systems.
Prior ownership of reliability for multi-product SaaS platforms.
Familiarity with secure coding practices and compliance frameworks.

What Success Looks Like

0–3 Months

Gain a deep understanding of platform architecture and core systems.
Deliver quick reliability and performance improvements.
Become a go-to technical problem solver across teams.

4–6 Months

Establish clear SLIs and SLOs for key systems.
Introduce architectural guardrails and reduce operational noise.
Significantly lower error rates and production issues.

7–12 Months

Achieve high availability (99.9%+) across core platforms.
Ensure predictable and resilient async pipelines.
Improve performance under peak traffic conditions.
Enable faster engineering velocity through cleaner, more stable systems.

Skills

Backend Engineering
Distributed Systems
System Reliability
Relational Databases
Platform Scalability
