Lead Site Reliability Engineer
TechNET IT Recruitment Ltd - Bournemouth, South West England
Job Description
My client is transforming observability with a modern, full-stack platform that delivers logs, metrics, traces, and security monitoring, cutting costs by up to 70% while boosting efficiency.

They are looking for a Lead SRE to own and elevate their Alerting & Incident Management platform. You’ll be the driving force behind reliability, customer satisfaction, and product excellence, ensuring smooth alert management, fewer engineering interruptions, and a best-in-class incident response experience. This role blends technical depth, customer impact, and product strategy, and suits someone who thrives at the intersection of engineering, incident response, and product innovation.

What You’ll Do
- Champion customer experience by speeding up alert resolution and reducing interruptions for engineers.
- Build solutions to common pain points, shaping roadmaps, documentation, and technical knowledge.
- Develop benchmarking tools to improve performance, reliability, and scalability.
- Stay ahead of incident management trends to drive new workflows and product improvements.
- Mentor teams and lead with clear, impactful communication.

What We’re Looking For
- 5+ years in software engineering, DevTools, or infrastructure.
- Strong expertise in incident management, alert routing, and large-scale orchestration.
- SaaS or incident management platform experience (PagerDuty, OpsGenie, or similar is a plus).
- Solid technical foundation in cloud and distributed systems.
- Excellent communicator, comfortable working across US/IL time zones.
- Bonus: leadership experience, SRE/DevOps background, knowledge of SLO/SLA practices.
Created: 2025-09-10