Head of SRE Production Support
Huxley Associates - London, England
Job Description
We have a current opportunity for a Head of SRE Production Support on a permanent basis. The position will be based in London. For further information about this position please apply.

Requirements
15+ years in IT across operations, infrastructure, development, or support - breadth matters as much as depth
5+ years owning production management in or alongside a trading or front office environment (any asset class) - you understand what a P1 costs a desk at 09:30
Full incident lifecycle ownership: detection, triage, cross-team communication, resolution, post-mortem, and permanent preventive action
Cross-team incident command under sustained pressure - simultaneous, coherent communication to traders, engineers, vendors, and senior management
SLA, SLO, and error budget design, ownership, and enforcement with internal engineering teams and external counterparties
DR and BCP design and testing: runbooks, failover playbooks, and RTOs that are tested under realistic conditions, not just documented
Observability strategy: monitoring, alerting, and log pipeline design - you define what good looks like and hold teams to it
Capacity planning and infrastructure cost management balancing availability targets against business constraints
Vendor, exchange, and broker relationship management: SLA negotiation, escalation frameworks, proactive dependency risk management
Experience building an operations function from scratch - hiring, process design, tooling selection, and culture
Root cause culture: structured analysis (5 Whys, fault trees) that translates directly into engineering backlog and systemic improvement
Cloud infrastructure - Azure preferred, AWS considered; IAM, managed services, automated and auditable deployment pipelines, secrets management

Nice to Have
Financial market connectivity - exchange feed management, broker API integration, clearing and settlement systems
ITIL, SRE, or equivalent framework adoption; experience introducing error budgets into an engineering organisation
Python or Bash scripting for operational automation; building internal tooling that makes the ops team itself faster
Regulatory reporting obligations or audit trail requirements in a financial services environment

What We're Looking For
You have been on the 2am call. You found it, fixed it, documented it, and made sure it never happened the same way again. You know that operational excellence in a trading firm is a competitive weapon - and you want to build the best

To find out more about Huxley, please visit
Created: 2026-03-10