Site Reliability Engineering Lead (f/m/d)

Siemens AG

Erlangen, Bavaria, Germany
Published Oct 21, 2025
Full-time
Permanent

Job Summary

The Site Reliability Engineering Lead is responsible for defining and executing the CloudOps and Site Reliability Engineering (SRE) strategy within the company's cloud ecosystem. The successful candidate will drive SRE transformations and ensure the reliability, availability, security, and performance of production environments. Daily responsibilities include managing Level 3 operational support, leading incident triage and root cause analysis, and defining and tracking key reliability metrics such as SLIs, SLOs, and SLAs. You will operate and optimize cloud infrastructure, harmonize observability and alerting across tools such as CloudWatch, Datadog, Prometheus, and Grafana, and coordinate on-call incident response using PagerDuty. Key requirements include a Master's degree in Computer Science or Engineering, long-term experience in CloudOps/SRE and in managing SaaS product operations, deep AWS knowledge, and excellent English skills. This position offers the opportunity to shape cloud reliability practices for smart infrastructure solutions.

Required Skills

Education

Master's degree in Computer Science, Engineering, or a related field

Experience

  • Long-term experience in CloudOps and Site Reliability Engineering
  • Long-term experience in managing operations for a SaaS product
  • Proven experience in SRE transformations
  • Extensive experience with incident management and escalation processes
  • Experience with monitoring, logging, and alerting tools
  • Experience with ITIL-based support processes and ticketing systems

Languages

English (Fluent)

Additional

  • On-call availability on weekends