Site Reliability Engineer (f/m/d) Infrastructure (Storage) / 1220

IONOS SE

Karlsruhe, Baden, Baden-Württemberg, Germany
Published Oct 31, 2025
Full-time
Permanent

Job Summary

This Site Reliability Engineer role focuses on ensuring the performance, reliability, and scalability of storage infrastructure through architectural reviews and advanced automation. The successful candidate will develop robust automation for storage provisioning, monitoring, and scaling using tools such as Ansible, SaltStack, and Terraform, along with languages such as Python and Go. Key responsibilities include implementing self-healing and alerting mechanisms, establishing observability (metrics, logs, tracing) for storage systems, and participating in the on-call rotation to analyze and resolve complex performance issues. A strong background in Linux system engineering and storage infrastructure is essential, as is deep knowledge of RDMA, InfiniBand, and RoCE. This is a full-time, permanent position based in Berlin with a hybrid work model; candidates must consent to a security screening.

Required Skills

Education

Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related technical field (Nice-to-have)

Experience

  • At least 5 years of experience in Linux system engineering, storage infrastructure, or SRE roles
  • Professional experience with Linux MD-RAID (mdadm) and LVM
  • Practical experience in Linux performance tuning and debugging the network stack (ethtool, perf, tcpdump, ibstat, ibtop)
  • Proficiency with configuration management tools such as SaltStack or Ansible
  • Experience with monitoring solutions such as Prometheus, Loki, and Grafana

Languages

Not specified

Additional

  • Candidates must consent to a security screening at the end of the application process
  • Location constraint: Berlin
  • Permanent contract