AI Systems Engineer (m/w/d) - LLM Execution & Infrastructure Optimization

Deutsches Krebsforschungszentrum

Heidelberg, Neckar, Baden-Württemberg, Germany
Published Sep 30, 2025
Full-time
Permanent

Job Summary

This AI Systems Engineer role supports cancer research by managing and optimizing the technical infrastructure behind Large Language Models (LLMs) at the German Cancer Research Center (DKFZ). The core mission is to design, implement, and scale high-end GPU infrastructure that delivers performant, scalable generative AI services to all users. Day-to-day work includes optimizing LLM inference with techniques such as quantization and KV caching, applying hardware-specific optimizations on NVIDIA GPU platforms (CUDA), and operating secure, scalable API gateways (e.g., Kong, KrakenD). The engineer also automates operations with tools such as Ansible and implements comprehensive monitoring with Prometheus/Grafana. Candidates need a Master's degree in Computer Science and deep practical experience with LLM inference engines and container orchestration, contributing directly to the DKFZ's scientific mission.
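To illustrate the kind of inference-optimization work described above, the following is a minimal sketch (not part of the posting) of serving a model with the vLLM engine using weight quantization, an FP8 KV cache, and tensor parallelism. The model name, parallelism degree, and prompt are assumptions made purely for this example.

# Illustrative sketch only: serving an LLM with vLLM using quantization,
# an FP8 KV cache, and tensor parallelism across two NVIDIA GPUs.
# Model name, parallelism degree, and prompt are assumptions for the example.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2.5-7B-Instruct-AWQ",  # assumed AWQ-quantized checkpoint
    quantization="awq",                    # weight quantization to reduce GPU memory use
    kv_cache_dtype="fp8",                  # compress the KV cache to fit longer contexts
    tensor_parallel_size=2,                # shard the model across 2 GPUs
)

sampling = SamplingParams(temperature=0.2, max_tokens=256)
outputs = llm.generate(["Summarize the role of KV caching in LLM inference."], sampling)
print(outputs[0].outputs[0].text)

In practice, the effect of such settings on latency and throughput would be validated against the Prometheus/Grafana monitoring mentioned above.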

Required Skills

Education

Master's degree in Computer Science or a related field

Experience

  • Practical experience with LLM inference engines (vLLM, Ollama)
  • Practical experience with optimization techniques (Quantization, KV-Caching, Parallelization)
  • Practical experience with current LLM technologies (Mixture of Experts, Reasoning) as well as RAG (Retrieval-Augmented Generation) and MCP (Model Context Protocol)
  • Professional experience in Linux system administration
  • Professional experience with Cloud and Container technologies (OpenStack, Kubernetes, Docker/Podman)
  • Experience in DevOps processes (GitLab, CI/CD Pipelines) (Desirable)

Languages

German (Fluent), English (Fluent)

Additional

  • Must provide proof of immunity against measles, as required by the Infektionsschutzgesetz (German Infection Protection Act, IfSG).