Site Reliability Engineer

Helsing

Paris, France

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English, German

Experience level

Senior

Job location

Paris, France

Tech stack

Java

API

Artificial Intelligence

Bash

C++

Cloud Computing

Databases

Data Structures

Software Debugging

Linux

Distributed Systems

Python

Machine Learning

Network Protocols

Reliability Engineering

Ansible

Prometheus

Software Engineering

Systems Integration

Rust

Scripting (Bash/Python/Go/Ruby)

Istio

Grafana

Reliability of Systems

Build Management

Templating

Kubernetes

Information Technology

Machine Learning Operations

Terraform

Legacy Systems

Job description

Much of our work takes place in high-security on-premises environments, and we are looking for a Site Reliability Engineer to support them. In this role you will design, implement, and manage our on-premises Kubernetes infrastructure. We are looking for engineers with a strong work ethic and prioritisation skills. We value team players who communicate clearly, share knowledge generously, and collaborate effectively to move their team and our mission forward.

Responsibilities

Design and build cloud-native infrastructure platforms on-premises, focusing on Kubernetes-based solutions that enable our development teams to operate services at scale.

Create robust observability frameworks using Grafana, Prometheus, and distributed tracing to ensure system reliability and performance.

Architect and implement secure, multi-tenant Kubernetes clusters with strong access controls, policy-as-code governance, and zero-trust networking between red and black network domains.

Develop operators and controllers to automate infrastructure provisioning and compliance.

Build and maintain MLOps platforms enabling AI researchers to deploy, monitor, and scale machine learning models in production.

Collaborate closely with our Security teams to implement supply chain security, container scanning, and runtime protection across our cloud-native stack.

Software only matters when it is actively being used and making a difference. As a Deployed AI Engineer, you will be at the heart of this: taking state-of-the-art software and integrating it into complex systems. You will define and deliver the end-to-end outcome, and every step on the way, bringing novel capabilities to some of the most challenging environments around. For example, you collaborate directly with customer avionics engineers to understand the data structures and APIs used in aircraft mission control systems. Instead of being scared, you embrace the complexity of unfamiliar databases, APIs, or network protocols; you dig into the specification and use your creativity and ingenuity to implement adapters for integrating them with Helsing's cloud infrastructure. By working closely with our partners and customers (both metaphorically and quite literally), you continuously evolve, improve, and operate Helsing software. You own the outcome end-to-end.

Responsibilities

Discover and formalise customer requirements, identify bugs, and ship the latest features directly to our users in close collaboration with product teams. Coordinate between customers, partners, and Helsing engineers to showcase Helsing's software and AI capabilities in simulations or with real systems.

Requirements

Scripting: experience in Python, Go, Rust, or Bash/Shell for automation and tooling. Experience with GitOps workflows and CI/CD automation.

Kubernetes Expertise: deep experience operating production Kubernetes clusters, writing custom controllers/operators, and implementing service mesh architectures (Istio/Linkerd).

Cloud-Native Technologies: hands-on experience with the CNCF ecosystem, e.g., Helm, ArgoCD, Flux, and container runtime security tools like Falco.

Observability Stack: expert-level knowledge of Grafana, Prometheus, Loki, Tempo, and OpenTelemetry. Experience building custom dashboards, alerts, and SLI/SLO frameworks.

Networking: expert understanding of networking concepts, protocols, and security.

MLOps Platforms: experience with Kubeflow, MLflow, or similar platforms.

Infrastructure as Code: proficiency with Terraform, Ansible, and Kubernetes manifest templating. Experience with policy-as-code tools like OPA/Gatekeeper.

System Administration: deep understanding of Linux/Unix system administration and highly available, distributed systems.

Comfortable building out data and telemetry pipelines for debugging and future-proofing solutions.

You should apply if you

Have a high level of personal integrity, reliability, and attention to detail.

Have a software engineering mindset with a passion for building platforms and tools that multiply developer productivity.

Have experience running cloud-native workloads in on-premises or air-gapped environments.

Are willing to relocate to Munich, London, or Paris.

Have a degree in computer science, software engineering, electrical engineering, or other relevant fields.

Have a broad understanding of computer systems and use them creatively; this is the Swiss army knife in Helsing's toolbox of world-class talent.

Prefer asking questions over stipulating answers.

Love figuring things out, digging deeper and deeper into unfamiliar systems until it finally clicks and you understand how they work.

Are not afraid of legacy systems and would rather make them work than give up and try to rewrite them all.

Are comfortable with scripting languages such as Bash or Python and have experience with software engineering in C++, Java, Rust, or similar.

Can navigate and configure Linux systems and have an understanding of network stacks and database systems.

If applying for Germany, it is a requirement that you are able to speak business level or fluent German.

Benefits & conditions

Competitive compensation and stock options.

Relocation support.

Social and education allowances.

Regular company events and all-hands to bring together employees as one team across Europe.

A hands-on onboarding program (affectionately labelled "Infraduction"), in which you will be building tooling and applications to be used across the company. This is your opportunity to learn our tech stack, explore the company, and learn how we get things done - all whilst working with other engineering teams from day one.

About the company

Helsing is a defence AI company. Our mission is to protect our democracies. We aim to achieve technological leadership, so that open societies can continue to make sovereign decisions and control their ethical standards. As democracies, we believe we have a special responsibility to be thoughtful about the development and deployment of powerful technologies like AI. We take this responsibility seriously. We are an ambitious and committed team of engineers, AI specialists and customer-facing programme managers. We are looking for mission-driven people to join our European teams - and apply their skills to solve the most complex and impactful problems. We embrace an open and transparent culture that welcomes healthy debates on the use of technology in defence, its benefits, and its ethical implications.