Senior On-Premise Platform Engineer

Ladybird

Failsworth, United Kingdom

12 days ago

Role details

Contract type

Temporary contract

Employment type

Part-time (≤ 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Compensation

£ 101K

Job location

Remote

Failsworth, United Kingdom

Tech stack

Artificial Intelligence

Cloud Computing

Concurrency Controls

Continuous Integration

Data Centers

Linux

DevOps

Disaster Recovery

Failover

Prometheus

Data Streaming

Management of Software Versions

Data Logging

Grafana

GitLab

Build Management

GitLab CI

Kubernetes

Bare Metal

Kafka

Terraform

Job description

On-Prem & Leased Data-Centre Platform Ownership

Design and build on-premise infrastructure hosted in leased UK data centres
Architect and operate bare-metal Kubernetes clusters (control plane + workers)
Own compute, networking, storage, Linux OS, and platform architecture
Design platforms capable of 99.99% availability
Plan and execute capacity management, failover, and disaster recovery
Operate GPU-enabled infrastructure for AI inference and training
Build systems suitable for NHS 999 and emergency communications workloads

Kubernetes, CI/CD & Automation (GitLab)

Design and maintain GitLab CI/CD pipelines (build, test, deploy)
Automate infrastructure provisioning (Terraform / IaC), Kubernetes deployments, and AI model and application releases
Implement GitOps workflows
Own day-2 operations, including upgrades, patching, and rollbacks
Minimise deployment risk in safety-critical environments

Real-Time Streaming & Telecoms Systems

Build and operate Kafka-based streaming platforms
Support sub-second latency event processing
Design for traffic spikes, back-pressure, and failure scenarios
Ensure predictable behaviour under 999 call surges
Optimise systems for latency, throughput, and resilience

MLOps & AI Platform Infrastructure

Operate production AI inference platforms (KServe, Seldon, Triton, or similar)
Enable GPU scheduling, isolation, and concurrency controls
Support model versioning, retraining pipelines, and lifecycle management
Implement canary releases, versioned deployments, and safe rollback paths
Work closely with AI engineers, retaining platform ownership

Reliability, Security & NHS Compliance

Build observability using Prometheus, Grafana, and centralised logging
Define and monitor SLIs, SLOs, latency, uptime, and error budgets
Lead incident response and root-cause analysis
Implement least-privilege access, secrets management, and audit controls
Harden platforms for NHS, telecoms, and regulated environments

Requirements

Do you have experience in Terraform?

We are seeking a Principal Platform Engineer with 7+ years of hands-on experience designing, building, and operating on-premise, bare-metal platforms in leased data-centre environments.

Candidates must meet most of the following:

7+ years hands-on platform / infrastructure engineering experience
Proven experience building on-prem or private-cloud platforms
Experience operating leased data-centre infrastructure
Bare-metal Kubernetes (self-managed, not EKS / AKS / GKE)
Strong Linux, networking, and storage fundamentals
GitLab CI/CD pipeline design and ownership
Experience with telecommunications or NHS environments
Ownership of production systems with strict uptime requirements

Strongly Preferred

NHS 999, emergency services, or healthcare platforms
Telecommunications background (BT, Vodafone, carrier networks)
Kafka and real-time streaming in production
GPU-based AI inference workloads
Terraform and Infrastructure as Code
Experience in regulated, mission-critical environments

What This Role Is Not

Cloud-only DevOps
Data science or ML research
Junior or mid-level engineering
Platform consumption or inherited systems

If you have 7+ years of experience delivering telecoms-grade or NHS-grade platforms and are comfortable owning systems where failure is not an option, we want to hear from you.

Hands-on platform engineering, including building on-prem: 7 years (required)

Benefits & conditions

Job Types: Part-time, Permanent, Temporary, Fixed term contract, Temp to perm, Zero hours contract, Volunteer, Internship

Contract length: 12-18 months

Pay: £48,973.65-£100,679.17 per year

Expected hours: 10 - 20 per week

Benefits: