Stijn Polfliet

5 steps for running a Kubernetes environment at scale

Stop guessing about your cluster's health. This five-layer observability model provides the complete visibility needed to scale Kubernetes with confidence.

#1 about 3 minutes

Understanding the challenges of scaling Kubernetes with confidence

Kubernetes offers flexibility and efficiency, but its dynamic nature makes it complex to manage. Running it with confidence requires a structured approach.

#2 about 3 minutes

Introducing a five-layer model for Kubernetes observability

An overview of the five essential layers for running Kubernetes with confidence, from cluster health to complete service observability.

#3 about 5 minutes

Visualizing cluster health with the Kubernetes Cluster Explorer

A demonstration of how to use a visual tool to identify pod status, resource consumption, and troubleshoot issues like pending pods or crash loops.

#4 about 7 minutes

Monitoring overall cluster health and resource consumption

Use kube-state-metrics and define resource requests and limits to manage cluster capacity and prevent pods from being killed due to memory issues.
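
As a minimal sketch of the requests and limits mentioned above, the following Python builds a Deployment manifest as a plain dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The name `web`, the image `example/web:1.0`, and the CPU and memory figures are illustrative assumptions, not values from the talk.

```python
import json


def build_deployment(name, image, replicas=2):
    """Return a Deployment manifest with explicit resource requests and limits."""
    container = {
        "name": name,
        "image": image,
        "resources": {
            # Requests: what the scheduler reserves on a node for this container.
            "requests": {"cpu": "250m", "memory": "128Mi"},
            # Limits: the ceiling; a container that exceeds its memory limit is OOM-killed.
            "limits": {"cpu": "500m", "memory": "256Mi"},
        },
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical name and image, used only for illustration.
    print(json.dumps(build_deployment("web", "example/web:1.0"), indent=2))
```

Piping the output to `kubectl apply -f -` would create the Deployment. Sensible values depend on the workload, so the numbers here are placeholders to tune against observed usage.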

#5 about 2 minutes

Improving security and performance with small container images

Use purpose-built base images like Alpine instead of generic Linux distributions to reduce image size, improve build times, and minimize security vulnerabilities.

#6 about 4 minutes

Tracking dynamic cluster behavior with events and health checks

Implement readiness and liveness probes to inform Kubernetes about pod health and use an observability platform to correlate events with performance issues.
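
The readiness and liveness probes described above could be attached to a container spec roughly as follows. This is a sketch under assumptions: the helper `with_health_probes`, the port `8080`, the paths `/ready` and `/healthz`, and all timing values are hypothetical and would need to match what the application actually exposes.

```python
import copy
import json


def with_health_probes(container, port=8080, ready_path="/ready", live_path="/healthz"):
    """Return a copy of a container spec with HTTP readiness and liveness probes."""
    result = copy.deepcopy(container)
    # Readiness: while this probe fails, the pod receives no traffic from Services.
    result["readinessProbe"] = {
        "httpGet": {"path": ready_path, "port": port},
        "initialDelaySeconds": 5,
        "periodSeconds": 10,
    }
    # Liveness: after failureThreshold consecutive failures, the kubelet restarts the container.
    result["livenessProbe"] = {
        "httpGet": {"path": live_path, "port": port},
        "initialDelaySeconds": 15,
        "periodSeconds": 20,
        "failureThreshold": 3,
    }
    return result


if __name__ == "__main__":
    # Hypothetical container, used only for illustration.
    base = {"name": "web", "image": "example/web:1.0"}
    print(json.dumps(with_health_probes(base), indent=2))
```

Keeping the two probes on separate endpoints lets an application report "alive but not yet ready" during startup or while a dependency is unavailable.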

#7 about 3 minutes

Correlating log messages for faster troubleshooting

Use a lightweight forwarder like Fluent Bit to centralize logs and correlate them with cluster events and metrics for contextual debugging.

#8 about 9 minutes

Using distributed tracing to map microservice communication

Implement distributed tracing to understand request flows, identify performance bottlenecks between services, and view in-process spans for code-level analysis.
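
To illustrate how a trace follows a request across services, here is a small sketch of context propagation. It assumes the W3C Trace Context `traceparent` header format; the talk summary does not say which propagation format is used, and in practice a tracing agent or SDK would handle this rather than hand-written code. The function names are hypothetical.

```python
import secrets


def new_traceparent():
    """Start a new trace: random 16-byte trace ID and 8-byte span ID, sampled flag set."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    return f"00-{trace_id}-{span_id}-01"


def child_traceparent(parent_header):
    """Derive the header a service sends downstream: same trace ID, fresh span ID."""
    version, trace_id, _parent_span_id, flags = parent_header.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


if __name__ == "__main__":
    incoming = new_traceparent()            # created by the first service
    outgoing = child_traceparent(incoming)  # forwarded to the next service
    print("incoming:", incoming)
    print("outgoing:", outgoing)
```

Because every hop keeps the same trace ID, a backend can stitch the individual spans into one request flow and show where time is spent between services.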

#9 about 11 minutes

Integrating Prometheus for complete service observability

Leverage the Prometheus ecosystem by forwarding metrics to a central platform using remote write or a direct scraper integration for unified dashboarding.

#10 about 9 minutes

Getting started with the New Relic Kubernetes integration

A step-by-step guide to installing the New Relic agent and its components in your cluster using a guided wizard and Helm charts.
