Aarno Aukia
DevOps for AI: running LLMs in production with Kubernetes and KubeFlow
#1 (about 3 minutes)
Applying DevOps principles to machine learning operations
The maturation of software operations, from reactive firefighting to automated DevOps, provides a model for improving today's MLOps practices.
#2 (about 3 minutes)
Defining AI, machine learning, and generative AI
AI is a broad concept that has evolved through machine learning and deep learning to the latest trend of generative AI, which can create new content.
#3 (about 4 minutes)
How large language models generate text with tokens
LLMs work by converting text into numerical tokens and then using a large statistical model to predict the most probable next token in a sequence.
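The prediction step described above can be sketched in miniature (the vocabulary and logit values below are invented for illustration, not from the talk or any real model): the model scores every candidate token, a softmax turns the scores into probabilities, and greedy decoding picks the most probable next token.

```python
import math

# Toy vocabulary and model "logits" for the context "The cat sat on the" --
# illustrative numbers only, not from any real model.
vocab = ["mat", "dog", "moon", "keyboard"]
logits = [4.1, 1.2, 0.3, 2.0]

def softmax(scores):
    """Convert raw scores into a probability distribution."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
# Greedy decoding: pick the highest-probability next token.
next_token = vocab[max(range(len(vocab)), key=lambda i: probs[i])]
print(next_token)  # -> mat
```

Real models repeat this loop, appending each chosen token to the context before predicting the next one.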
#4 (about 2 minutes)
Using prompt engineering to guide LLM responses
Prompt engineering involves crafting detailed instructions and providing context within a prompt to guide the LLM toward a desired and accurate answer.
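A minimal sketch of that idea (the role names follow the common chat-message convention; the instruction wording and example context are hypothetical): put the instructions in a system message and the context plus question in a user message.

```python
def build_prompt(context: str, question: str) -> list[dict]:
    """Assemble a chat-style prompt: instructions first, then context and question."""
    system = (
        "You are a support assistant. Answer only from the provided context. "
        "If the context does not contain the answer, say you don't know."
    )
    user = f"Context:\n{context}\n\nQuestion: {question}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_prompt("Our SLA promises 99.9% uptime.", "What uptime do we guarantee?")
print(messages[1]["content"])
```

Keeping instructions separate from user-supplied text also makes the later injection defenses easier to apply.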
#5 (about 2 minutes)
Understanding and defending against prompt injection attacks
User-provided input can be manipulated to bypass instructions or extract sensitive information, requiring defensive measures against prompt injection.
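As a toy sketch of one such defensive measure (string matching alone is nowhere near sufficient in practice; the phrase list and delimiter tags are invented): fence untrusted input in delimiters and flag obvious instruction-override phrases before the text reaches the model.

```python
# Naive defensive sketch -- real defenses need layered controls (input
# validation, output filtering, least-privilege tools), not just string checks.
SUSPICIOUS = [
    "ignore previous instructions",
    "disregard the above",
    "reveal your system prompt",
]

def wrap_user_input(text: str) -> str:
    """Reject obvious override phrases, then fence the input as data."""
    lowered = text.lower()
    if any(phrase in lowered for phrase in SUSPICIOUS):
        raise ValueError("possible prompt injection detected")
    # Delimiters let the system prompt tell the model to treat this as data,
    # not as instructions.
    return f"<user_input>\n{text}\n</user_input>"

print(wrap_user_input("What is your refund policy?"))
```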
#6 (about 3 minutes)
Advanced techniques like RAG and model fine-tuning
Beyond basic prompts, you can use Retrieval-Augmented Generation (RAG) to add dynamic context or fine-tune a model with specific data for better performance.
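The RAG half of that can be sketched end to end (word-overlap scoring stands in for a real embedding search, and the document snippets are invented): retrieve the most relevant snippets, then splice them into the prompt as context.

```python
# Toy retrieval-augmented generation: rank documents by shared words with
# the query, then build a context-grounded prompt.
docs = [
    "KServe serves models on Kubernetes with autoscaling.",
    "Ollama runs open-source LLMs on a laptop.",
    "Prometheus scrapes metrics from services.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the top-k documents by word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_rag_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_rag_prompt("How does KServe run models on Kubernetes?"))
```

Production systems swap the overlap score for vector similarity over embeddings, but the retrieve-then-prompt shape is the same.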
#7 (about 5 minutes)
Choosing between cloud APIs and self-hosted models
LLMs can be consumed via managed cloud APIs, which are simple but opaque, or by self-hosting open-source models for greater control and data privacy.
#8 (about 2 minutes)
Streamlining local development with the Ollama tool
Ollama simplifies running open-source LLMs on a local machine for development by managing model downloads and hardware acceleration, acting like Docker for LLMs.
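Ollama exposes a REST API on the local machine, so development code can talk to it directly. A minimal sketch using its documented `/api/generate` endpoint on the default port 11434 ("llama3" is just an example model you would first fetch with `ollama pull llama3`):

```python
import json
import urllib.request

def build_generate_request(prompt: str, model: str = "llama3") -> dict:
    """Build the JSON body for a non-streaming /api/generate call."""
    return {"model": model, "prompt": prompt, "stream": False}

def ollama_generate(prompt: str, model: str = "llama3") -> str:
    """Send the request to a locally running Ollama daemon and return its text."""
    data = json.dumps(build_generate_request(prompt, model)).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because the interface is plain HTTP, the same client code can later point at a production endpoint with only a URL change.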
#9 (about 6 minutes)
Running LLMs in production with Kubeflow and KServe
Kubeflow and its component KServe provide a robust, Kubernetes-native framework for deploying, scaling, and managing LLMs in a production environment.
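From the client's side, a deployed KServe model is reached over HTTP. A sketch of KServe's v1 prediction protocol, which POSTs a JSON body of the form `{"instances": [...]}` to `/v1/models/<name>:predict` (the hostname and model name below are placeholders for whatever your InferenceService exposes):

```python
# Build the endpoint URL and request body for a KServe v1 predict call.
def predict_request(host: str, model_name: str, instances: list) -> tuple[str, dict]:
    """Return (url, body) for POSTing to a KServe-served model."""
    url = f"http://{host}/v1/models/{model_name}:predict"
    return url, {"instances": instances}

url, body = predict_request("my-llm.example.com", "my-llm", [{"prompt": "Hello"}])
print(url)
```

The server-side counterpart is an InferenceService manifest applied to the cluster; KServe then handles routing, scaling, and canary rollout behind that URL.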
#10 (about 2 minutes)
Monitoring LLM performance with KServe's observability tools
KServe integrates with tools like Prometheus and Grafana to provide detailed metrics and dashboards for monitoring LLM response times and resource usage.
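What such a dashboard panel computes can be shown in miniature (the latency samples are invented; Prometheus does this with `histogram_quantile` over real metrics): a nearest-rank percentile over a window of response times.

```python
import math

# Tail-latency percentile, as a Grafana latency panel would report it.
def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the value at rank ceil(p/100 * n)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

latencies_ms = [120, 135, 150, 180, 210, 250, 300, 420, 800, 1500]
print(percentile(latencies_ms, 95))  # -> 1500
```

Tracking p95/p99 rather than the average is what surfaces the slow tail that LLM users actually feel.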
Matching moments
06:19 MIN
Defining LLMOps and understanding its core benefits
From Traction to Production: Maturing your LLMOps step by step
00:20 MIN
The lifecycle for operationalizing AI models in business
Detecting Money Laundering with AI
01:01 MIN
Understanding the role and challenges of MLOps
The Road to MLOps: How Verivox Transitioned to AWS
36:30 MIN
The rise of MLOps and AI security considerations
MLOps and AI Driven Development
29:33 MIN
Applying software engineering discipline to AI development
Navigating the AI Revolution in Software Development
04:20 MIN
Comparing open source tools for serving LLMs
Self-Hosted LLMs: From Zero to Inference
12:16 MIN
Understanding the new AI developer stack and MLOps workflow
Developer Experience, Platform Engineering and AI powered Apps
00:11 MIN
The challenge of operationalizing production machine learning systems
Model Governance and Explainable AI as tools for legal compliance and risk management
Related Videos
The state of MLOps - machine learning in production at enterprise scale
Bas Geerdink
From Traction to Production: Maturing your LLMOps step by step
Maxim Salnikov
LLMOps-driven fine-tuning, evaluation, and inference with NVIDIA NIM & NeMo Microservices
Anshul Jindal
Self-Hosted LLMs: From Zero to Inference
Roberto Carratalá & Cedric Clyburn
DevOps for Machine Learning
Hauke Brammer
One AI API to Power Them All
Roberto Carratalá
Creating Industry ready solutions with LLM Models
Vijay Krishan Gupta & Gauravdeep Singh Lotey
How to Avoid LLM Pitfalls - Mete Atamel and Guillaume Laforge
Mete Atamel & Guillaume Laforge
From learning to earning
Jobs that call for the skills explored in this talk.

AI Systems and MLOps Engineer for Earth Observation
Forschungszentrum Jülich GmbH
Jülich, Germany
Intermediate
Senior
Linux
Docker
AI Frameworks
Machine Learning

MLOps Engineer (Kubernetes, Cloud, ML Workflows)
FitNext Co
Charing Cross, United Kingdom
Remote
Intermediate
DevOps
Python
Docker
Grafana
+6

Machine Learning (ML) Engineer Expert - frameworks MLOps / Python / Orchestration/Pipelines
ASFOTEC
Canton de Lille-6, France
Senior
GIT
Bash
DevOps
Python
Gitlab
+6
Machine Learning Ops (MLOps) Engineer
Spait Infotech Private Limited
Sheffield, United Kingdom
Remote
£55-120K
Intermediate
ETL
Azure
Scrum
+12

AI DevOps Engineer
Optimyze Consulting
Murnau a. Staffelsee, Germany
€70-85K
Intermediate
DevOps
Docker
Kubernetes
Machine Learning
+1
Machine Learning Engineer - Large Language Models (LLM) - Startup
Startup
Charing Cross, United Kingdom
PyTorch
Machine Learning