Kevin Klues
From foundation model to hosted AI solution in minutes
#1about 3 minutes
Introducing the IONOS AI Model Hub for easy inference
The IONOS AI Model Hub provides a simple REST API for accessing open-source foundation models and a vector database for RAG.
#2about 1 minute
Exploring the curated open-source foundation models available
The platform offers leading open-source models like Meta Llama 3 for English, Mistral for European languages, and Stable Diffusion XL for image generation.
#3about 7 minutes
How to implement RAG with a single API call
Retrieval-Augmented Generation (RAG) is simplified by abstracting vector database lookups and prompt augmentation into one API request using collection IDs and queries.
#4about 1 minute
Building end-to-end AI solutions in European data centers
Combine the AI Model Hub with IONOS Managed Kubernetes to build and deploy full AI applications within German data centers for data sovereignty.
#5about 3 minutes
Enabling direct GPU access within managed Kubernetes
The NVIDIA GPU Operator will enable direct consumption of GPU resources within IONOS Managed Kubernetes by automatically installing necessary drivers and components.
#6about 3 minutes
Deploying custom inference workloads with NVIDIA NIMs
Use the GPU Operator to request GPUs in a pod spec and deploy NVIDIA Inference Microservices (NIMs) to run custom, containerized AI models on your own infrastructure.
Related jobs
Jobs that call for the skills explored in this talk.
Wilken GmbH
Ulm, Germany
Senior
Kubernetes
AI Frameworks
+3
ROSEN Technology and Research Center GmbH
Osnabrück, Germany
Senior
TypeScript
React
+3
Matching moments
04:28 MIN
Building an open source community around AI models
AI in the Open and in Browsers - Tarek Ziadé
06:44 MIN
Using Chrome's built-in AI for on-device features
Devs vs. Marketers, COBOL and Copilot, Make Live Coding Easy and more - The Best of LIVE 2025 - Part 3
01:02 MIN
AI lawsuits, code flagging, and self-driving subscriptions
Fake or News: Self-Driving Cars on Subscription, Crypto Attacks Rising and Working While You Sleep - Théodore Lefèvre
00:48 MIN
The shift to on-device AI models in smartphones
Fake or News: Coding on a Phone, Emotional Support Toasters, ChatGPT Weddings and more - Anselm Hannemann
03:55 MIN
The hardware requirements for running LLMs locally
AI in the Open and in Browsers - Tarek Ziadé
09:10 MIN
How AI is changing the freelance developer experience
WeAreDevelopers LIVE – AI, Freelancing, Keeping Up with Tech and More
07:03 MIN
The strategic value of hiring freelancers in the AI era
WeAreDevelopers LIVE – AI, Freelancing, Keeping Up with Tech and More
14:06 MIN
Exploring the role and ethics of AI in gaming
Devs vs. Marketers, COBOL and Copilot, Make Live Coding Easy and more - The Best of LIVE 2025 - Part 3
Featured Partners
Related Videos
Your Next AI Needs 10,000 GPUs. Now What?
Anshul Jindal & Martin Piercy
Open Source AI, To Foundation Models and Beyond
Ankit Patel, Matt White, Philipp Schmid, Lucie-Aimée Kaffee & Andreas Blattmann
Supercharge your cloud-native applications with Generative AI
Cedric Clyburn
A Deep Dive on How To Leverage the NVIDIA GB200 for Ultra-Fast Training and Inference on Kubernetes
Kevin Klues
Efficient deployment and inference of GPU-accelerated LLMs
Adolf Hohl
Developer Experience, Platform Engineering and AI powered Apps
Ignacio Riesgo & Natale Vinto
WWC24 - Ankit Patel - Unlocking the Future Breakthrough Application Performance and Capabilities with NVIDIA
Ankit Patel
Bringing AI Everywhere
Stephan Gillich
Related Articles
View all articles



From learning to earning
Jobs that call for the skills explored in this talk.

Forschungszentrum Jülich GmbH
Jülich, Germany
Intermediate
Senior
Linux
Docker
AI Frameworks
Machine Learning

BWI GmbH
Berlin, Germany
Senior
Linux
DevOps
Python
Ansible
Terraform
+2

BWI GmbH
München, Germany
Senior
Linux
DevOps
Python
Ansible
Terraform
+1

Scalable GmbH
Berlin, Germany
API
Data analysis
Microservices
Agile Methodologies

OpenAI
München, Germany
Senior
API
Python
JavaScript
Machine Learning

Scalable GmbH
München, Germany
API
Data analysis
Microservices
Agile Methodologies

Ilionx
Groningen, Netherlands
€4-6K
Intermediate
Azure
PySpark

