Roberto Carratalá & Cedric Clyburn

Self-Hosted LLMs: From Zero to Inference

Stop sending your private data to third-party AI. This talk shows you how to self-host powerful language models for complete control and security.

#1 · The rise of self-hosted open source AI models (about 3 minutes)

Self-hosting large language models offers developers greater privacy, cost savings, and control compared to third-party cloud AI services.

#2 · Key benefits of local LLM deployment for developers (about 2 minutes)

Running models locally improves the development inner loop, provides full data privacy, and allows for greater customization and control over the AI stack.

#3 · Comparing open source tools for serving LLMs (about 3 minutes)

Explore different open source tools like Ollama for local development, vLLM for scalable production, and Podman AI Lab for containerized AI applications.
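Whichever server you choose, application code can stay largely portable: vLLM serves an OpenAI-compatible HTTP API by default, and Ollama exposes one as well. Here is a minimal sketch of querying a locally served model that way; the port, base URL, and model name are assumptions to adapt to your own setup.

```python
# Minimal sketch: query a locally served model through the
# OpenAI-compatible API that both vLLM and Ollama expose.
# The base_url, port, and model name are assumptions.
from openai import OpenAI

# vLLM default endpoint: http://localhost:8000/v1
# Ollama's OpenAI-compatible endpoint: http://localhost:11434/v1
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="ibm-granite/granite-3.1-8b-instruct",  # assumed model name
    messages=[{"role": "user", "content": "Summarize what vLLM does in one sentence."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```

Because the wire format is shared, moving from Ollama on a laptop to vLLM in production is largely a matter of changing the base URL and model name.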

#4 · How to select the right open source LLM (about 3 minutes)

Navigate the vast landscape of open source models by understanding the different model families, their intended use cases, and their naming conventions: a name like llama-3.1-8b-instruct, for example, encodes the family, the version, the parameter count, and the fact that the model is instruction-tuned for chat.

#5 · Using quantization to run large models locally (about 3 minutes)

Model quantization compresses LLMs to reduce their memory footprint, enabling them to run efficiently on consumer hardware like laptops with CPUs or GPUs.
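A back-of-envelope calculation shows why this matters: weight memory is roughly parameter count × bits per weight ÷ 8, with activations and KV cache on top. A small sketch of that arithmetic:

```python
# Back-of-envelope estimate of weight memory for an LLM at a given
# quantization level: params * bits_per_weight / 8 bytes.
# Real usage is higher (activations, KV cache, runtime overhead).
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9  # decimal GB

for bits, label in [(16, "FP16"), (8, "8-bit"), (4, "4-bit")]:
    print(f"8B model @ {label}: ~{weight_memory_gb(8, bits):.0f} GB")
# An 8B model needs ~16 GB at FP16 (a large GPU) but only ~4 GB
# at 4-bit, which fits comfortably on a typical laptop.
```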

#6 · Strategies for integrating local LLMs with your data (about 1 minute)

Learn three key methods for connecting local models to your data: Retrieval-Augmented Generation (RAG), local code assistants, and building agentic applications.

#7 · Demo: Building a RAG system with local models (about 6 minutes)

Use Podman AI Lab to serve a local LLM and connect it to AnythingLLM to create a question-answering system over your private documents.
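Podman AI Lab and AnythingLLM handle the plumbing in the demo, but the underlying retrieve-then-generate loop is simple. Here is a hand-rolled sketch for intuition, assuming a local OpenAI-compatible server; the embedding model, endpoint, and model name are assumptions, and a real system would use a vector database rather than an in-memory list.

```python
# Hand-rolled sketch of the RAG pattern: embed documents, retrieve
# the most relevant one, and place it in the prompt of a local LLM.
# Endpoint, port, and model names are assumptions.
from openai import OpenAI
from sentence_transformers import SentenceTransformer

docs = [
    "Our VPN requires the corporate certificate installed on every laptop.",
    "Expense reports are due by the 5th of each month.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

question = "When are expense reports due?"
q_vec = embedder.encode([question], normalize_embeddings=True)[0]

# Cosine similarity equals the dot product on normalized vectors.
best = max(range(len(docs)), key=lambda i: float(doc_vecs[i] @ q_vec))

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
answer = client.chat.completions.create(
    model="ibm-granite/granite-3.1-8b-instruct",  # assumed model name
    messages=[
        {"role": "system", "content": f"Answer using only this context:\n{docs[best]}"},
        {"role": "user", "content": question},
    ],
)
print(answer.choices[0].message.content)
```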

#8 · Demo: Setting up a local AI code assistant (about 5 minutes)

Integrate a self-hosted LLM with the Continue VS Code extension to create a private, offline-capable AI pair programmer for code generation and analysis.
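Continue simply points at a local inference endpoint (for example an Ollama server), so before configuring the extension it is worth checking that the endpoint responds. A minimal smoke test, assuming Ollama on its default port; the model name is an assumption.

```python
# Smoke-test the local endpoint a code assistant such as Continue
# would be configured against. Assumes an Ollama server on its
# default port; the model name is an assumption.
import json
import urllib.request

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({
        "model": "qwen2.5-coder:7b",  # assumed local code model
        "prompt": "Write a Python one-liner that reverses a string.",
        "stream": False,
    }).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```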

#9 · Demo: Building an agentic app with external tools (about 4 minutes)

Create an agentic application that uses a local LLM with external tools via the Model Context Protocol (MCP) to perform complex, multi-step tasks.
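MCP standardizes how a model discovers and invokes external tools, but the underlying loop is the familiar function-calling pattern: the model requests a tool call, the application executes it, and the result goes back to the model for a final answer. A sketch of that loop against an OpenAI-compatible local server that supports tool calling, not the MCP wire protocol itself; the endpoint, model name, and get_weather tool are all assumptions.

```python
# Sketch of the tool-calling loop agentic apps (and MCP clients)
# run: the model picks a tool, the app executes it, and the result
# is fed back. Shows the pattern, not the MCP protocol itself.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
MODEL = "ibm-granite/granite-3.1-8b-instruct"  # assumed model name

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def get_weather(city: str) -> str:
    return f"Sunny and 22 C in {city}"  # stubbed tool result

messages = [{"role": "user", "content": "What's the weather in Raleigh?"}]
resp = client.chat.completions.create(model=MODEL, messages=messages, tools=tools)
call = resp.choices[0].message.tool_calls[0]  # assume the model chose a tool
messages.append(resp.choices[0].message)
messages.append({
    "role": "tool",
    "tool_call_id": call.id,
    "content": get_weather(**json.loads(call.function.arguments)),
})
final = client.chat.completions.create(model=MODEL, messages=messages)
print(final.choices[0].message.content)
```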

#10 · Conclusion and the future of open source AI (about 1 minute)

Self-hosting provides a powerful, private, and customizable alternative to third-party services, highlighting the growing potential of open source AI for developers.
