Roberto Carratalá

One AI API to Power Them All

Stop wrestling with fragmented AI tools. Go from local development to a production cluster with one unified API for inference, RAG, and agents.

#1 about 5 minutes

The challenge of building production-ready AI applications

The AI tooling landscape is fragmented across many separate tools, which makes applications with features like RAG and agents hard to build, scale, and maintain.

#2 about 3 minutes

Introducing Llama Stack for a unified AI API

Llama Stack, an open-source project from Meta, provides a standardized, modular framework to simplify AI development with a single API for various components.

#3 about 3 minutes

Standardizing model inference and safety guardrails

Llama Stack abstracts away differences between local and remote LLMs and integrates safety shields to filter harmful inputs and outputs.
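The idea of one inference interface wrapped by input and output shields can be sketched in plain Python. This is only an illustration under stated assumptions, not Llama Stack's actual API: `InferenceProvider`, `EchoProvider`, `keyword_shield`, and `guarded_complete` are hypothetical names, and the keyword check stands in for a real safety model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Protocol


class InferenceProvider(Protocol):
    """Hypothetical common interface for local and remote model backends."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoProvider:
    """Stand-in backend that echoes the prompt; a real one would call a model."""

    name: str

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


# A shield returns True when it considers the text unsafe.
Shield = Callable[[str], bool]


def keyword_shield(blocked: Iterable[str]) -> Shield:
    """Build a toy shield that flags text containing any blocked keyword."""
    blocked_words = [word.lower() for word in blocked]

    def is_unsafe(text: str) -> bool:
        lowered = text.lower()
        return any(word in lowered for word in blocked_words)

    return is_unsafe


def guarded_complete(provider: InferenceProvider, shield: Shield, prompt: str) -> str:
    """Check the prompt, call the provider, then check the answer."""
    if shield(prompt):
        return "Request blocked by input shield."
    answer = provider.complete(prompt)
    if shield(answer):
        return "Response blocked by output shield."
    return answer
```

Because the calling code depends only on `complete`, swapping `EchoProvider("local")` for `EchoProvider("remote")` changes nothing else, which is the property the abstraction is meant to provide.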

#4 about 2 minutes

Simplifying retrieval-augmented generation (RAG) pipelines

Llama Stack organizes the complex RAG process into three distinct, swappable layers for vector embeddings, retrieval, and agentic workflows.
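A rough sketch of why separating the layers helps: each stage below is an ordinary function, so the embedding step can be replaced without touching retrieval or prompt assembly. Everything here is an assumption made for illustration. The bag-of-words `embed` is a toy stand-in for a real embedding model, and none of these function names come from Llama Stack.

```python
import math
from collections import Counter
from typing import Callable, List


def embed(text: str) -> Counter:
    """Toy embedding layer: a bag-of-words term count."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(
    query: str,
    documents: List[str],
    embed_fn: Callable[[str], Counter] = embed,
    top_k: int = 1,
) -> List[str]:
    """Retrieval layer: rank documents by similarity to the query."""
    query_vec = embed_fn(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, embed_fn(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, context: List[str]) -> str:
    """Top layer: hand the retrieved context to the model or agent as a prompt."""
    joined = "\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {query}"
```

Passing a different `embed_fn` to `retrieve` swaps the embedding layer while the other two stay unchanged.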

#5 about 4 minutes

Building AI agents using the Model Context Protocol

Llama Stack simplifies agent creation by integrating tools, orchestration, and reasoning models through the standardized Model Context Protocol (MCP).
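MCP tool invocations travel as JSON-RPC 2.0 messages using the `tools/call` method. The sketch below builds such a request and answers it with a toy in-process handler, to show the message shape an agent and a tool server exchange. The `get_weather` tool, the `TOOLS` registry, and the `handle` function are hypothetical; a real MCP server would run behind a transport such as stdio or HTTP.

```python
from typing import Any, Callable, Dict


def make_tool_call(request_id: int, name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would query a weather service."""
    return f"No live data for {city} in this sketch."


# Hypothetical registry mapping tool names to Python callables.
TOOLS: Dict[str, Callable[..., str]] = {"get_weather": get_weather}


def handle(request: Dict[str, Any]) -> Dict[str, Any]:
    """Toy server side: dispatch the call and wrap the result as text content."""
    params = request["params"]
    tool = TOOLS.get(params["name"])
    if tool is None:
        return {
            "jsonrpc": "2.0",
            "id": request["id"],
            "error": {"code": -32602, "message": f"Unknown tool: {params['name']}"},
        }
    text = tool(**params["arguments"])
    return {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {"content": [{"type": "text", "text": text}], "isError": False},
    }
```

Since every tool is reached through the same request shape, an agent can use a new tool once it learns the tool's name and argument schema.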

#6 about 3 minutes

Gaining application observability with built-in telemetry

Llama Stack provides out-of-the-box telemetry using OpenTelemetry, enabling developers to trace multi-step agent workflows with tools like Jaeger.
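To illustrate what tracing a multi-step agent workflow means, here is a minimal standard-library sketch of nested spans. It is not the OpenTelemetry API: the `Tracer` class and the span names are hypothetical, and it only mimics the parent/child structure that a trace viewer such as Jaeger displays.

```python
import time
from contextlib import contextmanager
from typing import Any, Dict, Iterator, List, Optional


class Tracer:
    """Toy tracer that records finished spans with their parent span name."""

    def __init__(self) -> None:
        self.spans: List[Dict[str, Any]] = []
        self._stack: List[str] = []

    @contextmanager
    def span(self, name: str) -> Iterator[Dict[str, Any]]:
        parent: Optional[str] = self._stack[-1] if self._stack else None
        record: Dict[str, Any] = {"name": name, "parent": parent}
        start = time.perf_counter()
        self._stack.append(name)
        try:
            yield record
        finally:
            # Spans are appended when they finish, so children precede parents.
            self._stack.pop()
            record["duration_s"] = time.perf_counter() - start
            self.spans.append(record)


def run_agent_turn(tracer: Tracer) -> None:
    """Simulate one agent turn: an inference step followed by a tool call."""
    with tracer.span("agent_turn"):
        with tracer.span("inference"):
            pass  # a model call would happen here
        with tracer.span("tool_call"):
            pass  # a tool invocation would happen here
```

After `run_agent_turn(tracer)`, `tracer.spans` holds three records whose `parent` fields reconstruct the call tree.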

#7 about 4 minutes

A local demo of inference, safety, and agents

This live demo showcases running Llama Stack locally to perform inference, block unsafe prompts, use an agent to check the weather, and inspect traces in Jaeger.

#8 about 1 minute

Transitioning AI applications from local to production

Llama Stack enables a seamless transition from a local development setup to a scalable production environment on Kubernetes by maintaining a consistent API.
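One way to picture the consistent API: application code stays the same, and only the endpoint it talks to changes between environments. The sketch below reads that endpoint from configuration. The environment variable names, the default local URL, and the example production hostname are assumptions chosen for illustration and should be adapted to the actual deployment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class StackConfig:
    """Connection settings the application needs, independent of environment."""

    base_url: str
    timeout_s: float


def load_config(env: Optional[Mapping[str, str]] = None) -> StackConfig:
    """Read settings from the environment, falling back to local defaults.

    STACK_BASE_URL and STACK_TIMEOUT_S are hypothetical variable names.
    """
    source = os.environ if env is None else env
    base_url = source.get("STACK_BASE_URL", "http://localhost:8321").rstrip("/")
    timeout_s = float(source.get("STACK_TIMEOUT_S", "30"))
    return StackConfig(base_url=base_url, timeout_s=timeout_s)


# Local development: no variables set, so the defaults apply.
local = load_config({})

# Production: the same code, pointed at a cluster service (hypothetical URL).
production = load_config({"STACK_BASE_URL": "https://llama-stack.example.internal/"})
```

On Kubernetes, these variables would typically be injected through the pod spec, so the container image stays identical across environments.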

#9 about 5 minutes

A production demo of a multi-agent business workflow

An agent coordinates several MCP servers to query a CRM, analyze customer data, send Slack notifications, and generate a PDF report.
