Alex Soto & Markus Eisele
RAG like a hero with Docling
#1 about 3 minutes
Using RAG to enrich LLMs with proprietary data
Retrieval-augmented generation (RAG) makes large language models useful for enterprises by supplying them with up-to-date, proprietary information at query time.
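The retrieve-then-augment idea can be sketched in a few lines. This is a minimal illustration, not the approach shown in the talk: the word-overlap scoring is a stand-in for real vector search, and the function names `retrieve` and `build_prompt` are hypothetical.

```python
def retrieve(question: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by how many words they share with the question.

    Assumption: word overlap stands in for embedding similarity.
    """
    question_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, context: list[str]) -> str:
    """Place the retrieved passages in front of the question for the LLM."""
    joined = "\n".join(f"- {passage}" for passage in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"
```

The model never sees the whole corpus, only the passages that `retrieve` selects, which is how proprietary or recent data reaches it without retraining.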
#2 about 4 minutes
The challenge of parsing complex document structures
Simple document parsers can misinterpret layouts like multi-column text, leading to corrupted data and incorrect outputs from the language model.
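A small sketch shows how the multi-column problem arises. It assumes a page has already been reduced to text fragments with coordinates; the fragment data, the `column_boundary` value, and both function names are hypothetical and chosen only for illustration.

```python
# Hypothetical fragments from a two-column page: (x, y, text)
FRAGMENTS = [
    (0, 0, "Refunds are"), (100, 0, "Shipping takes"),
    (0, 1, "issued within"), (100, 1, "three to five"),
    (0, 2, "30 days."), (100, 2, "business days."),
]


def naive_reading_order(fragments):
    """Read strictly top to bottom, left to right, ignoring columns."""
    ordered = sorted(fragments, key=lambda f: (f[1], f[0]))
    return " ".join(text for _, _, text in ordered)


def column_aware_reading_order(fragments, column_boundary=50):
    """Read the whole left column first, then the whole right column."""
    left = sorted((f for f in fragments if f[0] < column_boundary), key=lambda f: f[1])
    right = sorted((f for f in fragments if f[0] >= column_boundary), key=lambda f: f[1])
    return " ".join(text for _, _, text in left + right)
```

The naive order interleaves the two columns line by line, so a chunk built from it mixes the refund and shipping statements, and the language model receives text that no longer says what the document said.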
#3 about 3 minutes
Using Docling to convert documents into structured formats
Docling is an open-source tool that acts like an advanced OCR service, converting various binary document formats into a structured, parsable tree.
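To show why a structured tree is easier to work with than flat text, here is a minimal sketch. The `Node` class and its `kind` values are assumptions made for this example and are not Docling's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical document-tree node (not Docling's real schema)."""
    kind: str                      # e.g. "document", "section", "paragraph"
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def paragraphs_with_headings(node, heading_path=()):
    """Yield (heading path, paragraph text) pairs by walking the tree."""
    if node.kind == "section":
        heading_path = heading_path + (node.text,)
    if node.kind == "paragraph":
        yield " > ".join(heading_path), node.text
    for child in node.children:
        yield from paragraphs_with_headings(child, heading_path)
```

Because each paragraph keeps the headings above it, a later chunking step can attach that context to every chunk instead of guessing it from raw text.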
#4 about 7 minutes
Demo of a basic RAG ingestion pipeline
A live demonstration shows how a Quarkus application uses Docling to ingest a PDF, generate embeddings, and store the resulting chunks and vectors in Redis.
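The chunk, embed, and store steps of such a pipeline can be sketched as follows. This is not the Quarkus code from the demo: the hashed bag-of-words `toy_embed` stands in for a real embedding model, a plain dictionary stands in for Redis, and all function names are hypothetical.

```python
import hashlib
import math


def chunk(text: str, max_words: int = 8) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text: str, dims: int = 16) -> list[float]:
    """Assumption: a hashed bag-of-words vector replaces a real embedding model."""
    vector = [0.0] * dims
    for word in text.lower().split():
        bucket = int(hashlib.sha256(word.encode("utf-8")).hexdigest(), 16) % dims
        vector[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vector)) or 1.0
    return [v / norm for v in vector]  # unit length, so dot product = cosine


def ingest(doc_id: str, text: str, store: dict) -> None:
    """Store each chunk with its vector under a key like 'doc_id:0'."""
    for index, piece in enumerate(chunk(text)):
        store[f"{doc_id}:{index}"] = {"text": piece, "vector": toy_embed(piece)}
```

Keeping the chunk text next to its vector matters: the vector is only used to find the entry, while the text is what gets handed to the language model.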
#5 about 3 minutes
Securing RAG against data poisoning and leaks
To prevent data poisoning and sensitive data leaks, it is crucial to sanitize documents, verify their signatures, and use tools for PII masking.
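Two of these safeguards can be sketched with the Python standard library. The example makes loud assumptions: an HMAC with a shared key stands in for real document signatures, a single email regex stands in for a PII-masking tool, and the function names are hypothetical.

```python
import hashlib
import hmac
import re

# Assumption: emails are the only PII handled here; real tools cover far more.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def sign(document: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag (a stand-in for a digital signature)."""
    return hmac.new(key, document, hashlib.sha256).hexdigest()


def is_authentic(document: bytes, signature: str, key: bytes) -> bool:
    """Reject documents whose content no longer matches the signature."""
    return hmac.compare_digest(sign(document, key), signature)


def mask_pii(text: str) -> str:
    """Replace email addresses before the text is chunked and embedded."""
    return EMAIL.sub("[EMAIL]", text)
```

Checking authenticity before ingestion addresses poisoning, since a tampered document fails the check; masking before embedding addresses leaks, since the sensitive value never reaches the vector store.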
#6 about 4 minutes
Mitigating vector store attacks and encryption challenges
Vector stores are vulnerable to attacks such as close-vector modification and vector reversal, and standard encryption does not preserve the distances between vectors that similarity search depends on, so specialized solutions are required.
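The distance problem can be made concrete with a toy example. Everything here is an assumption for illustration: XOR with a fixed keystream stands in for a stream cipher, and the keyed permutation with sign flips is only one simple distance-preserving transform. It is not the scheme used in the talk and is not secure on its own.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def xor_with_keystream(quantized, keystream):
    """Stand-in for a stream cipher applied to integer-quantized components."""
    return [q ^ k for q, k in zip(quantized, keystream)]


def keyed_orthogonal_transform(vector, seed):
    """Toy transform: secret permutation plus sign flips derived from a seed."""
    rng = random.Random(seed)
    order = list(range(len(vector)))
    rng.shuffle(order)
    signs = [rng.choice((-1.0, 1.0)) for _ in vector]
    return [signs[i] * vector[j] for i, j in enumerate(order)]
```

With the hypothetical values `a = [7, 0]`, `b = [8, 0]`, and keystream `[8, 0]`, the original distance is 1 but the distance between the XORed vectors is 15, so nearest-neighbour search over ciphertext is meaningless. Applying `keyed_orthogonal_transform` with the same seed to both vectors leaves their distance unchanged, which is the property a specialized scheme has to provide.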
#7 about 5 minutes
Demo of a secure ingestion pipeline in action
A final demonstration showcases a secure pipeline that verifies document signatures, anonymizes sensitive data, and encrypts vectors before storing them.
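The ordering of those stages can be sketched as a single function that takes each stage as a pluggable callable. This is a hypothetical outline, not the demo's code: `secure_ingest`, `RejectedDocument`, and the parameter names are invented here, and the actual verification, anonymization, embedding, and vector-protection logic is left to the caller.

```python
class RejectedDocument(Exception):
    """Raised when a document fails signature verification."""


def secure_ingest(document: bytes, signature: str, *, doc_id: str, store: dict,
                  verify, anonymize, embed, protect) -> None:
    """Verify first, then anonymize, embed, protect the vector, and store."""
    # 1. Nothing is processed or stored unless the signature checks out.
    if not verify(document, signature):
        raise RejectedDocument(doc_id)
    # 2. Sensitive data is removed before the text is embedded.
    text = anonymize(document.decode("utf-8"))
    # 3. The vector is protected before it leaves the pipeline.
    store[doc_id] = {"text": text, "vector": protect(embed(text))}
```

Running verification before any other step means a rejected document leaves no trace in the store, and anonymizing before embedding means the stored vector is computed only from the masked text.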