Alex Soto & Markus Eisele

Aug 20, 2025 • World Congress 2025

RAG like a hero with Docling

Your RAG pipeline has security holes you haven't considered. Learn to defend against data poisoning and a new class of vector store attacks.

#1 about 3 minutes

Using RAG to enrich LLMs with proprietary data

Retrieval-augmented generation (RAG) is the key to making large language models useful for enterprises by providing them with up-to-date, proprietary information.

#2 about 4 minutes

The challenge of parsing complex document structures

Simple document parsers can misinterpret layouts like multi-column text, leading to corrupted data and incorrect outputs from the language model.
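
The failure mode can be reproduced with a few lines of Python. This is a minimal sketch, not the parser discussed in the talk: the two-column page, the fixed column width of 24 characters, and the function names are all assumptions made for illustration.

```python
# A two-column page, one list per column (illustrative text, not from the talk).
LEFT = ["Refunds are issued", "within 30 days of", "purchase."]
RIGHT = ["Warranty claims", "require the original", "receipt."]

def render_page(left, right, width=24):
    """Lay the two columns side by side, the way they appear on the page."""
    return [f"{l:<{width}}{r}" for l, r in zip(left, right)]

def naive_parse(rows):
    """Read row by row and ignore the column boundary, as a simple parser does."""
    return " ".join(" ".join(row.split()) for row in rows)

def column_aware_parse(rows, width=24):
    """Read the left column top to bottom, then the right column."""
    left = " ".join(row[:width].strip() for row in rows)
    right = " ".join(row[width:].strip() for row in rows)
    return left, right

rows = render_page(LEFT, RIGHT)
naive_text = naive_parse(rows)
left_text, right_text = column_aware_parse(rows)

print(naive_text)
print(left_text)
print(right_text)
```

The naive result interleaves the two policies into a single sentence that neither column states, which is exactly the kind of corrupted chunk that ends up in the vector store.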

#3 about 3 minutes

Using Docling to convert documents into structured formats

Docling is an open-source tool that acts like an advanced OCR service, converting various binary document formats into a structured, parsable tree.
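
To show why a tree is easier to work with than flat text, here is a minimal sketch of walking a structured document. The `Node` class, its `kind` values, and the `walk` helper are hypothetical and are not Docling's data model or API; they only illustrate the idea of a parsable tree.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """Hypothetical document-tree node; NOT Docling's actual data model."""
    kind: str                 # e.g. "document", "section", "paragraph", "table"
    text: str = ""            # heading text for sections, body text for leaves
    children: list = field(default_factory=list)

def walk(node, path=()):
    """Yield (heading_path, text) for every text leaf, in reading order."""
    if node.kind == "section":
        path = path + (node.text,)
    elif node.text and not node.children:
        yield path, node.text
    for child in node.children:
        yield from walk(child, path)

doc = Node("document", children=[
    Node("section", "Refund policy", [
        Node("paragraph", "Refunds are issued within 30 days."),
    ]),
    Node("section", "Warranty", [
        Node("paragraph", "Claims require the original receipt."),
        Node("table", "Product | Months\nLaptop | 24"),
    ]),
])

chunks = list(walk(doc))
for headings, text in chunks:
    print(" > ".join(headings), "|", text)
```

Because each leaf carries its heading path, a chunk can be stored together with the section it belongs to instead of as an anonymous run of text.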

#4 about 7 minutes

Demo of a basic RAG ingestion pipeline

A live demonstration shows how a Quarkus application uses Docling to ingest a PDF, generate embeddings, and store the resulting chunks and vectors in Redis.
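
The shape of such an ingestion pipeline (chunk, embed, store) can be sketched as follows. This is not the Quarkus code from the demo: a plain dictionary stands in for Redis, `embed` is a toy hashed bag-of-words function standing in for a real embedding model, and all names and the 16-dimension size are assumptions.

```python
import hashlib
import math

DIM = 16  # toy dimensionality (assumption); real models use far more dimensions

def chunk(text, max_words=8):
    """Split text into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def embed(text):
    """Toy stand-in for an embedding model: hashed bag of words, L2-normalized."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        bucket = int(hashlib.sha256(word.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def ingest(doc_id, text, store):
    """Store each chunk with its vector under a key like 'doc_id:index'."""
    for i, piece in enumerate(chunk(text)):
        store[f"{doc_id}:{i}"] = {"text": piece, "vector": embed(piece)}

def search(query, store, k=1):
    """Return the keys of the k chunks most similar to the query."""
    q = embed(query)
    def score(item):
        return sum(a * b for a, b in zip(q, item[1]["vector"]))
    ranked = sorted(store.items(), key=score, reverse=True)
    return [key for key, _ in ranked[:k]]

store = {}  # in-memory stand-in for the vector store
ingest("policy", "Refunds are issued within 30 days of purchase. "
                 "Warranty claims require the original receipt and proof of payment.", store)
print(search("Refunds are issued within 30 days of purchase.", store))
```

Everything the later security chapters discuss happens around these three steps: what is allowed into `ingest`, and what an attacker can do with the stored vectors.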

#5 about 3 minutes

Securing RAG against data poisoning and leaks

To prevent data poisoning and sensitive data leaks, it is crucial to sanitize documents, verify their signatures, and use tools for PII masking.
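
A gate in front of the ingestion step might look like the sketch below. It assumes an HMAC over the document bytes with a shared key as the signature scheme and two simple regular expressions for PII; the talk summary does not say which signature format or masking tool the speakers use, so treat both as placeholders.

```python
import hashlib
import hmac
import re

# Deliberately simple patterns (assumption); production PII detection needs more.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def sign(document: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the raw document bytes."""
    return hmac.new(key, document, hashlib.sha256).hexdigest()

def is_authentic(document: bytes, signature: str, key: bytes) -> bool:
    """Constant-time comparison of the expected and supplied tags."""
    return hmac.compare_digest(sign(document, key), signature)

def mask_pii(text: str) -> str:
    """Replace e-mail addresses and phone numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

def accept_for_ingestion(document: bytes, signature: str, key: bytes) -> str:
    """Reject unsigned or tampered documents, then mask PII before chunking."""
    if not is_authentic(document, signature, key):
        raise ValueError("signature mismatch: document rejected")
    return mask_pii(document.decode("utf-8"))

KEY = b"shared-secret"  # hypothetical key for the example
doc = b"Contact jane.doe@example.com or +49 170 1234567 for refunds."
clean = accept_for_ingestion(doc, sign(doc, KEY), KEY)
print(clean)
```

Verifying before masking matters: a poisoned document should be rejected outright rather than cleaned up and stored.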

#6 about 4 minutes

Mitigating vector store attacks and encryption challenges

Vector stores are vulnerable to attacks such as close vector modification and reversal. Standard encryption does not preserve the distances between vectors that similarity search relies on, so encrypting embeddings requires specialized solutions.
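
The distance problem can be seen in a small experiment. The sketch below is an illustration under stated assumptions, not the solution presented in the talk: `toy_encrypt` is a SHA-256 keystream XOR standing in for a conventional cipher, and `keyed_shuffle` (a key-dependent permutation with sign flips) is only there to show what "distance-preserving" means. Neither is a secure scheme.

```python
import hashlib
import math
import random
import struct

def cosine(a, b):
    """Cosine similarity, a score vector stores commonly rank by."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def keystream(key, nonce, length):
    """Toy keystream: SHA-256 in counter mode (stand-in for a real cipher)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def toy_encrypt(vec, key, nonce):
    """Serialize the floats and XOR them with the keystream."""
    raw = struct.pack(f"<{len(vec)}d", *vec)
    return bytes(r ^ k for r, k in zip(raw, keystream(key, nonce, len(raw))))

def toy_decrypt(blob, key, nonce):
    """Reverse toy_encrypt and unpack the floats again."""
    raw = bytes(c ^ k for c, k in zip(blob, keystream(key, nonce, len(blob))))
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))

def keyed_shuffle(vec, key):
    """Key-dependent permutation plus sign flips: keeps dot products intact."""
    rng = random.Random(key)
    order = list(range(len(vec)))
    rng.shuffle(order)
    signs = [rng.choice((-1.0, 1.0)) for _ in vec]
    return [s * vec[i] for s, i in zip(signs, order)]

KEY = b"demo-key"  # hypothetical key for the example
a = [0.9, 0.1, 0.0, 0.4]
b = [0.8, 0.2, 0.1, 0.5]

plain_score = cosine(a, b)
# What the store would compare if it only held opaque ciphertext bytes:
cipher_score = cosine(list(toy_encrypt(a, KEY, b"n1")), list(toy_encrypt(b, KEY, b"n2")))
shuffled_score = cosine(keyed_shuffle(a, KEY), keyed_shuffle(b, KEY))
print(plain_score, cipher_score, shuffled_score)
```

The ciphertext bytes decrypt back to the original vectors, but a similarity computed on them says nothing about how close the plaintext vectors were. The keyed shuffle keeps the score unchanged, which is the property a vector store needs, though this particular transform is far too weak to count as encryption.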

#7 about 5 minutes

Demo of a secure ingestion pipeline in action

A final demonstration showcases a secure pipeline that verifies document signatures, anonymizes sensitive data, and encrypts vectors before storing them.

Prompt injection as an unsolved AI security problem

07:39 MIN

AI in the Open and in Browsers - Tarek Ziadé

Crypto crime, EU regulation, and working while you sleep

01:15 MIN

Fake or News: Self-Driving Cars on Subscription, Crypto Attacks Rising and Working While You Sleep - Théodore Lefèvre

Increasing the value of talk recordings post-event

04:57 MIN

Cat Herding with Lions and Tigers - Christian Heilmann

Using AI to overcome challenges in systems programming

02:49 MIN

AI in the Open and in Browsers - Tarek Ziadé

Malware campaigns, cloud latency, and government IT theft

01:06 MIN

Fake or News: Self-Driving Cars on Subscription, Crypto Attacks Rising and Working While You Sleep - Théodore Lefèvre

How AI threatens the open source documentation business model

08:29 MIN

WeAreDevelopers LIVE – AI, Freelancing, Keeping Up with Tech and More

The security risks of AI-generated code and slopsquatting

05:55 MIN

Slopquatting, API Keys, Fun with Fonts, Recruiters vs AI and more - The Best of LIVE 2025 - Part 2

Using AI agents to modernize legacy COBOL systems

06:28 MIN

Devs vs. Marketers, COBOL and Copilot, Make Live Coding Easy and more - The Best of LIVE 2025 - Part 3

Carl Lapierre - Exploring Advanced Patterns in Retrieval-Augmented Generation

Carl Lapierre

about a year ago • World Congress 2024

Building Blocks of RAG: From Understanding to Implementation

Ashish Sharma

about a year ago • WeAreDevelopers LIVE

Accelerating GenAI Development: Harnessing Astra DB Vector Store and Langflow for LLM-Powered Apps

Dieter Flick & Michel de Ru

about a year ago • World Congress 2024

Build RAG from Scratch

Phil Nash

about a year ago • World Congress 2024

Large Language Models ❤️ Knowledge Graphs

Michael Hunger

about a year ago • World Congress 2024

Beyond the Hype: Building Trustworthy and Reliable LLM Applications with Guardrails

Alex Soto

about 4 months ago • World Congress 2025

Building AI Applications with LangChain and Node.js

Julián Duque

about 4 months ago • World Congress 2025

Langchain4J - An Introduction for Impatient Developers

Juarez Junior

about a year ago • World Congress 2024

Related Articles

Daniel Cranney

Slopquatting, API Keys, Fun with Fonts, Recruiters vs AI and more - The Best of LIVE 2025 - Part 2

This week, we’re continuing our look-back on some of the best moments from the Weekly Developer Show from 2025. Here’s what some of our fantastic guests had to say… Sebastian Gingter cracked open the idea of “slopsquatting” and explained why we shou...

Daniel Cranney

Dev Digest 201: Don't Stop Thinking, AI Slop vs. OSS Security, Rank Things

Inside last week’s Dev Digest 201. 🧠 Despite AI you still need to think 🍋 Bitter lessons from building AI products 🤖 AI Slop vs. OSS security 📱 Cloning tap-to-pay on Android 🤑 Saving $500k/year by re-inventing S3 📄 AI reads manuals 🎥 Automating FFM...

Chris Heilmann

Dev Digest 138 - Are you secure about this?

Hello there! This is the 2nd "out of the can" edition of 3 as I am on vacation in Greece eating lovely things on the beach. So, fewer news, but lots of great resources. Many around the topic of security. Enjoy! News and Articles: Google Pixel phones t...

Daniel Cranney

Dev Digest 198: 30 years of JS, In-Browser AI, How Attackers Abuse GenAI

Inside last week’s Dev Digest 198. 🎂 30 years of JavaScript ⏰ How long is a JavaScript second 💻 Clean code in Angular 🤦‍♂️ AI makes different mistakes than humans 👨‍💻 In-browser and offline AI 🟠 Undocumented Hacker News features 🐋 DeepSeek censored...
