Andreas Erben

Aug 20, 2025 • World Congress 2025

You are not my model anymore - understanding LLM model behavior

Your LLM is a shoggoth with a smiley face mask. Learn what happens when the mask slips and your application breaks.

#1about 2 minutes

Unexpected LLM behavior from hidden platform updates

A practical demonstration shows how a cloud provider's content filter update can unexpectedly block access to documents, causing application failures.

#2about 3 minutes

How LLMs generate text and learn behavior

Large language models use a transformer architecture to predict the next token based on probability, with instruction tuning and alignment shaping their final behavior.

#3about 2 minutes

The opaque and complex stack of modern LLM services

Major LLM providers operate in secrecy, and the full technology stack from model weights to the API is complex, leaving developers with limited visibility and control.

#4about 3 minutes

Managing risks from provider filters and short API lifecycles

Cloud provider content filters can change without notice, creating vulnerabilities, while the short lifecycle of model APIs requires constant adaptation.

#5about 4 minutes

Understanding LLMs as alien minds with fragile alignment

LLMs are conceptually like alien intelligences with a fragile, human-like alignment layer that can be bypassed by jailbreaks exploiting internal model circuits.

#6about 2 minutes

How model personalities and behaviors shift between versions

Different LLM versions exhibit distinct behaviors and may ignore system prompts, as shown by a comparison between GPT-4 and a newer reasoning model.

#7about 3 minutes

Using evaluations to systematically test model behavior

Systematically test model behavior using evaluations, which can be automated by generating prompt variations or using pre-built cloud and open-source frameworks.

#8about 4 minutes

Using prompt engineering to mitigate model drift

Mitigate model behavior drift by using advanced prompt engineering techniques like forcing reasoning, providing few-shot examples, and being highly explicit in instructions.

Sunhat
Köln, Germany

Remote

€65-95K

Senior

TypeScript

REST

+1

Wilken GmbH
Ulm, Germany

Senior

Amazon Web Services (AWS)

Kubernetes

+1

ZEISS Group
Oberkochen, Germany

Intermediate

Python

Azure

Analyzing the risks and architecture of current AI models

04:34 MIN

Analyzing the risks and architecture of current AI models

Opening Keynote by Sir Tim Berners-Lee

Shifting from traditional code to AI-powered logic

02:58 MIN

Shifting from traditional code to AI-powered logic

WWC24 - Ankit Patel - Unlocking the Future Breakthrough Application Performance and Capabilities with NVIDIA

Addressing the core challenges of large language models

05:18 MIN

Addressing the core challenges of large language models

Accelerating GenAI Development: Harnessing Astra DB Vector Store and Langflow for LLM-Powered Apps

The ethical risks of outdated and insecure AI models

02:19 MIN

The ethical risks of outdated and insecure AI models

AI & Ethics

Understanding the GenAI lifecycle and its operational challenges

05:39 MIN

Understanding the GenAI lifecycle and its operational challenges

LLMOps-driven fine-tuning, evaluation, and inference with NVIDIA NIM & NeMo Microservices

Addressing the key challenges of large language models

02:55 MIN

Addressing the key challenges of large language models

Large Language Models ❤️ Knowledge Graphs

The challenge of moving AI from demo to production

03:18 MIN

The challenge of moving AI from demo to production

What’s New with Google Gemini?

AI privacy concerns and prompt engineering

03:43 MIN

AI privacy concerns and prompt engineering

Coffee with Developers - Cassidy Williams -

Featured Partners

Three years of putting LLMs into Software - Lessons learned

Three years of putting LLMs into Software - Lessons learned

Simon A.T. Jiménez

about 6 months ago • World Congress 2025

Beyond the Hype: Building Trustworthy and Reliable LLM Applications with Guardrails

Beyond the Hype: Building Trustworthy and Reliable LLM Applications with Guardrails

Alex Soto

about 6 months ago • World Congress 2025

AI: Superhero or Supervillain? How and Why with Scott Hanselman

AI: Superhero or Supervillain? How and Why with Scott Hanselman

Scott Hanselman

about 2 years ago • World Congress 2024

How AI Models Get Smarter

How AI Models Get Smarter

Ankit Patel

about 7 months ago • World Congress 2025

Prompt Injection, Poisoning & More: The Dark Side of LLMs

Prompt Injection, Poisoning & More: The Dark Side of LLMs

Keno Dreßel

about 6 months ago • World Congress 2025

Inside the Mind of an LLM

Inside the Mind of an LLM

Emanuele Fabbiani

about 6 months ago • World Congress 2025

From Traction to Production: Maturing your GenAIOps step by step

From Traction to Production: Maturing your GenAIOps step by step

Maxim Salnikov

about 6 months ago • World Congress 2025

Bringing the power of AI to your application.

Bringing the power of AI to your application.

Krzysztof Cieślak

about 2 years ago • World Congress 2024

Related Articles

View all articles

DC

Daniel Cranney

Dev Digest 210: AI Agents Are Go! Is MCP Dead? LLMs Crack Anonymity

Inside last week’s Dev Digest 210 . 🪦 Is MCP already dead? 🐍 Secure snake on the CLI 🏗️ The architecture behind open source LLMs ⚖️ AI companies and governments at odds 🦫 Is Go the best language for AI agents? 🕵️ “Security research” bot hacks Micros...

Dev Digest 210: AI Agents Are Go! Is MCP Dead? LLMs Crack Anonymity

BB

Benedikt Bischof

MLops – Deploying, Maintaining And Evolving Machine Learning Models in Production

Welcome to this issue of the WeAreDevelopers Live Talk series. This article recaps an interesting talk by Bas Geerdink who gave advice on MLOps.‍About the speaker:‍Bas is a programmer, scientist, and IT manager. At ING, he is responsible for the Fast...

MLops – Deploying, Maintaining And Evolving Machine Learning Models in Production

DC

Daniel Cranney

Dev Digest 196: AI Killed DevOps, LLM Political Bias & AI Security

Inside last week’s Dev Digest 196 . ⚖️ Political bias in LLMs 🫣 AI written code causes 1 in 5 security breaches 🖼️ Is there a limit to alternative text on images? 📝 CodeWiki - understand code better 🟨 Long tasks in JavaScript 👻 Scare yourself into n...

Dev Digest 196: AI Killed DevOps, LLM Political Bias & AI Security

DC

Daniel Cranney

Panel Discussion: Responsible AI in Practice - Real-World Examples and Challenges

IntroductionIn the ever-evolving landscape of artificial intelligence, the concept of "responsible AI" has emerged as a cornerstone for ethical and practical AI implementation. During the WWC24 Panel discussion, three eminent experts—Mina, Bjorn Brin...

Panel Discussion: Responsible AI in Practice - Real-World Examples and Challenges

From learning to earning

Jobs that call for the skills explored in this talk.

AI Backend Developer (m/w/d)

MUUUH! GmbH
Osnabrück, Germany

Intermediate

Java

Python

TypeScript

AI Engineer Bootcamp Instructor (ML, DL, MLOps & LLM Systems) - Onsite&Remote

WeCloudData

Remote

Python

Machine Learning

Continuous Integration

AI/ML Engineer ( Azure Platform Modernization )

LinkiT
Amsterdam, Netherlands

Azure

DevOps

Python

PySpark

Terraform

+2

Ai / ML Engineers

Old Bailey

Continuous Integration

AI Specialist with Azure

Langbourn

Remote

Azure

AI Software Engineer - Model Evaluation

Aleph Alpha
Heidelberg, Germany

PyTorch

AI/ Machine Learning Engineer (NLP / LLM)

Ai-powered
Peterborough, United Kingdom

Remote

Senior

Machine Learning

Natural Language Processing

AI Consultant / AI Solution Architect

LEOGY GmbH
Brunswick, Germany

Senior AI Platform Backend Engineer (LLM)

IT Partner España

Remote

API

NLTK

Azure

Scrum

+13