Andreas Erben
You are not my model anymore - understanding LLM model behavior
#1about 2 minutes
Unexpected LLM behavior from hidden platform updates
A practical demonstration shows how a cloud provider's content filter update can unexpectedly block access to documents, causing application failures.
#2about 3 minutes
How LLMs generate text and learn behavior
Large language models use a transformer architecture to predict the next token based on probability, with instruction tuning and alignment shaping their final behavior.
#3about 2 minutes
The opaque and complex stack of modern LLM services
Major LLM providers operate in secrecy, and the full technology stack from model weights to the API is complex, leaving developers with limited visibility and control.
#4about 3 minutes
Managing risks from provider filters and short API lifecycles
Cloud provider content filters can change without notice, creating vulnerabilities, while the short lifecycle of model APIs requires constant adaptation.
#5about 4 minutes
Understanding LLMs as alien minds with fragile alignment
LLMs are conceptually like alien intelligences with a fragile, human-like alignment layer that can be bypassed by jailbreaks exploiting internal model circuits.
#6about 2 minutes
How model personalities and behaviors shift between versions
Different LLM versions exhibit distinct behaviors and may ignore system prompts, as shown by a comparison between GPT-4 and a newer reasoning model.
#7about 3 minutes
Using evaluations to systematically test model behavior
Systematically test model behavior using evaluations, which can be automated by generating prompt variations or using pre-built cloud and open-source frameworks.
#8about 4 minutes
Using prompt engineering to mitigate model drift
Mitigate model behavior drift by using advanced prompt engineering techniques like forcing reasoning, providing few-shot examples, and being highly explicit in instructions.
Related jobs
Jobs that call for the skills explored in this talk.
Matching moments
09:55 MIN
Shifting from traditional code to AI-powered logic
WWC24 - Ankit Patel - Unlocking the Future Breakthrough Application Performance and Capabilities with NVIDIA
13:54 MIN
The ethical risks of outdated and insecure AI models
AI & Ethics
25:33 MIN
AI privacy concerns and prompt engineering
Coffee with Developers - Cassidy Williams -
09:43 MIN
The technical challenges of running LLMs in browsers
From ML to LLM: On-device AI in the Browser
20:05 MIN
The limitations and potential of AI models
Coffee with Developers - Cassidy Williams -
16:53 MIN
The danger of over-engineering with LLMs
Event-Driven Architecture: Breaking Conversational Barriers with Distributed AI Agents
00:03 MIN
The rapid adoption of LLMs outpaces security practices
ChatGPT, ignore the above instructions! Prompt injection attacks and how to avoid them.
27:27 MIN
Final thoughts on developer accountability and AI tooling
Vibe coding sucks! Long life to vibe coding: Hardening Applications for Production with GenAI
Featured Partners
Related Videos
Three years of putting LLMs into Software - Lessons learned
Simon A.T. Jiménez
Beyond the Hype: Building Trustworthy and Reliable LLM Applications with Guardrails
Alex Soto
AI: Superhero or Supervillain? How and Why with Scott Hanselman
Scott Hanselman
How AI Models Get Smarter
Ankit Patel
Prompt Injection, Poisoning & More: The Dark Side of LLMs
Keno Dreßel
Inside the Mind of an LLM
Emanuele Fabbiani
From Traction to Production: Maturing your GenAIOps step by step
Maxim Salnikov
Bringing the power of AI to your application.
Krzysztof Cieślak
Related Articles
View all articles.gif?w=240&auto=compress,format)



From learning to earning
Jobs that call for the skills explored in this talk.

AI Systems and MLOps Engineer for Earth Observation
Forschungszentrum Jülich GmbH
Jülich, Germany
Intermediate
Senior
Linux
Docker
AI Frameworks
Machine Learning

Machine Learning Engineer - Large Language Models (LLM) - Startup
Startup
Charing Cross, United Kingdom
PyTorch
Machine Learning

Manager of Machine Learning (LLM/NLP/Generative AI) - Visas Supported
European Tech Recruit
Municipality of Bilbao, Spain
Junior
GIT
Python
Docker
Computer Vision
Machine Learning
+2


AI Evaluation Data Scientist - AI/ML/LLM - (Hybrid) - Barcelona
European Tech Recruit
Barcelona, Spain
Intermediate
GIT
Python
Pandas
Docker
PyTorch
+2

Agentic AI Architect - Python, LLMs & NLP
FRG Technology Consulting
Intermediate
Azure
Python
Machine Learning

AI Evaluation Data Scientist - AI/ML/LLM - (Hybrid) - Madrid
European Tech Recruit
Municipality of Madrid, Spain
Intermediate
GIT
Python
Pandas
Docker
PyTorch
+2

AI & Embedded ML Engineer (Real-Time Edge Optimization)
autonomous-teaming
Canton of Toulouse-5, France
Remote
C++
GIT
Linux
Python
+1

AI & Embedded ML Engineer (Real-Time Edge Optimization)
autonomous-teaming
München, Germany
Remote
C++
GIT
Linux
Python
+1