Tanmay Bakshi
What do language models really learn
#1 about 7 minutes
The fundamental challenge of modeling natural language
Language models aim to create intuitive human-computer interfaces, but this is difficult because language syntax doesn't fully capture semantic meaning.
#2 about 3 minutes
How deep learning models learn by transforming data
Deep learning works by performing a series of transformations on input data to warp its vector space until it becomes linearly separable.
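The idea of warping a space until it becomes linearly separable can be illustrated with the classic XOR problem. The sketch below is not from the talk: the two hidden units and the threshold are hand-picked for illustration rather than learned, and the function names are hypothetical. It shows that four points no straight line can separate in the input space become separable by a single linear rule after one nonlinear transformation.

```python
# Four XOR points: no single straight line separates label 0 from label 1.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]


def relu(v):
    """Rectified linear unit: clamps negative values to zero."""
    return max(0.0, v)


def hidden(x1, x2):
    """Hand-picked (not learned) hidden layer with two ReLU units.

    This is the 'transformation' step: it maps each 2D input point
    to a new 2D point (h1, h2).
    """
    h1 = relu(x1 + x2)
    h2 = relu(x1 + x2 - 1)
    return h1, h2


def linear_classifier(h1, h2):
    """A single linear threshold applied in the transformed space."""
    return 1 if h1 - 2 * h2 > 0.5 else 0


# Transform every point, then classify it with the linear rule.
predictions = [linear_classifier(*hidden(x1, x2)) for x1, x2 in points]
print(predictions)
```

In a real network the weights inside `hidden` would be found by training, and there would typically be many such layers stacked on top of each other.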
#3 about 3 minutes
Why the training objective is key to model behavior
The training objective, or incentive, dictates exactly what a model learns and can lead to unintended outcomes if not designed carefully.
#4 about 8 minutes
From Word2Vec and LSTMs to modern transformers
The evolution from non-contextual embeddings like Word2Vec and slow, sequential models like LSTMs to the parallel and deeply contextual transformer architecture solved major NLP challenges.
#5 about 7 minutes
A practical demo of a character-level BERT model
A scaled-down, character-level transformer model demonstrates the 'fill in the blank' pre-training task by predicting masked characters in artist names.
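The data side of the 'fill in the blank' task can be sketched as follows. This is a minimal illustration, not the code from the demo: the mask symbol, the masking rate, the helper name `mask_characters`, and the example name are all assumptions. The function hides a few characters and records what the model would be asked to predict at each hidden position.

```python
import random

MASK = "_"  # hypothetical mask symbol; the demo's actual symbol is not given


def mask_characters(text, mask_rate=0.15, seed=0):
    """Replace a random subset of non-space characters with MASK.

    Returns the masked string and a dict mapping each masked position
    to the original character (the prediction targets).
    """
    rng = random.Random(seed)
    chars = list(text)
    candidates = [i for i, c in enumerate(chars) if c != " "]
    # Mask at least one character, but never more than are available.
    n_mask = min(len(candidates), max(1, round(len(candidates) * mask_rate)))
    positions = sorted(rng.sample(candidates, n_mask))
    targets = {i: chars[i] for i in positions}
    for i in positions:
        chars[i] = MASK
    return "".join(chars), targets


# Example artist name chosen for illustration only.
masked, targets = mask_characters("Daft Punk", mask_rate=0.3)
print(masked, targets)
```

During pre-training, a model would receive `masked` as input and be scored on how well it recovers the characters in `targets`, using the context on both sides of each blank.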
#6 about 2 minutes
What language models implicitly learn about language structure
By analyzing a model's internal weights, we can see it learns phonetic relationships and syntactic structures without ever being explicitly trained on them.
#7 about 7 minutes
Why current generative models don't truly 'write'
Generative models like GPT are excellent at predicting the next word based on statistical patterns but lack the underlying thought process required for true, creative writing.
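To make "predicting the next word based on statistical patterns" concrete, here is a deliberately tiny stand-in: a bigram counter over a toy corpus. This is an assumption-laden sketch, far simpler than GPT (which uses a transformer over long contexts), and the corpus and function names are invented for illustration. It captures only the core idea that the prediction comes from observed co-occurrence, with no plan for what to say.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration.
corpus = "the cat sat on the mat . the cat ate the fish ."
tokens = corpus.split()


def train_bigrams(tokens):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]


counts = train_bigrams(tokens)
print(predict_next(counts, "the"))  # prints "cat" ("cat" follows "the" twice)
```

The predictor continues a sentence purely from counts: it has no notion of what it intends to write, which is the limitation this chapter describes.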
#8 about 4 minutes
Exploring the future with Blank Language Models
Blank Language Models (BLM) offer a new training approach by filling in text in any order, forcing the model to consider both past and future context.
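Filling in text in any order can be pictured as repeatedly choosing a blank on a canvas, writing a word into it, and optionally leaving new blanks to its left and right. The sketch below replays a hand-written sequence of such actions; a trained model would predict each action instead. The blank symbol, the action format, and the helper name `apply_action` are assumptions made for illustration, not the talk's or any library's API.

```python
BLANK = "__"  # hypothetical blank symbol


def apply_action(canvas, blank_index, word, left_blank, right_blank):
    """Fill the blank_index-th blank with `word`.

    Optionally creates a new blank to the left and/or right of the word,
    so that the text can keep growing in either direction.
    """
    blanks = [i for i, tok in enumerate(canvas) if tok == BLANK]
    pos = blanks[blank_index]
    replacement = (
        ([BLANK] if left_blank else [])
        + [word]
        + ([BLANK] if right_blank else [])
    )
    return canvas[:pos] + replacement + canvas[pos + 1:]


# Hand-written actions: (which blank, word, new blank left?, new blank right?)
actions = [
    (0, "models", True, True),      # __ models __
    (1, "learn", False, False),     # __ models learn
    (0, "language", False, False),  # language models learn
]

canvas = [BLANK]
history = [canvas]
for action in actions:
    canvas = apply_action(canvas, *action)
    history.append(canvas)

for step in history:
    print(" ".join(step))
```

Note that the middle word is written first and the first word last, so each later choice has to fit context on both sides.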
#9 about 3 minutes
The need for better tooling to accelerate ML research
The complexity of implementing novel architectures like BLMs highlights the need for better infrastructure and compiled languages like Swift for TensorFlow to speed up innovation.