Dainius Jocas
Don't Change the Partition Count for Kafka Topics!
#1 about 5 minutes
An overview of the data indexing pipeline architecture
The system moves data from a MySQL primary data store to an Elasticsearch search server using a Kafka and Kafka Connect pipeline.
#2 about 1 minute
Using Kafka partition offset for optimistic concurrency control
The system leverages the Kafka partition offset as the document version number in Elasticsearch to enable parallel indexing without data consistency issues.
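As a rough illustration of this idea, the sketch below models an index that accepts a write only when its version is strictly greater than the stored one, which is how Elasticsearch behaves with external versioning. The `VersionedIndex` class and its method names are hypothetical stand-ins for illustration, not the speaker's code or a real client API.

```python
class VersionConflict(Exception):
    """Raised when a write carries a version that is not newer than the stored one."""


class VersionedIndex:
    """Toy in-memory index (hypothetical) that mimics external versioning."""

    def __init__(self):
        # doc_id -> (version, body); a body of None stands for a delete tombstone
        self._docs = {}

    def apply(self, doc_id, version, body):
        """Store body under doc_id only if version is strictly greater than the stored version."""
        stored = self._docs.get(doc_id)
        if stored is not None and version <= stored[0]:
            raise VersionConflict(
                f"{doc_id}: incoming version {version} <= stored version {stored[0]}"
            )
        self._docs[doc_id] = (version, body)

    def get(self, doc_id):
        """Return the stored body, or None if the document is missing or deleted."""
        stored = self._docs.get(doc_id)
        return None if stored is None else stored[1]
```

With the Kafka offset used as the version, two workers can process messages for the same key in any order: if the message at offset 8 is indexed first, a late write from offset 7 raises `VersionConflict` and the newer document survives. This only holds while all messages for a key stay in one partition, so that their offsets increase monotonically.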
#3 about 2 minutes
Investigating a mysterious data deletion failure in production
A bug report described Elasticsearch failing to delete documents and therefore serving stale data; the problem could not be reproduced in local or test environments.
#4 about 5 minutes
Discovering the offset and version number mismatch
Manual inspection reveals that the document version in Elasticsearch is significantly higher than the new message offset in the Kafka topic for the same key.
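The check behind this kind of manual inspection can be sketched as follows: given the versions currently stored per document and a list of incoming messages, report which messages an external-versioning rule would reject. The `Incoming` type, the function name, and all numbers in the usage note are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, NamedTuple


class Incoming(NamedTuple):
    key: str        # document id / Kafka message key
    partition: int  # partition the message was read from
    offset: int     # offset within that partition, used as the version


def rejected_messages(
    stored_versions: Dict[str, int], messages: List[Incoming]
) -> List[Incoming]:
    """Return messages whose offset is not strictly greater than the stored version."""
    versions = dict(stored_versions)  # work on a copy, leave the input untouched
    rejected = []
    for msg in messages:
        current = versions.get(msg.key)
        if current is not None and msg.offset <= current:
            rejected.append(msg)  # this write or delete would be silently dropped
        else:
            versions[msg.key] = msg.offset
    return rejected
```

For example, with a stored version of 1250000 for `product-42` and a delete arriving as `Incoming("product-42", 7, 310)`, the delete is reported as rejected, which matches the symptom of a stale document that never goes away.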
#5 about 4 minutes
How changing partition count breaks message ordering guarantees
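Increasing the Kafka topic's partition count changes the key-to-partition mapping: the hash function stays the same, but the default partitioner takes the key hash modulo the partition count. New messages for the same key can therefore land in a different partition, where offsets are lower than the ones already used as versions.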
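A small simulation can make the effect concrete. The sketch below assigns keys to partitions with `hash(key) % partition_count`, grows the topic, and reports keys whose next message gets a lower offset than their previous one. It uses CRC32 as a stand-in hash (Kafka's default partitioner uses a murmur2-based hash of the key bytes), and the `Topic` class and `simulate_resize` function are hypothetical helpers, not part of any Kafka client.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition; CRC32 is a stand-in for Kafka's murmur2 hash."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class Topic:
    """Toy topic: each partition is a list, and an offset is an index into it."""

    def __init__(self, num_partitions: int):
        self.partitions = [[] for _ in range(num_partitions)]

    def add_partitions(self, new_total: int) -> None:
        # Partitions can only be added; existing data stays where it is.
        while len(self.partitions) < new_total:
            self.partitions.append([])

    def produce(self, key: str, value: str):
        """Append a message and return its (partition, offset)."""
        p = partition_for(key, len(self.partitions))
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1


def simulate_resize(keys, old_count: int, new_count: int):
    """Return keys whose first message after the resize has a lower offset than before."""
    topic = Topic(old_count)
    last_seen = {}
    for _ in range(2):  # two rounds of updates before the resize
        for key in keys:
            last_seen[key] = topic.produce(key, "upsert")
    topic.add_partitions(new_count)
    regressions = []
    for key in keys:
        old_partition, old_offset = last_seen[key]
        new_partition, new_offset = topic.produce(key, "delete")
        if new_partition != old_partition and new_offset < old_offset:
            regressions.append(
                {"key": key, "old_offset": old_offset, "new_offset": new_offset}
            )
    return regressions
```

Calling `simulate_resize([f"doc-{i}" for i in range(1000)], old_count=4, new_count=6)` lists the keys for which a delete would carry a lower version than the document already indexed. Not every key is affected: some keep their partition, and some move to an existing partition whose offsets are already high.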
#6 about 4 minutes
The solution and key lessons for managing Kafka topics
The fix required a full data re-ingestion into a new Kafka topic, highlighting the lesson to never increase partition count when message ordering is critical.