Andrii Raikov

How Regex Works: The Secret Sauce Behind Pattern Matching

What if you could build a regex engine from scratch? Learn how to translate any pattern into a simple state machine for lightning-fast matching.

How Regex Works: The Secret Sauce Behind Pattern Matching
#1about 4 minutes

Solving a LeetCode regex matching challenge

A LeetCode problem requiring a custom regex matcher for the dot and asterisk characters serves as the motivation for exploring how regex engines work.

#2about 2 minutes

Understanding the scalability limits of a naive approach

The initial linked-list implementation is not extensible enough to handle complex patterns like a real-world email validation regex.

#3about 3 minutes

How regex works using finite automata

Regular expressions can be modeled as finite automata, or state machines, which describe a set of character strings through states and transitions.

#4about 2 minutes

Comparing deterministic and non-deterministic finite automata

Non-deterministic Finite Automata (NFA) are more flexible for representing regex because a single input can lead to multiple possible next states.

#5about 5 minutes

Building an NFA from regular expression components

A complex regular expression is converted into an NFA by combining smaller NFA components for concatenation, alternation, and repetition operators.

#6about 2 minutes

The performance pitfalls of the backtracking algorithm

The backtracking approach, where the engine guesses a path and reverts on failure, can lead to exponential time complexity for complex patterns.

#7about 1 minute

Simulating an NFA by tracking all possible states

A more efficient simulation method avoids backtracking by simultaneously tracking the entire set of possible states the machine could be in at any point.

#8about 5 minutes

Implementing a simple NFA regex matcher in Go

A practical implementation involves compiling the regex pattern into an NFA data structure and then iterating through the input string while tracking the current set of active states.

#9about 2 minutes

Beyond one algorithm for real-world regex engines

Production-grade regex engines often use a hybrid approach, selecting from different algorithms like NFA simulation or backtracking based on the specific pattern.

Related jobs
Jobs that call for the skills explored in this talk.
tree-IT GmbH

tree-IT GmbH
Bad Neustadt an der Saale, Germany

54-80K
Intermediate
Senior
Java
TypeScript
+1

Featured Partners

Related Articles

View all articles
DC
Daniel Cranney
Dev Digest 198: 30 years of JS, In-Browser AI, How Attackers Abuse GenAI
Inside last week’s Dev Digest 198 . 🎂 30 years of JavaScript ⏰ How long is a JavaScript second 💻 Clean code in Angular 🤦‍♂️ AI makes different mistakes than humans 👨‍💻 In-browser and offline AI 🟠 Undocumented Hacker News features 🐋 DeepSeek censored...
Dev Digest 198: 30 years of JS, In-Browser AI, How Attackers Abuse GenAI

From learning to earning

Jobs that call for the skills explored in this talk.

Software Developer

Code Healers LLC
Hinesville, United States of America

Remote
20-30K
Junior
Intermediate
React
JavaScript
TypeScript
+1
Software Developers

Code Healers LLC
Hinesville, United States of America

Remote
30-40K
Intermediate
Senior
.NET
React
JavaScript
+2