Writing a Regex Engine in Python
arjoonn sharma (~theSage21)
We write a super simple regex engine in Python, illustrating along the way how and why regular expressions work.
We also mention why the default re module is so nice in some ways.
- Finite state automata (DFA, NFA)
- What happens internally when you write regex patterns
- Algorithms needed to write a regex engine
- How to extend your engine in the future
Did I mention this is in pure Python? Yep, this is pure Python.
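To give a flavour of the finite automaton idea, here is a minimal sketch of simulating an NFA in pure Python. The automaton below is hand-built for the expression ab*c and is purely illustrative: the names TRANSITIONS, START, ACCEPTING and accepts are assumptions made for this example, not taken from the talk's engine.

```python
# Hypothetical hand-built NFA for the expression ab*c (illustrative only).
# States are integers; a transition maps (state, character) to a set of next states.
TRANSITIONS = {
    (0, "a"): {1},
    (1, "b"): {1},
    (1, "c"): {2},
}
START = 0
ACCEPTING = {2}


def accepts(text):
    """Return True if the automaton is in an accepting state after reading all of text."""
    current = {START}
    for char in text:
        # Follow every possible transition from every state we might be in.
        current = {
            nxt
            for state in current
            for nxt in TRANSITIONS.get((state, char), set())
        }
        if not current:
            return False  # no live states left, so reject early
    return bool(current & ACCEPTING)


print(accepts("abbbc"))
print(accepts("ab"))
```

Tracking a set of states rather than a single state is what lets the same loop handle nondeterministic automata; for a DFA each set would simply hold one state.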
- What would you get out of this?
- Learn how to compose regular expressions.
- Get a look behind the scenes of almost any regular expression engine.
- Understand which expressions would end up being expensive ones.
- This is not a programming tutorial.
- To keep things simple, we only allow a few basic operators, such as concatenation, since the others can be built from them.
- We only demonstrate the simplest regular expressions and show how more complex ones can be built from them.
- This talk does not make the case for ditching re; instead, it explains how re and other regex engines take a string (the expression) and turn it into a program that either accepts or rejects candidate strings.
- A mini regex tutorial if time allows.
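The idea of turning an expression into a program that accepts or rejects strings can be sketched with a few small building blocks. This is a hedged illustration using backtracking closures, not the engine from the talk: char, concat, alternate, star, plus and fullmatch are all hypothetical names chosen for this example, and the expression is assembled by hand rather than parsed from a pattern string.

```python
def char(c):
    """Matcher for a single literal character."""
    def match(text, pos):
        if pos < len(text) and text[pos] == c:
            yield pos + 1
    return match


def concat(first, second):
    """Matcher for `first` followed by `second`."""
    def match(text, pos):
        for mid in first(text, pos):
            yield from second(text, mid)
    return match


def alternate(left, right):
    """Matcher for `left` or `right`."""
    def match(text, pos):
        yield from left(text, pos)
        yield from right(text, pos)
    return match


def star(inner):
    """Matcher for zero or more repetitions of `inner`."""
    def match(text, pos):
        yield pos  # zero repetitions
        for mid in inner(text, pos):
            if mid > pos:  # skip empty matches to avoid looping forever
                yield from match(text, mid)
    return match


def plus(inner):
    """One or more repetitions, built purely from concat and star."""
    return concat(inner, star(inner))


def fullmatch(matcher, text):
    """Accept only if some way of matching consumes the whole string."""
    return any(end == len(text) for end in matcher(text, 0))


# Hand-assembled program for the expression a(b|c)*d
program = concat(char("a"), concat(star(alternate(char("b"), char("c"))), char("d")))
print(fullmatch(program, "abcbd"))
print(fullmatch(program, "abx"))
```

Here plus shows how an operator can be made up from more basic ones. Because each matcher tries every alternative in turn, nested repetitions can explore many paths before rejecting a string, which hints at why some expressions end up being expensive.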
- How to write a function in Python
- Python data types
Code for the engine is here:
I'm a Master's student at IIITM-K, currently working on my Master's thesis. I love building things with Python, blogging, writing articles for OSFY and so on. I've been swimming in Python for about 5 years now, and absolutely love it.