i'm a phd candidate at taltech in estonia,
working at the high-assurance software laboratory. my research is mostly
about automata theory and string algorithms, with a focus on high-performance
regex engines, but also anything else to do with text and streams. if you
need to search through your email, process tons of documentation, or find that one file from
your computer, i'm your guy (!). i've been looking for
more industrial applications to my research post phd, so feel free to reach out if you
are looking for consulting or collaboration on anything related to my work.
research and projects
i work with my advisors Juhan Ernits (taltech) and Margus Veanes (microsoft
research) on
Derivative-Based Symbolic Automata, an incremental
approach to hard problems by pushing heavy computation into lazy
step-by-step unwinding of automata - i.e., solving only as much as needed,
when needed. we've applied this to regex matching, model checking, and
string constraint solving, with promising results.
the main project i've been working on for the past few years is RE#,
a high-performance regex engine in F# and Rust. RE# supports extended regex features
(lookarounds, intersection, complement) while being
fast and memory efficient thanks to derivative-based algorithms and optimizations.
some discoveries we made with RE# helped improve the performance of the .NET NonBacktracking regex engine, originally developed by my advisor Margus Veanes, as well as inspire other projects, like the
LLM tokenizer for the AICI project at microsoft
(the thing that helps constrain LLM outputs).
i'm a big fan of F# and functional programming in practice, and i'm constantly working on
various smaller projects and libraries in my spare time, which you can find
on my github.
some highlights include fflat, a tiny native
compiler for F#, built on top of bflat, a plugin for obsidian, a note taking app i use daily and
r-toml which is just a little demo of how automata based techniques can process .toml files
vastly faster than traditional parsers. i enjoy microoptimizing things and do a
lot of benchmarking both in research and other projects, so if you like reading assembly or want 'the fastest possible'
implementations of something, you might find something interesting through my repos.
papers
interests
- automata theory and formal languages
- functional programming (F#)
- compilers and systems programming (Rust)
- minimalism and cybersecurity
- linux, embedded systems
- dogs, coffee, lifting?
presentations