Elixir lexer now with 300% more blazingly fast and 300% less memory usage #255
madlep wants to merge 3 commits into
- make everything pure tail recursive, rather than returning tuples from helper functions
- figure out the length of the token, and do a single binary pattern match to extract it in one go as a sub binary match, rather than building an accumulator
- do one main pattern match in one place, rather than several separate ones
- when matching keywords, keep what we've matched rather than matching all over again when figuring out if it's a keyword or identifier at the end
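The second bullet above (measure the token, then extract it with one sub binary match) can be sketched roughly as follows. This is only an illustration, not the code from this PR: the module name `SubBinarySketch`, the token shapes, and the helper names are all made up for the example.

```elixir
defmodule SubBinarySketch do
  # Hypothetical module for illustration; not the lexer from this PR.

  defguardp is_letter(c) when c in ?a..?z or c in ?A..?Z or c == ?_
  defguardp is_digit(c) when c in ?0..?9

  # Entry point: lex the whole input into a list of tokens.
  def lex(input) when is_binary(input), do: lex(input, [])

  # End of input: tokens were prepended, so reverse once at the end.
  defp lex(<<>>, tokens), do: Enum.reverse(tokens)

  # Skip whitespace.
  defp lex(<<c, rest::binary>>, tokens) when c in [?\s, ?\t, ?\r, ?\n],
    do: lex(rest, tokens)

  defp lex(<<c, _::binary>> = input, tokens) when is_letter(c) do
    # Count the bytes first, then slice the token out with a single match.
    len = ident_length(input, 0)
    <<ident::binary-size(len), rest::binary>> = input
    lex(rest, [{:ident, ident} | tokens])
  end

  defp lex(<<c, _::binary>> = input, tokens) when is_digit(c) do
    len = digit_length(input, 0)
    <<digits::binary-size(len), rest::binary>> = input
    lex(rest, [{:int, digits} | tokens])
  end

  # Anything else is reported as a one-byte illegal token in this sketch.
  defp lex(<<c, rest::binary>>, tokens), do: lex(rest, [{:illegal, <<c>>} | tokens])

  # Tail-recursive counters: they only carry an integer, never a built-up binary.
  defp ident_length(<<c, rest::binary>>, n) when is_letter(c), do: ident_length(rest, n + 1)
  defp ident_length(_, n), do: n

  defp digit_length(<<c, rest::binary>>, n) when is_digit(c), do: digit_length(rest, n + 1)
  defp digit_length(_, n), do: n
end
```

Here `SubBinarySketch.lex("let five = 5;")` would walk each identifier or number once to count it and once more to slice it, without appending bytes to an accumulator along the way.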
## Example

- iex> Lexer.init("let five = 5;")
+ iex> Monkey.Lexer.init("let five = 5;")
You don't need that if the alias is in the test ¯\\_(ツ)_/¯, but fine with me either way
It's weird, as it works fine in ExUnit, but when I had it in a Livebook, it was complaining about it. I think it's a bug in Livebook though. But adding it makes it work either way 🤷‍♂️
I 100% had the intermediary step with variable assignment (returning … They did the on-stream review, so cool with optimizing it.
I think your work and review matter, but I’m no collaborator either
Yeah, I watched that, it was good to follow along with. The previous version is much more readable than this version. It's definitely sacrificing readability for performance here.
Was thinking about this, and worked in a bunch of optimisations to make things ✨ blazingly faster ✨
TL;DR - 3x faster, 3x less memory usage
- … `lex/2`, and calling `tokenize/2` to return a tuple of `{token, rest}`, just do everything pure tail recursively. This is the main speed and memory boost because:
  - … `call_last` instruction rather than `call` - just is faster.
- … `case` expression rather than separate function heads. They both work identically (and have verified the generated BEAM byte code is the same in both cases), but because we're passing `tokens`, and doing another function call in each case, it gets messier as function heads. Because `tokens` is already an argument, we can just pass that along rather than having to explicitly declare it as an argument each time.
- … `Monkey.OldLexer`

Apologies to @ryanwinchester. Was gonna make just a couple of tweaks, but ended up touching pretty much all the original code. The same logic is still there. It's just all being called a bit differently.
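For anyone skimming, here is a rough side-by-side of the two call styles discussed above: a helper that returns `{token, rest}` versus a single `case` whose branches all tail call straight back into the lexer. It is a simplified sketch, not the code in this PR. The module names `TupleLexerSketch` and `TailLexerSketch` and the tiny token set are invented for the comparison.

```elixir
defmodule TupleLexerSketch do
  # Hypothetical "before" style: the helper hands back {token, rest}
  # and the caller unpacks it before recursing.
  def lex(input), do: lex(input, [])

  defp lex(<<>>, tokens), do: Enum.reverse(tokens)

  defp lex(input, tokens) do
    # Not a tail call: lex/2 has to resume here to destructure the tuple.
    {token, rest} = tokenize(input)
    lex(rest, [token | tokens])
  end

  defp tokenize(<<?=, rest::binary>>), do: {:assign, rest}
  defp tokenize(<<?;, rest::binary>>), do: {:semicolon, rest}
  defp tokenize(<<c, rest::binary>>), do: {{:illegal, <<c>>}, rest}
end

defmodule TailLexerSketch do
  # Hypothetical "after" style: one case expression, every branch ends in a
  # tail call, and `tokens` is passed along without an intermediate tuple.
  def lex(input), do: lex(input, [])

  defp lex(<<>>, tokens), do: Enum.reverse(tokens)

  defp lex(<<c, rest::binary>>, tokens) do
    case c do
      ?= -> lex(rest, [:assign | tokens])
      ?; -> lex(rest, [:semicolon | tokens])
      _ -> lex(rest, [{:illegal, <<c>>} | tokens])
    end
  end
end
```

Both modules should produce the same token list for the same input; the difference is only in how control gets back to `lex/2` and whether a tuple is allocated per token.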