I'm trying some regex performance tests (I heard some rumors that Erlang is slow):
> Fun = fun F(X) -> case X > 1000000 of true -> ok; false -> Y = X + 1, re:run(<<"1ab1jgjggghjgjgjhhhhhhhhhhhhhjgdfgfdgdfgdfgdfgdfgdfgdfgdfgdfgfgv">>, "^[a-zA-Z0-9_]+$"), F(Y) end end.
#Fun<erl_eval.30.128620087>
> timer:tc(Fun, [0]).
{17233982,ok}
> timer:tc(Fun, [0]).
{17155982,ok}
And some tests after compiling the regex:
> {ok, MP} = re:compile("^[a-zA-Z0-9_]+$").
{ok,{re_pattern,0,0,0,
<<69,82,67,80,107,0,0,0,16,0,0,0,1,0,0,0,255,255,255,
255,255,255,...>>}}
> Fun = fun F(X) -> case X > 1000000 of true -> ok; false -> Y = X + 1, re:run(<<"1ab1jgjggghjgjgjhhhhhhhhhhhhhjgdfgfdgdfgdfgdfgdfgdfgdfgdfgdfgfgv">>, MP), F(Y) end end.
#Fun<erl_eval.30.128620087>
> timer:tc(Fun, [0]).
{15796985,ok}
>
> timer:tc(Fun, [0]).
{15921984,ok}
http://erlang.org/doc/man/timer.html :
Unless otherwise stated, time is always measured in milliseconds.
http://erlang.org/doc/man/re.html#compile-1 :
Compiling the regular expression before matching is useful if the same expression is to be used in matching against multiple subjects during the lifetime of the program. Compiling once and executing many times is far more efficient than compiling each time one wants to match.
Questions
- Why is it returning microseconds to me? (Shouldn't it be milliseconds?)
- Compiling the regex doesn't make much difference. Why?
- Should I bother compiling it?
2 answers
rmbxnbpk1#
In case 1, Fun has to compile the string "^[a-zA-Z0-9_]+$" on every recursive call (1 million times). In case 2, by contrast, you compile the pattern once up front and then pass the result into the recursion, which is why case 2 takes less time than case 1.
The spec of re:run/2:
run(Subject, RE) -> {match, Captured} | nomatch
Subject = iodata() | unicode:charlist()
RE = mp() | iodata()
The regular expression can be specified either as iodata() in which case it is automatically compiled (as by compile/2) and executed, or as a precompiled mp() in which case it is executed against the subject directly.
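As a minimal sketch of the two forms described above, the module below calls re:run/2 once with the raw pattern string and once with a precompiled pattern, and checks that both give the same result. The module name re_forms, the function demo/0, and the short subject <<"abc_123">> are made up for illustration.

```erlang
-module(re_forms).
-export([demo/0]).

%% Hypothetical helper: runs the same match with both accepted forms of RE.
demo() ->
    Subject = <<"abc_123">>,
    Pattern = "^[a-zA-Z0-9_]+$",
    %% Compile once; MP can be reused for any number of subjects.
    {ok, MP} = re:compile(Pattern),
    %% iodata() form: the pattern is compiled inside this call.
    Raw = re:run(Subject, Pattern),
    %% mp() form: the precompiled pattern is executed directly.
    Pre = re:run(Subject, MP),
    %% Both forms must agree, otherwise this match crashes.
    Raw = Pre,
    Raw.
```

Calling re_forms:demo() should return {match,[{0,7}]}, since the whole 7-byte subject matches from offset 0.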
q3aa05252#
Yes, you should compile the code before trying to measure performance. When you type the code into the shell, the code will be interpreted, not compiled into byte code. I saw a big improvement when putting the code into a module:
(Both of those are with compiled regexp.)
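The module from this answer is not reproduced here, so the following is only a sketch of what such a benchmark module might look like, not the answerer's actual code. The names re_bench, bench/1 and loop/2 are assumptions. It compiles the regex once, runs the match in a compiled (not shell-interpreted) loop, and returns the elapsed time from timer:tc/1, which reports microseconds.

```erlang
-module(re_bench).
-export([bench/1]).

%% Same subject as in the question.
-define(SUBJECT, <<"1ab1jgjggghjgjgjhhhhhhhhhhhhhjgdfgfdgdfgdfgdfgdfgdfgdfgdfgdfgfgv">>).

%% Hypothetical entry point: returns the elapsed time in microseconds
%% for Iterations matches against a precompiled pattern.
bench(Iterations) ->
    {ok, MP} = re:compile("^[a-zA-Z0-9_]+$"),
    {Micros, ok} = timer:tc(fun() -> loop(Iterations, MP) end),
    Micros.

%% Count down to zero, matching once per step.
loop(0, _MP) ->
    ok;
loop(N, MP) when N > 0 ->
    _ = re:run(?SUBJECT, MP),
    loop(N - 1, MP).
```

After c(re_bench). in the shell, re_bench:bench(1000000) / 1000 gives the elapsed time in milliseconds. The actual numbers depend on the machine and the OTP release, so none are shown here.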