just benchmark it :)
lua of course wins, but not always.
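a rough sketch of what "benchmark it" could look like on the lua side. the synthetic line format, the line count and the bench() helper are all assumptions made up for illustration, not taken from any real log; swap in your own data and run the perl equivalent next to it.

```lua
-- Rough timing sketch: apply one pattern to many synthetic lines.
-- bench() is a hypothetical helper; the line format is invented.
local match, clock = string.match, os.clock

local function bench(lines, pattern)
  local hits = 0
  local t0 = clock()
  for i = 1, #lines do
    if match(lines[i], pattern) then hits = hits + 1 end
  end
  -- number of matching lines, CPU seconds spent in the loop
  return hits, clock() - t0
end

local lines = {}
for i = 1, 100000 do
  -- every third line starts with digits followed by a space
  lines[i] = (i % 3 == 0) and (i .. " GET /index.html") or "-- no leading number"
end

local hits, secs = bench(lines, "^%d+%s")
print(hits, secs)
```

os.clock() measures CPU time, so the number is only meaningful relative to another run on the same machine.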
the thing with compiled regexes: they're usually turned into a DFA
(deterministic finite automaton), basically a graph of states where
any node can point to any other node. a DFA compiler is considered
non-trivial in terms of implementation size.
but lua patterns work more like a backtracking, NFA-style
(nondeterministic) matcher: they're "already compiled" as they are,
because the matcher walks the pattern string directly. due to that,
referring back to some state or sub-expression is very limited, which
makes lua patterns "less powerful" than pcre's, but they're enough for
the usual day-to-day work, and when you need something more you have
to code the logic yourself.
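one case of "code the logic yourself", as a minimal sketch: lua patterns have no "|" alternation, so you try each alternative in turn. match_any and the ERROR/WARN line format here are hypothetical, invented for the example, and not part of the string library.

```lua
-- Lua patterns have no "|" alternation, so try each alternative in turn.
-- match_any is a hypothetical helper; it keeps at most three captures.
local function match_any(s, patterns)
  for i = 1, #patterns do
    local a, b, c = string.match(s, patterns[i])
    if a then return a, b, c end
  end
  return nil
end

-- stands in for something like /^(\d+)\s+(?:ERROR|WARN)\s+(.*)$/
local levels = {
  "^(%d+)%s+ERROR%s+(.*)$",
  "^(%d+)%s+WARN%s+(.*)$",
}

local id, msg = match_any("42 WARN disk almost full", levels)
print(id, msg)
```

each alternative costs one more pass over the line, so put the most common pattern first.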
to be exact, a DFA is slower for "simple" expressions (the cost of
building it dominates), whereas an NFA is slower on the complex ones.
some regex implementations even use both and choose between them
depending on the expression (!).
hey, this is lua, we want the simple and fast, right?
add to this perl's suckiness at pretty much everything else, and lua
is the winner.
(i'm using lua for parsing 1GB+ logs with mmap(); look at
http://blua.leet.cz/sep/STRHOOK_PATCH.patch to get the idea)
On Thu, Nov 09, 2006 at 09:58:22PM +0100, Asko Kauppi wrote:
I didn't find any discussion of precompiled regular expressions
in Lua.
Some background:
In huge log file handling, Lua loses to Perl (not by much!), seemingly
because it has no concept of precompiling a regular expression pattern
and then applying it.
In Perl, one can:
    my $re = qr/^\d+\s/;
    $var =~ $re;        # $re is a precompiled, optimized regex, applied to $var
or:
    $var =~ /^\d+\s/o;  # 'o' for once: compile once, cache, reuse
Lua:
    string.match( var, "%d+%s" )  -- is "%d+%s" analyzed anew each time?
Is Lua losing speed because it lacks this concept, or have the authors
been so clever that it's already "there", just invisible? :)
We're talking BIG files, say 1 million lines, or more.
-asko