- Subject: Re: precompiled regex's to lua
- From: Karel Tuma <ktuma@...>
- Date: Thu, 9 Nov 2006 23:23:12 +0100
just benchmark it :)
lua of course wins, but not always.
the thing with compiled regexes is that they're FSA (finite state automata),
basically a graph of states where nodes can point back to other nodes.
an FSA compiler is considered non-trivial in terms of implementation size.
lua patterns are more like a backtracking, NFA-style (nondeterministic finite
automaton) matcher: nothing gets compiled, the matcher walks the pattern
string directly, so they're already "compiled" as they are. because of that,
looping back over a state or sub-expression is very limited (no alternation,
no repetition of groups), which makes lua patterns "less powerful" than
pcre's, but they're enough for the usual daily work, and when you need
something more, you have to code the logic yourself.
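as an illustration of "coding the logic yourself": lua patterns have no
alternation, so something like (ERROR|WARN) has to become a small loop over
several patterns. this is only a sketch; the helper name match_any and the
log line format are made up for the example.

```lua
-- Hypothetical helper: emulate regex alternation by trying several
-- Lua patterns in order and returning the first capture that matches,
-- plus the index of the pattern that produced it.
local function match_any(line, patterns)
  for i = 1, #patterns do
    local cap = string.match(line, patterns[i])
    if cap then
      return cap, i
    end
  end
  return nil
end

-- Assumed log format: "<timestamp> <LEVEL> <message>"
local levels = {
  "^%d+%s+(ERROR)%s",
  "^%d+%s+(WARN)%s",
}

print(match_any("1163110992 WARN disk almost full", levels))
```

each pattern stays simple and anchored, so a line that fails usually fails
after a few characters.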
to be exact, a compiled FSA tends to be slower for "simple" expressions,
whereas a backtracking NFA-style matcher tends to be slower for the complex
ones. some regex implementations even include both and choose between them
depending on the expression (!).
hey, this is lua, we want simple and fast, right?
add to this perl's suckiness at pretty much everything else and lua is the
winner (i'm using lua for parsing 1GB+ logs using mmap(), look at
http://blua.leet.cz/sep/STRHOOK_PATCH.patch to get the idea)
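
for the "just benchmark it" part, a rough sketch of what such a test could
look like. the line count and the line contents are synthetic assumptions,
not real log data, so treat the timing only as a relative number on your own
machine.

```lua
-- Benchmark sketch: apply one Lua pattern to many synthetic lines.
-- N and the line format are assumptions; raise N toward the
-- 1 million lines mentioned in the question if memory allows.
local match = string.match   -- cache the function lookup in a local
local clock = os.clock

local N = 200000
local lines = {}
for i = 1, N do
  -- every second line starts with digits followed by a space
  if i % 2 == 0 then
    lines[i] = i .. " GET /index.html"
  else
    lines[i] = "no leading number here"
  end
end

local function count_matches(pattern)
  local hits = 0
  for i = 1, N do
    if match(lines[i], pattern) then
      hits = hits + 1
    end
  end
  return hits
end

local t0 = clock()
local hits = count_matches("^%d+%s")
local elapsed = clock() - t0
print(string.format("%d hits in %d lines, %.3f s CPU", hits, N, elapsed))
```

the pattern string is handed to string.match on every call and interpreted
directly, so there is no separate compile step to hoist out of the loop; the
only thing cached here is the function lookup.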
On Thu, Nov 09, 2006 at 09:58:22PM +0100, Asko Kauppi wrote:
>
> I didn't find any reference to discussing precompiled regular
> expressions, and Lua.
>
> Some background:
>
> In huge log file handling, Lua loses to Perl (not by much!) seemingly
> because it has no concept of precompiling, and then applying the
> regular expression patterns.
>
> In Perl, one can:
>
> my $re= qr/^\d+\s/;
> $var =~ $re; # $re is a precompiled, optimized regex, applied to $var
> or:
> $var =~ /^\d+\s/o; # 'o' for once: compile once, cache, reuse
>
> Lua:
> string.match( var, "%d+%s" ) -- is "%d+%s" analyzed anew each time?
>
>
> Is Lua losing in speed because it lacks this concept, or have the
> authors been so clever that it's already "there", just invisible? :)
> We're talking BIG files, say 1 million lines, or more.
>
> -asko
>
>