lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


2007/10/8, PA <petite.abeille@gmail.com>:
> Hello,
>
> What would be an effective way to implement Tim Bray's Wide Finder in
> Lua?
>
> http://www.tbray.org/ongoing/When/200x/2007/09/20/Wide-Finder
> http://www.tbray.org/tmp/o10k.ap
>
> Here is a rather slow implementation:
>
> local aMap = {}
> local aList = {}
>
> for aPage in io.read( '*a' ):gmatch( 'GET /ongoing/When/200x/([^%.]+) '
> ) do
>      aMap[ aPage ] = ( aMap[ aPage ] or 0 ) + 1
> end
>
> for aPage, aCount in pairs( aMap ) do
>      aList[ #aList + 1 ] = { aCount, aPage }
> end
>
> table.sort( aList, function( first, second ) return first[ 1 ] >
> second[ 1 ] end )
>
> for anIndex = 1, 10 do
>      print( aList[ anIndex ][ 1 ], aList[ anIndex ][ 2 ] )
> end
>
> LuaProfiler reports that the above Lua implementation is spending it's
> time as follow:
>
> gmatch 60%
> read   23%
> sort   15%
>
> -- Naive Lua implementation
> % /usr/bin/time lua widefinder.lua < o10k.txt
>          0.42 real         0.32 user         0.05 sys
>
> -- Tim Bray ruby implementation
> % /usr/bin/time ruby widefinder.rb < o10k.txt
>          0.18 real         0.15 user         0.02 sys
>
> -- Bill de hÓra python implementation
> % /usr/bin/time python widefinder.py o10k.txt
>          0.27 real         0.16 user         0.07 sys
>
>

One simple improvement might be to reduce the number of gmatch calls:

local aMap = {}
local aMap2 = {}
local aList = {}

for aPage in io.read( '*a' ) do
    aMap[ aPage ] = ( aMap[ aPage ] or 0 ) + 1
end

for aPage, count in pairs(aMap) do
  aMap2[aPage:gmatch( 'GET /ongoing/When/200x/([^%.]+) ')] = count
end
aMap = nil

This way, you only have to do one gmatch per actual page instead of
once for every line in the log