On Sun, Oct 17, 2010 at 05:56, Alexander Gladysh <agladysh@gmail.com> wrote:
> Jerome, Petite, list,

>>> I'm trying to load that 3M entries in to Lua table in memory faster
>>> than I do it now.

> Thanks for answers.

> Sorry, it is 6 AM now, I'll try your solutions next evening.

Small addendum:

1. The proper solution to my whole task is to use a DB (like Tokyo
Cabinet). But the question still stands: (a) I'm quite curious about
this, and (b) rewriting the code to use a DB will take time that I
don't have now. (That is another reason for my homework remark --
thanks for the help, guys!)

2.  What I want to try (feel free to do this for me if you want):

A. Try to distribute the loaded data into buckets and provide a proxy
for accessing them (I have to use newproxy() since I need __len, but
that is OK here.)
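
A minimal sketch of what A could look like. The bucket size, the
function names and the append-only interface are assumptions for
illustration, not measured choices. It uses newproxy() where available
(Lua 5.1, LuaJIT) and falls back to a plain table with a metatable on
versions that honor __len on tables.

```lua
-- Assumed bucket size; the right value would have to be found by measurement.
local BUCKET_SIZE = 1024

-- Returns a read-only proxy plus an append function (both hypothetical names).
local function new_bucketed()
  local buckets, count = {}, 0

  -- Store v at position count + 1, creating the bucket on demand.
  local function append(v)
    local b = math.floor(count / BUCKET_SIZE) + 1
    local bucket = buckets[b]
    if not bucket then
      bucket = {}
      buckets[b] = bucket
    end
    bucket[count % BUCKET_SIZE + 1] = v
    count = count + 1
  end

  local mt = {
    -- Map a flat 1-based index to (bucket, slot).
    __index = function(_, i)
      if type(i) ~= "number" then return nil end
      local bucket = buckets[math.floor((i - 1) / BUCKET_SIZE) + 1]
      return bucket and bucket[(i - 1) % BUCKET_SIZE + 1]
    end,
    __len = function() return count end,
  }

  local proxy
  if newproxy then
    -- Lua 5.1 / LuaJIT: a userdata proxy is needed for __len to be honored.
    proxy = newproxy(true)
    local pmt = getmetatable(proxy)
    pmt.__index, pmt.__len = mt.__index, mt.__len
  else
    -- Later Lua versions call __len on tables as well.
    proxy = setmetatable({}, mt)
  end
  return proxy, append
end
```

Usage would be along the lines of `local data, add = new_bucketed()`,
calling add(record) while loading and then reading data[i] and #data.
Whether this helps with load time at all is exactly what remains to be
measured.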

B. Play with GC -- perhaps a more aggressive setting will help.
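
One cheap experiment for B is to take the collector out of the picture
for the duration of the load and compare timings against a normal run.
Note that this is the opposite extreme from a more aggressive setting;
in Lua 5.1 the tuning knobs for the latter are
collectgarbage("setpause", n) and collectgarbage("setstepmul", n). The
helper name below is hypothetical, and memory use will grow unchecked
while the collector is stopped.

```lua
-- Run fn(...) with the garbage collector stopped, then restart it and
-- do one full collection. Errors from fn are re-raised after the restart.
local function with_gc_stopped(fn, ...)
  collectgarbage("stop")
  local ok, result = pcall(fn, ...)
  collectgarbage("restart")
  collectgarbage("collect")
  if not ok then
    error(result, 0)
  end
  return result
end
```

If the stopped run is not noticeably faster, the GC is probably not the
bottleneck and tuning it further is unlikely to pay off.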

C. Measure what happens with plain Lua (I use LuaJIT2 beta 5).

D. Try to read the data in large chunks (4096 bytes) and
search-and-replace {...}; with _A{...};, where _A is an append
function available in the environment. The problem is how to write a
proper regexp so that a table split across two chunks is still wrapped
correctly.
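
A sketch of one way around the split-table problem in D, under the
assumption that every record starts at the beginning of a line: only
complete lines are rewritten and compiled, and the unfinished tail of
each chunk is carried over to the next one. The function names and the
read_chunk callback are hypothetical.

```lua
-- Compile src with env as its global environment.
-- Works on Lua 5.1/LuaJIT (loadstring + setfenv) and on later versions (load).
local function compile(src, env)
  if setfenv then
    local fn, err = loadstring(src)
    if fn then setfenv(fn, env) end
    return fn, err
  end
  return load(src, "=batch", "t", env)
end

-- read_chunk() returns the next block of text, or nil at end of input.
local function load_records(read_chunk)
  local out, n = {}, 0
  local env = { _A = function(t) n = n + 1; out[n] = t end }

  -- Prefix every line-initial "{" with "_A" and run the result.
  local function run(text)
    local src = ("\n" .. text):gsub("\n{", "\n_A{")
    assert(compile(src, env))()
  end

  local carry = ""
  while true do
    local chunk = read_chunk()
    if not chunk then break end
    local buf = carry .. chunk
    local last_nl = buf:match(".*()\n")  -- position of the last newline
    if last_nl then
      run(buf:sub(1, last_nl))           -- complete lines only
      carry = buf:sub(last_nl + 1)       -- partial line waits for more data
    else
      carry = buf
    end
  end
  if carry ~= "" then run(carry) end     -- final line without a newline
  return out
end
```

With a real file, read_chunk could be something like
`function() return f:read(4096) end`. A string value that itself
contains a newline followed by "{" would break the assumption above.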

E. BTW, I tried to loadstring() each line separately -- it is much
slower. I need to loadstring(), say, 1000 lines at a time.
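
The batching in E could be sketched as below, assuming each input line
is already a complete Lua statement (for example a call to an append
function supplied through env). The function names are hypothetical
and the batch size is a parameter to experiment with.

```lua
-- Compile src with env as its global environment.
-- Works on Lua 5.1/LuaJIT (loadstring + setfenv) and on later versions (load).
local function compile(src, env)
  if setfenv then
    local fn, err = loadstring(src)
    if fn then setfenv(fn, env) end
    return fn, err
  end
  return load(src, "=batch", "t", env)
end

-- next_line is an iterator function such as the one io.lines() returns.
local function load_batched(next_line, batch_size, env)
  local buf, n = {}, 0

  -- Compile and run everything collected so far as a single chunk.
  local function flush()
    if n == 0 then return end
    local fn = assert(compile(table.concat(buf, "\n", 1, n), env))
    fn()
    n = 0
  end

  for line in next_line do
    n = n + 1
    buf[n] = line
    if n >= batch_size then flush() end
  end
  flush()  -- leftover lines that did not fill a whole batch
end
```

A call would look like `load_batched(io.lines(path), 1000, env)`, where
path and env are whatever the loader already uses.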

F. Custom parsers -- no fun. But I could try to convert the data to
luabins format line by line (one line -> size_t len, luabins blob) and
load that.
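
The length-prefixed framing in F can be tried out independently of the
serializer. The sketch below treats each blob as an opaque string and,
as a simplification, writes the length as decimal text followed by a
newline rather than as a binary size_t; producing the blobs with
luabins is left out. The function names are hypothetical.

```lua
-- Frame one serialized record: decimal byte length, newline, then the blob.
local function frame(blob)
  return #blob .. "\n" .. blob
end

-- Iterate over the blobs stored back to back in data.
local function frames(data)
  local pos = 1
  return function()
    if pos > #data then return nil end
    -- Anchored match at pos: the length digits, a newline, then the
    -- position where the blob starts.
    local len, start = data:match("^(%d+)\n()", pos)
    assert(len, "malformed length header at byte " .. pos)
    len = tonumber(len)
    local blob = data:sub(start, start + len - 1)
    assert(#blob == len, "truncated record")
    pos = start + len
    return blob
  end
end
```

Because the reader never scans the blob itself, records may contain
newlines or arbitrary bytes, which is the main point of the prefix.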

Z. Proper profiling.

Alexander.

P.S. Progress to date:

at line 100000 : Sun Oct 17 04:47:10 2010
at line 200000 : Sun Oct 17 04:48:29 2010
at line 300000 : Sun Oct 17 04:50:18 2010
at line 400000 : Sun Oct 17 04:53:02 2010
at line 500000 : Sun Oct 17 04:55:52 2010
at line 600000 : Sun Oct 17 04:58:55 2010
at line 700000 : Sun Oct 17 05:01:26 2010
at line 800000 : Sun Oct 17 05:07:00 2010
at line 900000 : Sun Oct 17 05:10:18 2010
at line 1000000 : Sun Oct 17 05:15:32 2010
at line 1100000 : Sun Oct 17 05:21:47 2010
at line 1200000 : Sun Oct 17 05:26:03 2010
at line 1300000 : Sun Oct 17 05:33:27 2010
at line 1400000 : Sun Oct 17 05:36:38 2010
at line 1500000 : Sun Oct 17 05:43:18 2010
at line 1600000 : Sun Oct 17 05:52:59 2010
at line 1700000 : Sun Oct 17 06:00:46 2010
at line 1800000 : Sun Oct 17 06:05:41 2010