I absolutely love this file-system as db approach.
And if you have concerns about performance, I might quote a friend of mine.
When asked why they dumped Oracle in favour of a fs-based approach for a
data-storage system dealing with terabytes of data, he said:

a simple sql select was 100ms, a nfs file read was 10ms, a local disk read
was 1ms.

Although you will need some kind of transaction / locking mechanism
if you have several processes accessing the data simultaneously.
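
One cheap way to get part of the way there, as a rough sketch rather than a real transaction system: write each sample to a temporary name and rename it into place, so a reader never sees a half-written file. The helper name write_atomic and the ".tmp" suffix are made up for illustration. This assumes a POSIX-like system (where rename replaces the target in one step) and a single writer per path; on Windows, os.rename may fail if the target already exists.

```lua
-- Sketch: publish a file in one step so readers never see a partial write.
-- write_atomic and the ".tmp" suffix are hypothetical names.
function write_atomic(path, contents)
  local tmp = path .. ".tmp"
  local f, err = io.open(tmp, "w")
  if not f then return nil, err end
  f:write(contents)
  f:close()
  -- On POSIX systems rename() swaps the new file in atomically.
  local ok, rename_err = os.rename(tmp, path)
  if not ok then
    os.remove(tmp)  -- clean up the temporary file on failure
    return nil, rename_err
  end
  return true
end
```

This does not stop two writers from racing on the same path; that would still need a lock of some kind.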

just my 2c, Martin

> > Because the data is sampled every three seconds for
> > several days, this data base could get large fast.
> > However I'm more interested in speed than disk space.
> > Tho I don't want to have to take up all of the computer
> > memory with the data file.
>
> Use the filesystem as the database.  That is:
>
> > YEAR:MONTH:DAY:SECONDS_FROM_MIDNIGHT as the key, then
> > a channel number in array format, to get the values.
>
> The directory structure might look like:
>
> Year/
>   Month/
>     Day/
>       Hour/
>         Minute/
>
> Within the final directory, either have each sample as a separate file, or
> bundle them into a single file.  I would probably make each a separate
> file.
>
> I'm not going to do the real math here, but let's say you get back a
> sample with the time stamp of 9876543210 (a made-up number).  Let's say
> this decomposes down to:
>
> Year: 2001
> Month: 7
> Day: 19
> Hour: 2
> Minute: 6
> Second: 30
>
> I would then create a file with the following path:
>
> 2001/7/19/2/6/9876543210
>
> The file 9876543210 might contain something like this:
>
> return {1,2,3, ... ,16,17}
>
> So to get back the table of sample values for a particular time:
>
> x = dofile("2001/7/19/2/6/9876543210")
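
To make the path and file-content steps above concrete, here is a minimal sketch. The names sample_path and serialize_sample are hypothetical, and I am assuming the time stamp is a Unix time in seconds and that the fields should be taken in UTC (use "*t" instead of "!*t" for local time).

```lua
-- Sketch only: sample_path and serialize_sample are hypothetical helpers.

-- Build "Year/Month/Day/Hour/Minute/<timestamp>" from a Unix timestamp.
-- "!*t" asks os.date for the broken-down time in UTC.
function sample_path(ts)
  local t = os.date("!*t", ts)
  return string.format("%d/%d/%d/%d/%d/%d",
                       t.year, t.month, t.day, t.hour, t.min, ts)
end

-- Turn a list of channel values into the text of a loadable Lua file.
function serialize_sample(values)
  local parts = {}
  for i = 1, #values do
    parts[i] = tostring(values[i])
  end
  return "return {" .. table.concat(parts, ",") .. "}"
end
```

Writing serialize_sample(values) to the file at sample_path(ts) gives a file that dofile can read back. The directories have to exist first; plain Lua has no call to create them, so that part is left out here.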
>
> The only downside to this is that Lua doesn't have any built-in functions
> to deal with walking directory structures.  You could add that.  Extending
> Lua is (somewhat) easy and even fun to do.
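
Short of writing a C extension, one stopgap is to ask the shell for the listing. This is only a sketch: list_dir is a made-up name, io.popen is not available on every platform, and "ls" assumes a Unix-like system.

```lua
-- Sketch: list the entries of a directory by running "ls".
-- list_dir is a hypothetical helper, not a standard Lua function.
function list_dir(path)
  -- Wrap the path in single quotes, escaping any quote it contains.
  local quoted = "'" .. string.gsub(path, "'", "'\\''") .. "'"
  -- "--" keeps a path that starts with "-" from being read as an option.
  local pipe = io.popen("ls -1 -- " .. quoted)
  if not pipe then return nil end
  local names = {}
  for name in pipe:lines() do
    names[#names + 1] = name
  end
  pipe:close()
  return names
end
```

Calling list_dir on each level in turn (year, then month, and so on) is enough to walk the tree, at the cost of one process per directory.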
>
> If dealing with individual files is too slow, you can create a file at
> each directory level that summarizes information.  For example, if a
> common request is to look at sample data per hour, you might create in
> each Year/Month/Day/Hour directory a file named "summary" that predigests
> all the samples for that hour's period.
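
What goes into such a summary depends on the queries you expect. As one possible sketch, assuming every sample has the same number of channels and that per-channel averages are what you want (summarize is a hypothetical name):

```lua
-- Sketch: reduce a list of samples (each a list of channel values) to
-- per-channel averages. Averaging is only one way to "predigest" the data.
function summarize(samples)
  local sums, count = {}, #samples
  for _, sample in ipairs(samples) do
    for ch, value in ipairs(sample) do
      sums[ch] = (sums[ch] or 0) + value
    end
  end
  local means = {}
  for ch, total in ipairs(sums) do
    means[ch] = total / count
  end
  return means
end
```

The result is a plain list again, so it can be stored in the "summary" file in the same "return {...}" form as the samples.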
>
> > One of the other requirements is that I be able to
> > delete the data, on a day/date range basis.
> > Just put in NIL's?
>
> Want to delete all data for July 19th, 2001?
>
> rm -rf 2001/7/19
>
> Deleting finer periods of data (say, between 2:34:20am and 6:26:50pm)
> would require a bit more logic.  But if you know ahead of time the kinds
> of ranges over which you'll want to delete data, you can build a directory
> structure to take advantage of it.
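
For the finer ranges, the extra logic can be fairly small if the file names are the time stamps, as in the example path above. A sketch, with in_range as a hypothetical helper that takes the file names of one directory and the first and last time stamps to delete:

```lua
-- Sketch: pick the sample files whose names (timestamps) fall in a range.
-- in_range is a hypothetical helper; names is a list of file names.
function in_range(names, first, last)
  local hits = {}
  for _, name in ipairs(names) do
    local ts = tonumber(name)
    -- Skip non-numeric names such as "summary".
    if ts and ts >= first and ts <= last then
      hits[#hits + 1] = name
    end
  end
  return hits
end
```

Each returned name can then be passed to os.remove together with its directory; whole directories that lie inside the range can still go with a single rm -rf.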
>
>