- Subject: Re: C structs library
- From: Sam Roberts <vieuxtech@...>
- Date: Tue, 9 Mar 2010 12:07:04 -0800
On Mon, Mar 8, 2010 at 5:57 PM, David Manura <dm.lua@math2.org> wrote:
> On Fri, Mar 5, 2010 at 2:37 PM, Mark Szpakowski wrote:
>> I notice that the C structs library, downloadable from and described at
>> http://www.inf.puc-rio.br/~roberto/struct/, lists 3 functions in its API
>> (struct.pack, struct.unpack, and struct.size). However, struct.size is
>> missing:
>
> Appears so. It was also asked here:
>
> http://lua-users.org/lists/lua-l/2009-10/msg00472.html .
> http://lua-users.org/lists/lua-l/2009-03/msg00382.html
>
> BTW, new page: http://lua-users.org/wiki/StructurePacking .
Maybe this is useful to someone. Unable to find a succinct summary of
the differences between struct and pack, and frustrated by the
mostly-but-not-quite overlapping feature sets, I wrote up this
description of the choices.
YMMV, but for background, we currently are using pack, chosen at
random a few years ago, and I am absolutely not looking for some
heavyweight object-to-binary mapping library.
# Comparison of features for various pack/unpack libraries.
## pack:
+ has a repetition operator
+ has support for lua_Number (though sensible people leave this as double,
  so this isn't normally useful)
+ has "=" to reset to native endianness
- doesn't support platform independent sized integers
- doesn't support size_t
- doesn't support pascal strings with 4-byte sizes on 64-bit architectures
- doesn't support padding
- doesn't support alignment
- documentation is poor (split across the readme and source file, doesn't
  mention important details such as what the size of a "word" is, or which
  characters are ignored in the format string)
- inconsistent and hard to remember format characters (upper case is
  unsigned, except that lower case "b" is also unsigned; plus the pascal
  string problems)
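
To make the "=" flag in the list above concrete, here is a minimal C sketch
of how a format parser might resolve "<", ">" and "=" to a byte order. The
names host_is_little_endian and resolve_endianness are made up for
illustration; this is not pack's actual code.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian host. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;

    /* Look at the first byte in memory of the 16-bit value 1. */
    memcpy(&first, &probe, 1);
    return first == 1;
}

/*
 * Hypothetical helper: map a format flag to a byte order.
 * Returns 1 for little endian, 0 for big endian, -1 for an unknown flag.
 */
static int resolve_endianness(char flag)
{
    switch (flag) {
    case '<': return 1;
    case '>': return 0;
    case '=': return host_is_little_endian();
    default:  return -1;
    }
}
```

With something like this, "=" is just a way to get back to whatever the host
does after an explicit "<" or ">" earlier in the format string.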
## struct:
+ supports platform independent sized integers
+ elegant support for all pascal strings
- doesn't support a repetition operator
- doesn't support size_t
- doesn't support "," as an ignored character in the format string
- documentation is non-existent
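
For what "platform independent sized integers" buys you, here is a rough C
sketch of writing and reading an unsigned integer of an explicit byte width
and byte order, independent of sizeof(int) or sizeof(long). put_uint and
get_uint are hypothetical names, and this is an assumption about the general
technique, not a copy of what struct does internally.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Write the low `n` bytes (1..8) of `value` into `out`.
 * Returns n, or 0 if n is out of range.
 */
static size_t put_uint(unsigned char *out, uint64_t value, size_t n,
                       int little_endian)
{
    size_t i;

    if (n < 1 || n > 8)
        return 0;
    for (i = 0; i < n; i++) {
        /* Byte i (counting from the least significant) goes first for
           little endian, last for big endian. */
        size_t pos = little_endian ? i : n - 1 - i;
        out[pos] = (unsigned char)(value & 0xff);
        value >>= 8;
    }
    return n;
}

/* Read an `n`-byte (1..8) unsigned integer from `in`; 0 if n is invalid. */
static uint64_t get_uint(const unsigned char *in, size_t n, int little_endian)
{
    uint64_t value = 0;
    size_t i;

    if (n < 1 || n > 8)
        return 0;
    for (i = 0; i < n; i++) {
        /* Consume bytes from most significant to least significant. */
        size_t pos = little_endian ? n - 1 - i : i;
        value = (value << 8) | in[pos];
    }
    return value;
}
```

The point is that the width is chosen by the format, so a 4-byte field stays
4 bytes whether long is 32 or 64 bits on the machine doing the packing.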
## lunary:
Lots of code to make sockets, files, and strings look similar, which
we don't need.
Complex data structure support.
Mid-size code base, but not trivial to read (unlike struct and pack, which are).
Nothing about it really caught my eye as being enough of an improvement.
## vstruct:
Lots of code, including a lexer/parser, compiled patterns, etc. Definitely not
trivial to read.
However, looked like it had good support for complex structures, and I
particularly liked its syntax for packing/unpacking directly into tables,
including tables with named fields. Also supports bit fields, including named
bitfields.
It's a big step to move to this, but it might be worth considering.
It's almost completely native Lua, though whether that makes a difference to us
is not clear; we usually run too fast, not too slow.
# What we could do:
Patch pack to use unsigned int instead of size_t for "a".
Start using vstruct instead of pack.
Patch vstruct with the changes from the mailing list.
Add _ to vstruct to mean "underlying endianness"
Fix the struct unit tests, they assume long is 32-bit.
# Summary of pack and struct format support
"*" is "both are the same"
*: > - big endian
*: < - little endian
pack: = - native
struct: ![num] - alignment
struct: x - padding
struct: b/B - signed/unsigned byte
pack: c/b - ditto
*: h/H - signed/unsigned short
*: l/L - signed/unsigned long
pack: i/I - signed/unsigned int
struct: i/In - signed/unsigned integer with size `n' (default is size of int)
*: f - float
*: d - double
pack: n - a lua_Number (defaults to double, but theoretically can be different)
*: ' ' - ignored
pack: ',' - ignored
struct: s - zero-terminated string
pack: z - ditto
pack: An - on write, n is repetition
pack: An - on read, n is width
pack: p - string preceded by length byte
pack: P - string preceded by length short
pack: a - string preceded by length size_t
struct: cn - sequence of `n' chars (from/to a string); when packing, n==0 means
the whole string; when unpacking, n==0 means use the previous
read number as the string length
pack's use of size_t is not good; we will change it to int, and maybe use
'Z' for size_t pascal strings. struct's approach is better: it can do any
size.
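
As an illustration of a pascal string whose length prefix can be "any size",
here is a small C sketch with a configurable prefix width. pack_pstring and
unpack_pstring are hypothetical helpers; the big-endian prefix and the 1..4
byte limit are assumptions made for the example, not behaviour of either
library.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Write `len` bytes of `s` preceded by a big-endian length field of
 * `width` bytes (1..4). Returns the total bytes written, or 0 if the
 * length does not fit in the prefix or the output buffer is too small.
 */
static size_t pack_pstring(unsigned char *out, size_t cap,
                           const char *s, size_t len, size_t width)
{
    uint64_t n = len;
    size_t i;

    if (width < 1 || width > 4)
        return 0;
    if ((n >> (8 * width)) != 0)          /* length too big for the prefix */
        return 0;
    if (cap < width || len > cap - width) /* not enough room in out */
        return 0;
    for (i = 0; i < width; i++)
        out[i] = (unsigned char)((n >> (8 * (width - 1 - i))) & 0xff);
    memcpy(out + width, s, len);
    return width + len;
}

/*
 * Read a string with a `width`-byte big-endian length prefix from `in`.
 * On success sets *data and *len and returns the bytes consumed;
 * returns 0 if the input is truncated or width is invalid.
 */
static size_t unpack_pstring(const unsigned char *in, size_t avail,
                             size_t width, const unsigned char **data,
                             size_t *len)
{
    uint64_t n = 0;
    size_t i;

    if (width < 1 || width > 4 || avail < width)
        return 0;
    for (i = 0; i < width; i++)
        n = (n << 8) | in[i];
    if (n > avail - width)
        return 0;
    *data = in + width;
    *len = (size_t)n;
    return width + (size_t)n;
}
```

With width as a parameter, the "p", "P" and 4-byte cases are the same code
path, which is roughly why a sized prefix is easier to live with than a
size_t one.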
pack: <fmt>n - same as <fmt> repeated n times, except "A"
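
On the alignment and padding entries in the summary above: a minimal C
sketch of the arithmetic involved, assuming alignment means "pad with bytes
until the offset is a multiple of the item size, capped at some maximum".
That reading, and the names padding_for and item_padding, are my
assumptions for illustration, not taken from struct's source.

```c
#include <stddef.h>

/*
 * Number of pad bytes needed so that `offset` becomes a multiple of
 * `align`. `align` must be a power of two; returns 0 if it is not.
 */
static size_t padding_for(size_t offset, size_t align)
{
    if (align == 0 || (align & (align - 1)) != 0)
        return 0;
    return (align - (offset & (align - 1))) & (align - 1);
}

/*
 * Hypothetical rule: align an item to its own size, but never to more
 * than `max_align` (the number one would give after "!").
 */
static size_t item_padding(size_t offset, size_t item_size, size_t max_align)
{
    size_t align = item_size < max_align ? item_size : max_align;
    return padding_for(offset, align);
}
```

For example, a 4-byte item at offset 1 needs 3 pad bytes under this rule,
which is what a C compiler would typically insert in the matching struct.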
# Actual sizes
Note that long and size_t tend to be the largest size, 32 or 64 bit depending
on system, so uses of them for sizes in networking code are not portable.
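
One way around this, sketched in C under the assumption that 32 bits is
enough for any length you send: narrow the native size to a fixed 4-byte
field with a range check, instead of writing sizeof(size_t) bytes.
put_len32 and get_len32 are hypothetical names, and big endian is just the
order picked for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode `len` as a fixed 4-byte big-endian field.
 * Returns 0 on success, -1 if `len` does not fit in 32 bits.
 */
static int put_len32(unsigned char out[4], size_t len)
{
    uint32_t v;

    if ((uint64_t)len > UINT32_MAX)
        return -1;
    v = (uint32_t)len;
    out[0] = (unsigned char)((v >> 24) & 0xff);
    out[1] = (unsigned char)((v >> 16) & 0xff);
    out[2] = (unsigned char)((v >> 8) & 0xff);
    out[3] = (unsigned char)(v & 0xff);
    return 0;
}

/* Decode a 4-byte big-endian length field. */
static size_t get_len32(const unsigned char in[4])
{
    uint32_t v = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
                 ((uint32_t)in[2] << 8) | (uint32_t)in[3];
    return (size_t)v;
}
```

Both ends then agree on 4 bytes regardless of whether they are 32-bit or
64-bit builds.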
64-bit:
% ./bin/sz
8 bytes void* (unsigned)
8 bytes function* (unsigned)
1 bytes char
2 bytes short
4 bytes int
8 bytes long
8 bytes long long
4 bytes float
4 bytes float
8 bytes double
16 bytes long double
8 bytes time_t
8 bytes suseconds_t
4 bytes pid_t
4 bytes wchar_t
8 bytes size_t (unsigned)
8 bytes ptrdiff_t
8 bytes ssize_t
8 bytes intmax_t
8 bytes uintmax_t (unsigned)
32-bit:
4 bytes void* (unsigned)
4 bytes function* (unsigned)
1 bytes char
2 bytes short
4 bytes int
4 bytes long
8 bytes long long
4 bytes float
4 bytes float
8 bytes double
12 bytes long double
4 bytes time_t
4 bytes suseconds_t
4 bytes pid_t
4 bytes wchar_t
4 bytes size_t (unsigned)
4 bytes ptrdiff_t
4 bytes ssize_t
8 bytes intmax_t
8 bytes uintmax_t (unsigned)
Above is from a utility I keep around:
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>  /* suseconds_t, pid_t, ssize_t */
#include <time.h>       /* time_t */

/* Print sizeof(x); a type counts as unsigned if -1 cast to it is not < 0. */
#define SZ(x) printf("%2zu bytes " #x "%s\n", sizeof(x), \
                     (0 > (x) -1) ? "" : " (unsigned)")

typedef void function(void);

int main(void)
{
    SZ(void*);
    SZ(function*);
    SZ(char);
    SZ(short);
    SZ(int);
    SZ(long);
    SZ(long long);
    SZ(float);
    SZ(float);
    SZ(double);
    SZ(long double);
    SZ(time_t);
    SZ(suseconds_t);
    SZ(pid_t);
    SZ(wchar_t);
    SZ(size_t);
    SZ(ptrdiff_t);
    SZ(ssize_t);
    SZ(intmax_t);
    SZ(uintmax_t);
    return 0;
}