I'm interested in defining a lightweight library for mutable byte arrays. I'm attaching a draft. My goals are these: define something which can be implemented reasonably efficiently in stock Lua 5.1, that is easy to interoperate with at the C level, and which can be optimized by Lua compilers. Eventually, I hope that different libraries will be able to exchange data as byte arrays instead of strings (where this makes sense), reducing allocations and avoiding hashing. The C interface is the main reason why the arrays are not resizable: otherwise the address of the underlying memory region could change while you cling to a reference to the array object. What do you think about this idea?

This document aims to define the semantics of a Lua module, bytes. Throughout this document, the identifier bytes refers to this module. The objects which are handled by this module are called byte arrays. Conceptually, byte arrays are Lua tables which have a fixed length, and which can only store integers in the range from 0 to 255 (inclusive).

Preconditions are listed in a comment before the expression which is defined; variables in the precondition are given in italics. Different definitions based on preconditions are sometimes listed; if no preconditions match at run time, an error is signaled.

-- b byte array
#b

Evaluates to the number of bytes in b.

-- b byte array
b[i]

If i is an integer between 1 and #b (inclusive), evaluates to the byte at this position, as a number. An error is raised if i is nil. Otherwise, returns nil.

-- i non-negative integer
bytes.new(i)

Creates a new byte array b of length i. The array is initialized to zeros, that is, b[j] == 0 for all 1 ≤ j ≤ i. Byte arrays are distinct objects (even the byte array of length zero), and only compare equal to themselves.

-- b byte array
b[i] = n

If i is not an integer with 1 ≤ i ≤ #b, or if n is not an integer, an error is raised. Otherwise, sets the byte at position i in b to n % 256, so that b[i] == (n % 256) holds after the operation.
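
To make the indexing rules above concrete, here is a small pure-Lua reference model. It is only a sketch of the semantics, not the proposed implementation (which would be userdata, see the C-level notes). The names store, is_integer and make_proxy are made up for this illustration. It assumes that newproxy, the undocumented Lua 5.1 function, is available on 5.1, because Lua 5.1 ignores __len on tables; on Lua 5.2 and later an ordinary table with a metatable is used instead.

```lua
-- Reference model only: byte arrays as proxies in front of a hidden table.
local bytes = {}
local store = setmetatable({}, { __mode = "k" })  -- proxy -> { n = length, [1..n] = byte }

local function is_integer(x)
  return type(x) == "number" and x % 1 == 0       -- rejects fractions, inf and NaN
end

local mt = {}
function mt.__len(b) return store[b].n end
function mt.__index(b, i)
  if i == nil then error("byte array index is nil", 2) end
  local s = store[b]
  if is_integer(i) and i >= 1 and i <= s.n then return s[i] end
  return nil                                      -- any other key reads as nil
end
function mt.__newindex(b, i, n)
  local s = store[b]
  if not (is_integer(i) and i >= 1 and i <= s.n) then
    error("byte array index out of range", 2)
  end
  if not is_integer(n) then error("value is not an integer", 2) end
  s[i] = n % 256                                  -- values wrap modulo 256
end

-- Assumption: newproxy exists on Lua 5.1 (undocumented); it is absent on 5.2+.
local function make_proxy()
  if newproxy then
    local p = newproxy(true)                      -- userdata with a fresh metatable
    local m = getmetatable(p)
    m.__len, m.__index, m.__newindex = mt.__len, mt.__index, mt.__newindex
    return p
  end
  return setmetatable({}, mt)
end

function bytes.new(n)
  assert(is_integer(n) and n >= 0, "length must be a non-negative integer")
  local s = { n = n }
  for j = 1, n do s[j] = 0 end                    -- zero-initialized
  local b = make_proxy()
  store[b] = s
  return b
end

local b = bytes.new(4)
b[1] = 300    -- stored as 300 % 256, i.e. 44
b[2] = -1     -- stored as -1 % 256, i.e. 255
```

Reading b[5] or b[0] yields nil in this model, while writing to either position raises an error, which is the asymmetry the definitions above describe.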

-- b byte array, i, j integers
bytes.byte(b, i)
bytes.byte(b, i, j)

The two-argument form and the three-argument form with j == nil evaluate to b[i]. If i and j are integers between 1 and #b with i ≤ j, the three-argument form evaluates to b[i], b[i + 1], …, b[j]. If j is less than i, bytes.byte(b, i, j) evaluates to no values. If j is larger than #b, the expression evaluates to b[i], b[i + 1], …, b[#b], nil, …, nil with j - #b occurrences of nil.
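
The nil padding is easy to get wrong, so here is a minimal sketch of bytes.byte as a plain Lua function. It is an illustration under assumptions, not the draft's implementation: the local name byte is made up, and a plain table of integers stands in for a byte array, since reading past its end also yields nil.

```lua
local unpack = table.unpack or unpack   -- Lua 5.2+ / Lua 5.1

-- Stand-in for bytes.byte. `b` is any value whose b[k] follows the read
-- rule for byte arrays; the demo below uses a plain table of integers.
local function byte(b, i, j)
  if j == nil then j = i end            -- two-argument form: just b[i]
  local out, count = {}, 0
  for k = i, j do                       -- empty range when j < i
    count = count + 1
    out[count] = b[k]                   -- nil once k exceeds the length
  end
  return unpack(out, 1, count)          -- explicit bounds keep trailing nils
end

local t = { 72, 105, 33 }
local n_values = select("#", byte(t, 2, 5))   -- 105, 33, nil, nil
```

Passing explicit bounds to unpack matters here: without them, the trailing nil values would be dropped and the result count would no longer be j - i + 1.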

-- b byte array, tostring the standard function of that name
tostring(b)

Evaluates to a string of length #b, containing as characters the bytes in b. Equivalent to string.char(bytes.byte(b, 1, #b)).

-- src, dst byte arrays, spos, dpos, len integers
bytes.copy(dst, dpos, src, spos, len)

Copies a slice from the byte array src to dst. Does not return anything. Equivalent to
dst[dpos], dst[dpos + 1], …, dst[dpos + len - 1] = src[spos], src[spos + 1], …, src[spos + len - 1],
except that it is unspecified if any writes occur if dpos + len - 1 > #dst or spos + len - 1 > #src.

-- src string, dst byte array, spos, dpos, len integers
bytes.copy(dst, dpos, src, spos, len)

Copies a slice from the string src to the byte array dst. Equivalent to
dst[dpos], dst[dpos + 1], …, dst[dpos + len - 1] = string.byte(src, spos, spos + len - 1)
(where string refers to the standard string library), except that it is unspecified if any writes occur if dpos + len - 1 > #dst or spos + len - 1 > #src.

-- s string
bytes.new(s)

Returns a copy of the string as a byte array. Equivalent to
ρ = bytes.new(#s); bytes.copy(ρ, 1, s, 1, #s),
where ρ is the value of the expression bytes.new(s).
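
The copy rules can be sketched in a few lines of plain Lua. This is a model under stated assumptions rather than the proposed implementation: byte arrays are represented by plain tables of integers, the names copy and new_from_string are invented for the example, and out-of-range slices (including a negative len or positions below 1, which the draft does not spell out) are rejected before anything is written, which is one of the behaviours the draft permits.

```lua
-- Stand-in model: a byte array is a plain table of integers in 0..255.
local function copy(dst, dpos, src, spos, len)
  -- Assumption: validate up front, so a failing call writes nothing.
  if len < 0 or dpos < 1 or spos < 1
     or dpos + len - 1 > #dst or spos + len - 1 > #src then
    error("slice out of range", 2)
  end
  -- Read the whole source slice first, as a multiple assignment would,
  -- so overlapping copies within one array behave like memmove.
  local tmp = {}
  for k = 0, len - 1 do
    if type(src) == "string" then
      tmp[k + 1] = string.byte(src, spos + k)
    else
      tmp[k + 1] = src[spos + k]
    end
  end
  for k = 0, len - 1 do
    dst[dpos + k] = tmp[k + 1]
  end
end

-- Model of bytes.new(s): allocate #s zero bytes, then copy the string in.
local function new_from_string(s)
  local b = {}
  for j = 1, #s do b[j] = 0 end
  copy(b, 1, s, 1, #s)
  return b
end

local abc = new_from_string("abc")      -- 97, 98, 99
local d = { 0, 0, 0, 0, 0 }
copy(d, 2, abc, 1, 3)                   -- d is now 0, 97, 98, 99, 0
local o = { 1, 2, 3, 4, 5 }
copy(o, 2, o, 1, 4)                     -- overlapping: o is now 1, 1, 2, 3, 4
```

The temporary table is what makes the overlapping case match the multiple-assignment wording: Lua evaluates every value on the right-hand side before it performs any of the assignments.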

At the C level, the byte array objects are userdata. lua_touserdata can be used to obtain a pointer to the start of the byte array, and lua_objlen can be used to obtain the length in bytes. The metatable for byte arrays is registered under the name "bytes.bytearray".