- Subject: table length operator
- From: Gregg Reynolds <dev@...>
- Date: Fri, 17 Mar 2006 23:25:19 -0600
Hello,
Newbie here with an observation or two related to the recent threads on
the "#" operator.
From the ISO standard for Z (a model of clarity and simplicity, IMO):
"binding - finite function from names to values"
"schema - set of bindings"
"A finite sequence is a finite indexed set of values of the same type,
whose domain is a contiguous set of positive integers starting at 1.
_seq X_ is the set of all finite sequences of values of _X_, that is, of
finite functions from the set _1..n_, for some n, to elements of _X_."
Lua tables look like a Z schema sometimes, and a sequence sometimes. A
table _qua_ associative array is equivalent to a Z schema; a table _qua_
integer-indexed array is equivalent to a Z sequence.
Suggestion number 1 is to drop the "array" nomenclature in favor of
"sequence", on the grounds that the former is vague,
implementation-oriented, and unmathematical, whereas the latter is
precise, abstract, unambiguous, and - oh, joy! - enshrined in an
international standard.
This would just be a change in metalanguage - a table that fits the
description looks and acts like a sequence; otherwise, it looks and acts
like a schema, even if it starts with a contiguous run of indices.
Neither sequences nor schemata can map a domain value to nil, by
definition; so the notion of a "hole" in the table is meaningless.
Under this metalanguage, the following does *not* define a sequence, but
a schema in which 5 serves as a name that happens to also denote a number:
    t = {}
    t[5] = 1
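
The distinction can be checked mechanically. What follows is only a minimal sketch: is_sequence is a hypothetical helper, not part of Lua, and it assumes plain tables with no metatables. It counts the keys, then verifies that every index from 1 to that count is present.

```lua
-- Hypothetical helper (not part of Lua): true if t's keys are exactly 1..n.
-- Assumes a plain table with no metatable.
local function is_sequence(t)
  local count = 0
  for _ in pairs(t) do
    count = count + 1
  end
  -- If all of 1..count are present, they account for every key in t,
  -- so no other "names" can exist.
  for i = 1, count do
    if t[i] == nil then
      return false
    end
  end
  return true
end

local s = { "a", "b", "c" }
local t = {}
t[5] = 1

print(is_sequence(s))   --> true
print(is_sequence(t))   --> false
print(is_sequence({}))  --> true (the empty sequence)
```

Under this check the table with only t[5] set is a schema, while the empty table counts as the empty sequence.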
Note that both sequences and schemata are defined in terms of sets. In
Z, the symbol '#' is the _cardinality_ function - not the length
function, even though it is defined in the same section as "sequence".
It means the number of members of a set, which, linguistically at least,
is not at all the same as length. A set of billiard balls could be
construed as a sequence, but nobody would say it has a length of 15.
(Which goes to show that the term "length", like "array", can infect clear
thought with misleading physical metaphor.)
So proposal number two is to add a cardinality operator to the language.
I would suggest '##', leaving '#' to mean length for compatibility
purposes. Then if sq is a sequence table, we have a law that says
##(sq) = #(sq).
To go with that I propose that the "length" of a table t be defined only
for sequence tables, but deprecated. If t is a schema table, '#(t)' is
undefined in principle, nil in implementation - it has a cardinality but
not a length. It strikes me as very messy and counter-intuitive to give
'#' a kind of partial meaning for non-sequences, as the language seems
to do now.
As '#' is currently defined, we have no native means of evaluating
the cardinality of a table that is not a sequence, which is quite
surprising to me. (Maybe I've missed something?)
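
Short of a built-in operator, cardinality can be computed by traversing the table with pairs. The sketch below uses a hypothetical function named cardinality as a stand-in for the proposed '##'. It is an illustration under that assumption, not an existing library call, and it costs one full traversal per call.

```lua
-- Hypothetical stand-in for the proposed '##' operator: counts every key.
local function cardinality(t)
  local n = 0
  for _ in pairs(t) do
    n = n + 1
  end
  return n
end

local sq = { 10, 20, 30 }             -- a sequence table
local sc = { x = 1, y = 2, [5] = 1 }  -- a schema table

print(cardinality(sq) == #sq)  --> true (the law ##(sq) = #(sq))
print(cardinality(sc))         --> 3
```

For the sequence table the count agrees with '#', as the proposed law requires; for the schema table only the cardinality is meaningful.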
Regardless of the fate of those suggestions, the Language Manual for 5.1
seems inconsistent on this subject. In section 2.2 it states
unequivocally that tables "can contain values of all types (except
nil)", and "the value of a table field can be of any type (except nil)".
But in 2.5.5 ("The Length Operator"), it states "If the array has
'holes' (that is, nil values between other non-nil values)...". One way
or another that should be fixed.
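
The section 2.2 wording matches what can be observed: assigning nil to a field removes the field rather than storing a value. A minimal sketch, which deliberately does not print #t because its result for such a table is the very point in question:

```lua
local t = { "a", "b", "c" }
t[2] = nil             -- removes the field; nothing is stored under key 2

local keys = 0
for _ in pairs(t) do
  keys = keys + 1      -- visits keys 1 and 3 only
end

print(keys)            --> 2
print(t[2] == nil)     --> true
print(t[99] == nil)    --> true (a never-assigned key looks the same)
```

After the assignment, key 2 is indistinguishable from a key that was never set, so the table holds two bindings and no nil values.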
Lastly, I hope nobody takes this as too critical of Lua. I just happen
to be mildly obsessed with programming language definition style and
metalanguage. I've only recently begun looking at Lua seriously and I'm
very impressed by both the design concept and the implementation. I was all
set to use Ruby on Rails for a web project when I decided to glance at
Lua, and now I've pretty much decided to take my chances with Lua. I
can think of a million things to do with it already.
best wishes (and sorry about the length),
gregg