Lua Type Checking |
|
> x = 5 + "ok" stdin:1: attempt to perform arithmetic on a string value
However, unlike in languages like C, there is no built-in mechanism in Lua for type checking the parameters and return values of function calls. The types are left unspecified:
function abs(x) return x >= 0 and x or -x end
This allows a lot of flexibility. For example, a function like print
can accept values of many types. However, it can leave functions underspecified and more prone to usage errors. You are allowed to call this function on something other than a number, though the operations inside the function can will trigger a somewhat cryptic error at run-time (where it would be a compile-time error in C).
> print(abs("hello")) stdin:2: attempt to compare number with string stack traceback: stdin:2: in function 'abs' stdin:1: in main chunk [C]: ?
For improved error reporting, one is typically told to do something like this:
function abs(x) assert(type(x) == "number", "abs expects a number") return x >= 0 and x or -x end
> print(abs("hello")) stdin:2: abs expects a number stack traceback: [C]: in function 'assert' stdin:2: in function 'abs' stdin:1: in main chunk [C]: ?
That's a good suggestion, but one might complain that this imposes additional run-time overhead, it will only detect program errors for code that gets executed instead of all code that is compiled, the types of the function values are entirely in the function's implementation (not available by introspection), and it can involve a lot of repetitive code (especially when the named arguments are passed via a table).
Here is one ways to proactively check named parameters:
local Box = {} local is_key = {x=true,y=true,width=true,height=true,color=true} function create_box(t) local x = t.x or 0 local y = t.y or 0 local width = t.width or 0 local height = t.height or 0 local color = t.color assert(type(x) == "number", "x must be number or nil") assert(type(y) == "number", "y must be number or nil") assert(type(width) == "number", "width must be number be number or nil") assert(type(height) == "number", "height must be number or nil") assert(color == "red" or color == "blue", "color must be 'red' or 'blue'") for k,v in pairs(t) do assert(is_key[k], tostring(k) .. " not valid key") end return setmetatable({x1=x,y1=y,x2=x+width,y2=y+width,color=color}, Box) end
It is relatively a lot of code. In fact, we might want to use error
rather than assert
to provide an appropriate level
parameter for the stack traceback.
Another approach is to place the type check code outside of the original function, potentially with a "function decorator" (see DecoratorsAndDocstrings for background).
is_typecheck = true function typecheck(...) return function(f) return function(...) assert(false, "FIX-TODO: ADD SOME GENERIC TYPE CHECK CODE HERE") return f(...) end end end function notypecheck() return function(f) return f end end typecheck = is_typecheck and typecheck or notypecheck sum = typecheck("number", "number", "->", "number")( function(a,b) return a + b end )
The advantage is that the type information is outside of the function implementation. We can disable all the type checking by switching a single variable, and no added overhead would remain when the functions are executed (though there is some slight added overhead when the functions are built). The typecheck
function could also store away the type info for later introspection.
This approach is similar to the one described in LuaList:/2002-07/msg00209.html (warning: Lua 4).
One way to implemnt such type checking decorators is:
--- Odin Kroeger, 2022, released under the MIT license. do local abbrs = { ['%*'] = 'boolean|function|number|string|table|thread|userdata', ['%?(.*)'] = '%1|nil' } local msg = 'expected %s, got %s.' --- Check whether a value is of a type. -- -- Type declaration grammar: -- -- Declare one or more Lua type names separated by '|' to require that -- the given value is of one of the given types (e.g., 'string|table' -- requires the value to be a string or a table). '*' is short for the -- list of all types but `nil`. '?T' is short for 'T|nil' (e.g., -- '?table' is short for 'table|nil'). -- -- Extended Backus-Naur Form: -- -- > Type = 'boolean' | 'function' | 'nil' | 'number' | -- > 'string' | 'table' | 'thread' | 'userdata' -- > -- > Type list = [ '?' ], type, { '|', type } -- > -- > Wildcard = [ '?' ], '*' -- > -- > Type declaration = type list | wildcard -- -- Complex types: -- -- You can check types of table or userdata fields by -- declarding a table that maps indices to declarations. -- -- > type_check({1, '2'}, {'number', 'number'}) -- nil index 2: expected number, got string. -- > type_check({foo = 'bar'}, {foo = '?table'}) -- nil index foo: expected table or nil, got string. -- > type_check('foo', {foo = '?table'}) -- nil expected table or userdata, got string. -- -- Wrong type names (e.g., 'int') do *not* throw an error. -- -- @param val A value. -- @tparam string|table decl A type declaration. -- @treturn[1] bool `true` if the value matches the declaration. -- @treturn[2] nil `nil` otherwise. -- @treturn[2] string An error message. function type_match (val, decl, _seen) local t = type(decl) if t == 'string' then local t = type(val) for p, r in pairs(abbrs) do decl = decl:gsub(p, r) end for e in decl:gmatch '[^|]+' do if t == e then return true end end return nil, msg:format(decl:gsub('|', ' or '), t) elseif t == 'table' then local ok, err = type_match(val, 'table|userdata') if not ok then return nil, err end if not _seen then _seen = {} end assert(not _seen[val], 'cycle in data tree.') _seen[val] = true for k, t in pairs(decl) do ok, err = type_match(val[k], t, _seen) if not ok then return nil, format('index %s: %s', k, err) end end return true end error(msg:format('string or table', t)) end end --- Type-check function arguments. -- -- Type declaration grammar: -- -- The type declaration syntax is that of @{type_match}, save for -- that you can use '...' to declare that the remaining arguments -- are of the same type as the previous one. -- -- Obscure Lua errors may indicate that forgot the quotes around '...'. -- -- Caveats: -- -- * Wrong type names (e.g., 'int') do *not* throw an error. -- * Sometimes the strack trace is wrong. -- -- @tparam string|table ... Type declarations. -- @treturn func A function that adds type checks to a function. -- -- @usage -- store = type_check('?*', 'table', '?number', '...')( -- function (val, tab, ...) -- local indices = table.pack(...) -- for i = 1, n do tab[indices[i]] = val end -- end -- ) -- -- @function type_check function type_check (...) local decls = pack(...) return function (func) return function (...) local args = pack(...) local decl, prev local n = math.max(decls.n, args.n) for i = 1, n do if decls[i] == '...' then prev = true elseif decls[i] then prev = false decl = decls[i] elseif not prev then break end if args[i] == nil and prev and i >= decls.n then break end local ok, err = type_match(args[i], decl) if not ok then error(format('argument %d: %s', i, err), 2) end end return func(...) end end end
The solutions above present some limitations:
* they are rather verbose, and non-trivial verifications can be hard to read;
* the error messages aren't as clear as those returned by Lua's primitives. Moreover, they indicate an error in the called function, at the place where assert() fails, rather than in the calling function which passed invalid arguments.
The checks library offers a terse, flexible and readable way to produce good error messages. Types are described by strings, which can of course be Lua type names, but can also be stored in an object's metatable, under the __type field. Additional, arbitrary type-checking functions can also be registered in a dedicated checkers
table. For instance, if one wants to check for an IP port number, which must be between 0 and 0xffff, one can define a port
type as follows:
function checkers.port(x) return type(x)=='number' and 0<=x and x<=0xffff and math.floor(x)==0 end
To remove useless boilerplate code, checks()
retrieves the parameters directly from the stack frame, no need to repeat them; for instance, if function f(num, str)
needs a number and a string, it can be implemented as follows:
function f(num, str) checks('number', 'string') --actual function body end
Types can be combined:
* the vertical bar allows to accept several types, e.g. checks('string|number')
accepts both strings and numbers as first parameter.
* a prefix "?" makes a type optional, i.e. also accepts nil
. Functionally, it's equivalent to a "nil|"
prefix, although it's more readable and faster to check at runtime.
* the question mark can be combined with union bars, e.g. checks('?string|number')
accepts strings, numbers and nil
.
* finally, a special "!"
type accepts anything but nil
.
A more detailed description of how the library works can be found in its source's header (https://github.com/fab13n/checks/blob/master/checks.c). The library is part of Sierra Wireless' application framework, accessible here: https://github.com/SierraWireless/luasched. For convinience, it's also available here as a standalone rock: https://github.com/fab13n/checks
As mentioned, run-time type checking will not detect program errors that don't get executed. Extensive test suites are particularly essential for programs in dynamically typed languages so that all branches of the code are executed with all conceivable data sets (or at least a good representation of them) so that the run-time assertions are are sufficiently hit. You can't rely as much on the compiler to do some of this checking for you.
Perhaps we can mitigate this by carrying more complete type information with the values. Below is one approach, though is it more a novel proof-of-concept rather than anything for production use at this point.
-- ExTypeCheck.lua ("ExTypeCheck") -- Type checking for Lua. -- -- In this type model, types are associated with values at run-time. -- A type consists of the set of values the value could have -- at run-time. This set can be large or infinite, so we -- store only a small representative subset of those values. -- Typically one would want to include at least the boundary -- values (e.g. max and min) in this set. -- Type checking is performed by verifying that all values -- in that set are accepted by a predicate function for that type. -- This predicate function takes a values and returns true or false -- whether the value is a member of that type. -- -- As an alternative to representing types as a set of representative -- values, we could represent types more formally, such as with -- first-order logic, but then we get into theorem proving, -- which is more involved. -- -- DavidManura, 2007, licensed under the same terms as Lua itself. local M = {} -- Stored Expression design pattern -- ( http://lua-users.org/wiki/StatementsInExpressions ) local StoredExpression do local function call(self, ...) self.__index = {n = select('#', ...), ...} return ... end function StoredExpression() local self = {__call = call} return setmetatable(self, self) end end -- Whether to enable type checking (true/false). Default true. local is_typecheck = true -- TypeValue is an abstract type for values that are typed -- This holds the both the actual value and a subset of possible -- values the value could assume at runtime. That set should at least -- include the min and max values (for bounds checking). local TypedValue = {} -- Check that value x satisfies type predicate function f. function M.check_type(x, f) for _,v in ipairs(x) do assert(f(v)) end return x.v end -- Type check function that decorates functions. -- Example: -- abs = typecheck(ranged_real'(-inf,inf)', '->', ranged_real'[0,inf)')( -- function(x) return x >= 0 and x or -x end -- ) function M.typecheck(...) local types = {...} return function(f) local function check(i, ...) -- Check types of return values. if types[i] == "->" then i = i + 1 end local j = i while types[i] ~= nil do M.check_type(select(i - j + 1, ...), types[i]) i = i + 1 end return ... end return function(...) -- Check types of input parameters. local i = 1 while types[i] ~= nil and types[i] ~= "->" do M.check_type(select(i, ...), types[i]) i = i + 1 end return check(i, f(...)) -- call function end end end function M.notypecheck() return function(f) return f end end function M.nounbox(x) return x end M.typecheck = is_typecheck and M.typecheck or M.notypecheck M.unbox = is_typecheck and M.unbox or M.nounbox -- Return a boxed version of a binary operation function. -- For the returned function, -- Zero, one, or two of the arguments may be boxed. -- The result value is boxed. -- Example: -- __add = boxed_op(function(a,b) return a+b end) function M.boxed_op(op) return function(a, b) if getmetatable(a) ~= TypedValue then a = M.box(a) end if getmetatable(b) ~= TypedValue then b = M.box(b) end local t = M.box(op(M.unbox(a), M.unbox(b))) local seen = {[t[1]] = true} for _,a2 in ipairs(a) do for _,b2 in ipairs(b) do local c2 = op(a2, b2) if not seen[c2] then t[#t + 1] = op(a2, b2) seen[c2] = true end end end return t end end -- Return a boxed version of a unary operation function. -- For the returned function, -- The argument may optionally be boxed. -- The result value is boxed. -- Example: -- __unm = boxed_uop(function(a) return -a end) function M.boxed_uop(op) return function(a) if getmetatable(a) ~= TypedValue then a = M.box(a) end local t = M.box(op(M.unbox(a))) local seen = {[t[1]] = true} for _,a2 in ipairs(a) do local c2 = op(a2) if not seen[c2] then t[#t + 1] = op(a2) seen[c2] = true end end return t end end TypedValue.__index = TypedValue TypedValue.__add = M.boxed_op(function(a, b) return a + b end) TypedValue.__sub = M.boxed_op(function(a, b) return a - b end) TypedValue.__mul = M.boxed_op(function(a, b) return a * b end) TypedValue.__div = M.boxed_op(function(a, b) return a / b end) TypedValue.__pow = M.boxed_op(function(a, b) return a ^ b end) TypedValue.__mod = M.boxed_op(function(a, b) return a % b end) TypedValue.__concat = M.boxed_op(function(a, b) return a .. b end) -- TypedValue.__le -- not going to work? (metafunction returns Boolean) -- TypedValue.__lt -- not going to work? (metafunction returns Boolean) -- TypedValue.__eq -- not going to work? (metafunction returns Boolean) TypedValue.__tostring = function(self) local str = "[" .. tostring(self.v) .. " in " for i,v in ipairs(self) do if i ~= 1 then str = str .. ", " end str = str .. v end str = str .. "]" return str end -- Convert a regular value into a TypedValue. We call this "boxing". function M.box(v, ...) local t = setmetatable({v = v, ...}, TypedValue) if #t == 0 then t[1] = v end return t end -- Convert a TypedValue into a regular value. We call this "unboxing". function M.unbox(x) assert(getmetatable(x) == TypedValue) return x.v end -- Returns a type predicate function for a given interval over the reals. -- Example: ranged_real'[0,inf)' -- Note: this function could be memoized. function M.ranged_real(name, a, b) local ex = StoredExpression() if name == "(a,b)" then return function(x) return type(x) == "number" and x > a and x < b end elseif name == "(a,b]" then return function(x) return type(x) == "number" and x > a and x <= b end elseif name == "[a,b)" then return function(x) return type(x) == "number" and x >= a and x < b end elseif name == "[a,b]" then return function(x) return type(x) == "number" and x >= a and x <= b end elseif name == "(inf,inf)" then return function(x) return type(x) == "number" end elseif name == "[a,inf)" then return function(x) return type(x) == "number" and x >= a end elseif name == "(a,inf)" then return function(x) return type(x) == "number" and x > a end elseif name == "(-inf,a]" then return function(x) return type(x) == "number" and x <= a end elseif name == "(-inf,a)" then return function(x) return type(x) == "number" and x < a end elseif name == "[0,inf)" then return function(x) return type(x) == "number" and x >= 0 end elseif name == "(0,inf)" then return function(x) return type(x) == "number" and x > 0 end elseif name == "(-inf,0]" then return function(x) return type(x) == "number" and x <= 0 end elseif name == "(-inf,0)" then return function(x) return type(x) == "number" and x < 0 end elseif ex(name:match("^([%[%(])(%d+%.?%d*),(%d+%.?%d*)([%]%)])$")) then local left, a, b, right = ex[1], tonumber(ex[2]), tonumber(ex[3]), ex[4] if left == "(" and right == ")" then return function(x) return type(x) == "number" and x > a and x < b end elseif left == "(" and right == "]" then return function(x) return type(x) == "number" and x > a and x <= b end elseif left == "[" and right == ")" then return function(x) return type(x) == "number" and x >= a and x < b end elseif left == "[" and right == "]" then return function(x) return type(x) == "number" and x >= a and x <= b end else assert(false) end else error("invalid arg " .. name, 2) end end return M
Example usage:
-- type_example.lua -- Test of ExTypeCheck.lua. local TC = require "ExTypeCheck" local typecheck = TC.typecheck local ranged_real = TC.ranged_real local boxed_uop = TC.boxed_uop local box = TC.box -- Checked sqrt function. local sqrt = typecheck(ranged_real'[0,inf)', '->', ranged_real'[0,inf)')( function(x) return boxed_uop(math.sqrt)(x) end ) -- Checked random function. local random = typecheck('->', ranged_real'[0,1)')( function() return box(math.random(), 0, 0.999999) end ) print(box("a", "a", "b") .. "z") print(box(3, 3,4,5) % 4) math.randomseed(os.time()) local x = 0 + random() * 10 - 1 + (random()+1) * 0 print(x + 1); print(sqrt(x + 1)) -- ok print(x); print(sqrt(x)) -- always asserts! (since x might be negative)
Example output:
[az in az, bz] [3 in 3, 0, 1] [8.7835848325787 in 8.7835848325787, 0, 9.99999] [2.9637113274708 in 2.9637113274708, 0, 3.1622760790292] [7.7835848325787 in 7.7835848325787, -1, 8.99999] lua: ./ExTypeCheck.lua:50: assertion failed! stack traceback: [C]: in function 'assert' ./ExTypeCheck.lua:50: in function 'check_type' ./ExTypeCheck.lua:78: in function 'sqrt' testt.lua:30: in main chunk [C]: ?
Note: this approach of values holding multiple values does have some similarities to Perl6 junctions (originally termed "quantum superpositions").
There is a Runtime type checking example in Metalua [2].
The Dao language, partly based on Lua, has built-in support for optional typing [3].
The [Teal language] is typed dialect of Lua that compiles to Lua.
[TypeScriptToLua] is a TypeScript? to Lua transpiler that allows us to write Lua code with TypeScript? syntax and type checking at compile time.
I'd argue that, except for very simple programs or those that always have the same input, disabling type checks in a script is a bad idea. --JohnBelmonte
The comment above was anonymously deleted with "This isn't a forum for drive-by opinions". On the contrary, this is established wiki style. People can make comments on pages, and for points of contention that's considered courteous versus summarily changing the original text. The original author can decide to integrate such comments into the original text or not. --JohnBelmonte