Table Scope
Described here are various solutions to allow table constructs to enclose a scope so that variables used inside the table have special meaning. In other words, we want something like this:
    local obj = struct "foo" {
                  int "bar";
                  string "baz";
                }
    assert(obj == "foo[int[bar],string[baz]]")
    assert(string.lower("ABC") == "abc")
    assert(int == nil)
Lua is good for data definition (as discussed in Programming in Lua):
    local o = rectangle { point(0,0), point(3,4) }

    -- another example:
    html{ h1"This is a header", p"This is a paragraph", p{"This is a ", em"styled", " paragraph"} }

    -- another example:
    struct "name" {
      int    "bar";
      string "baz";
    }
The syntax is nice. The semantics are less so: string is a global variable scoped outside of struct, even though it only needs to have meaning inside of struct. This can create problems. For example, the above requires redefining the standard table string.
Requiring a prefix will solve the problem but can be cumbersome and ugly, being foreign to the problem domain that data definition intends to describe:
    local struct = require "struct"
    ...
    local o = struct "name" {
      struct.int "bar";
      struct.string "baz";
    }
We could restrict the scope with locals, but defining them can become cumbersome too, especially if your data definition language contains hundreds of tags:
    local struct = require "struct"
    ...
    local o; do
      local int = struct.int
      local string = struct.string
      o = struct "name" {
        int "bar";
        string "baz";
      }
    end
In fact, we might want some word to have different meanings depending on nesting context:
    action { point { at = location { point(3,4) } } }
An alternative might be
    struct "name" { int = "bar"; string = "baz"; }
in which case int and string are now string keys rather than global variables. However, the ordering and multiplicity of the arguments are lost here, and both matter in structs. Another alternative is
    struct "name" . int "bar" . string "baz" . endstruct
Semantically, that's better (except one must not forget the endstruct), but the dot syntax is somewhat unusual. Semantically we might want something that is like an S-expression:
    struct { "name", {"int", "bar"}, {"string", "baz"} }
but syntactically that is lacking.
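To make the trade-off concrete, here is a minimal sketch of how the S-expression form could be evaluated without touching any globals. The handlers table and the output format are assumptions for illustration, chosen to match the string format used in the example at the top of this page.

```lua
-- Hypothetical tag handlers; they live in a private table, so the
-- standard global table "string" is left alone.
local handlers = {
  int    = function(name) return "int[" .. name .. "]" end,
  string = function(name) return "string[" .. name .. "]" end,
}

-- Evaluate { name, {tag, arg}, {tag, arg}, ... } in order.
local function struct(sexpr)
  local name = sexpr[1]
  local fields = {}
  for i = 2, #sexpr do
    local tag, arg = sexpr[i][1], sexpr[i][2]
    local handler = assert(handlers[tag], "unknown tag: " .. tostring(tag))
    fields[#fields + 1] = handler(arg)
  end
  return name .. "[" .. table.concat(fields, ",") .. "]"
end

local obj = struct { "foo", {"int", "bar"}, {"string", "baz"} }
print(obj)
```

Ordering and multiplicity are preserved because the fields sit in the array part of the table, and the tags are plain strings, so nothing leaks into the global namespace. The cost is the extra quoting and bracing.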
There is a "table scope" patch that allows the syntax shown at the top of this page:
    local function struct(name)
      local scope = {
        int = function(name) return "int[" .. name .. "]" end;
        string = function(name) return "string[" .. name .. "]" end
      }
      return setmetatable({}, {
        __scope = scope;
        __call = function(_, t)
                   return name .. "[" .. table.concat(t, ",") .. "]"
                 end
      })
    end

    local obj = struct "foo" {
                  int "bar";
                  string "baz";
                }
    assert(obj == "foo[int[bar],string[baz]]")
    assert(string.lower("ABC") == "abc")
    assert(int == nil)
    print "DONE"
Download patch: [tablescope.patch] (for Lua 5.1.3)
The patch makes no change to Lua syntax or Lua bytecodes. The only change is support for a new metamethod named __scope. If a table construct is used as the last argument of a function call, and the object being called contains a __scope metamethod that is a table, then global variables mentioned inside the table construct are first looked up in __scope. Only if a variable is not found in __scope is it then looked up in the environment table as usual.
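The lookup order described above can be modeled in plain Lua. This is only an illustration of the rule, not code from the patch: the function name scoped_lookup is hypothetical, and it assumes that "not found" means the value is nil.

```lua
-- Model of the lookup rule: consult the __scope table first,
-- then fall back to the environment table.
local function scoped_lookup(scope, env, name)
  local v = scope[name]
  if v ~= nil then return v end
  return env[name]
end

-- Stand-in tables for illustration.
local scope = { int = "scoped int", string = "scoped string" }
local env   = { string = "global string", print = "global print" }

print(scoped_lookup(scope, env, "string"))  -- found in scope
print(scoped_lookup(scope, env, "print"))   -- falls back to env
```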
The patch makes some assumptions about the order of bytecodes to infer how tables nest global variable accesses. It's possible these assumptions would not be met if the bytecodes were not compiled by luac (e.g. compiled by MetaLua). However, the effect of not meeting these assumptions in a certain function would generally be that table scoping is simply not applied to that function, though there may be very unusual cases having security implications.
There could be ways to reduce the performance impact on global accesses using this patch. Suggestions are welcome. For example, the table scoping lookup could be selectively enabled or disabled on a specific function. If performance is a concern, you should be using local variables anyway.
MetaLua generates the exact same bytecode as luac, unless you use Goto or Stat. Besides, if you were using MetaLua, you'd also use it to handle table scopes rather than patching Lua -- FabienFleutot
Avoiding patches, we can do some tricks with the Lua environment table. The following pattern might be used (original idea suggested by RiciLake):
    -- shapes.lua

    local M = {}

    local Rectangle = {
      __tostring = function(self)
        return string.format("rectangle[%s,%s]",
          tostring(self[1]), tostring(self[2]))
      end
    }
    local Point = {
      __tostring = function(self)
        -- %f expects numbers, so the coordinates are passed directly
        return string.format("point[%f,%f]", self[1], self[2])
      end
    }

    function M.point(x,y)
      return setmetatable({x,y}, Point)
    end
    function M.rectangle(t)
      local point1 = assert(t[1])
      local point2 = assert(t[2])
      return setmetatable({point1, point2}, Rectangle)
    end

    return M
    -- shapes_test.lua

    -- with: namespace, [level], [filter] --> (lambda: ... --> ...)
    function with(namespace, level, filter)
      level = level or 1; level = level + 1

      -- Handle __with metamethod if defined.
      local mt = getmetatable(namespace)
      if type(mt) == "table" then
        local custom_with = mt.__with
        if custom_with then
          return custom_with(namespace, level, filter)
        end
      end

      local old_env = getfenv(level)  -- Save

      -- Create local environment.
      local env = {}
      setmetatable(env, {
        __index = function(env, k)
          local v = namespace[k]; if v == nil then v = old_env[k] end
          return v
        end
      })
      setfenv(level, env)

      return function(...)
        setfenv(2, old_env)           -- Restore
        if filter then return filter(...) end
        return ...
      end
    end

    local shapes = require "shapes"

    local o = with(shapes) ( rectangle { point(0,0), point(3,4) } )
    assert(not rectangle and not point) -- note: not visible here
    print(o)
    -- outputs: rectangle[point[0.000000,0.000000],point[3.000000,4.000000]]
The key is the with function, which provides local access to a given namespace. It is similar in purpose to the "with" clause in some other languages like VB and somewhat related to using namespace in C++ or import static in Java. It might also be similar to XML namespaces.
The following special case correctly outputs the same result:
    point = 2
    function calc(x) return x * point end
    local function calc2(x) return x/2 end
    local o = with(shapes) ( rectangle { point(0,0), point(calc2(6),calc(2)) } )
    print(o)
    -- outputs: rectangle[point[0.000000,0.000000],point[3.000000,4.000000]]
The optional arguments to with can be useful when defining wrappers to simplify expressions:
    function shape_context(level)
      return with(shapes, (level or 1)+1, function(x) return x[1] end)
    end

    local o = shape_context() { rectangle { point(0,0), point(3,4) } }
    print(o)
    -- outputs: rectangle[point[0.000000,0.000000],point[3.000000,4.000000]]
Further simplification is possible by automatically invoking with when a global key is accessed:
    setmetatable(_G, {
      __index = function(t, k)
        if k == "rectangle" then
          return with(shapes, 2, function(...) return shapes.rectangle(...) end)
        end
      end
    })

    local o = rectangle { point(0,0), point(3,4) }
    print(o)
    -- outputs: rectangle[point[0.000000,0.000000],point[3.000000,4.000000]]
One caveat is that this approach relies on undocumented Lua behavior. The function name with must be resolved before the arguments to the function are resolved, which is the behavior in the current release of Lua 5.1.
Also, though the desire is for with to provide a type of lexical scoping, and it simulates it fairly well, the implementation is actually more dynamic. The following will cause a run-time error since real lexicals (locals) override globals:
    local point = 123
    local o = with(shapes) ( rectangle { point(0,0), point(3,4) } )
Further, we assume that the environment of the caller can be temporarily replaced with no ill-effects.
Leaving off the final call (accidentally) will not necessarily result in an error but will leave the environment changed:
    local o = with(shapes)
    assert(rectangle) -- oops! rectangle is visible now.
But that does suggest this usage is possible (though not necessarily a good idea):
    local f = with(shapes) -- begin scope
    local o1 = rectangle { point(0,0), point(3,4) }
    local o2 = rectangle { point(0,0), point(5,6) }
    f() -- end scope
    assert(not rectangle) -- rectangle no longer visible
Another problem is that if evaluating the arguments raises an error, the environment will not be restored:
    local last = nil
    function test()
      last = rectangle
      local o = with(shapes) ( rectangle { point(3,4) } ) -- raises error
    end
    assert(not pcall(test))
    assert(not last)
    assert(not pcall(test))
    assert(last) -- oops! environment not restored
Unfortunately, there doesn't seem to be any unobtrusive way to wrap the arguments in a pcall. We can do it using a new pwith function that accepts the data wrapped in the awkward function() return ... end syntax for later evaluation by a pcall:
    local o = pwith(shapes)(function() return
      rectangle { point(3,4) } -- raises error
    end)
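The page does not give an implementation of pwith, so the following is only one possible sketch. It assumes that pwith re-raises the error after the pcall, and it uses a small stand-in shapes table so the block is self-contained. The setfenv fallback for Lua 5.2 and later (built on debug.upvaluejoin) is an addition made here so the sketch also runs on newer interpreters. In Lua 5.1 the built-in setfenv and getfenv are used.

```lua
-- Lua 5.1 has setfenv; the fallback below is an assumption for Lua 5.2+,
-- where a function's environment is its _ENV upvalue.
local setfenv = setfenv or function(f, env)
  local i = 1
  while true do
    local name = debug.getupvalue(f, i)
    if name == nil then break end
    if name == "_ENV" then
      -- Point f's _ENV upvalue at a fresh upvalue holding env.
      debug.upvaluejoin(f, i, function() return env end, 1)
      break
    end
    i = i + 1
  end
  return f
end

-- Stand-in namespace for illustration (simpler than shapes.lua).
local shapes = {}
function shapes.point(x, y) return ("point[%d,%d]"):format(x, y) end
function shapes.rectangle(t)
  local p1, p2 = assert(t[1]), assert(t[2])
  return "rectangle[" .. p1 .. "," .. p2 .. "]"
end

-- pwith(namespace)(f): run f with namespace visible, under pcall.
local function pwith(namespace)
  return function(f)
    local outer = getfenv and getfenv(f) or _G
    local env = setmetatable({}, {
      __index = function(_, k)
        local v = namespace[k]
        if v == nil then v = outer[k] end
        return v
      end
    })
    setfenv(f, env)            -- only f is affected, never the caller
    local ok, result = pcall(f)
    if not ok then error(result, 0) end
    return result
  end
end

local o = pwith(shapes)(function() return
  rectangle { point(0,0), point(3,4) }
end)
print(o)
```

Since only the environment of the data function is replaced, nothing needs to be restored afterwards, even when the function raises an error. Note that in this sketch any global assignments made inside the data function land in the temporary environment table and are discarded.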
In fact, this approach can avoid the issues of reliance on undocumented behavior, danger of leaving off the second call, and the danger of touching the caller's environment as discussed above. We just need to live with the syntax (see also the "Global Collector" pattern above, which applies a similar syntax and semantics).
Another variation is to invoke the pwith after the fact (e.g. outside of the configuration file):
    rect = function() return
      rectangle { point(3,4) }
    end
    ...
    pwith(shapes)(rect)
or perhaps pwith can be triggered upon a __newindex metamethod event on _G.
The non-pcall form may be ok. Just note the contract for using it: if it raises an exception, throw away the function that called it. This may well be a good approach for a configuration language.
Note, however, that the above approaches do not work that well with some methods of DetectingUndefinedVariables. The names rectangle and point may be identified as undefined variables, particularly under static checks for undefined globals (these are global/environment variables not defined in the top-level script).
If we don't need particular access to upvalues, we can stringify the above data function (see the "Stringified Anonymous Functions" pattern in ShortAnonymousFunctions for details):
    local o = pwith(shapes)[[ rectangle { point(x,y) } ]]{x = 3, y = 4}
We lose direct access to lexicals in the caller, but pwith could prepend locals to the data string so that rectangle and point (as well as x and y) become lexicals. pwith could implement it like this, provided the maximum lexical limit is not reached:
    local code = [[
      local rectangle, point, x, y = ...
    ]] .. datastring

    local f = loadstring(code)(namespace.rectangle, namespace.point, x, y)
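A more general version of the fragment above might collect the local names from the namespace and the parameter table instead of hard-coding them. The sketch below is an assumption about how that could look: the name pwith_string and the stand-in shapes table are hypothetical, every key is assumed to be a valid Lua identifier, and the fallbacks for loadstring and unpack are only there so the block also runs on Lua 5.2 and later.

```lua
local loadstring = loadstring or load    -- Lua 5.1 name, fallback for 5.2+
local unpack = unpack or table.unpack

-- Stand-in namespace for illustration (simpler than shapes.lua).
local shapes = {}
function shapes.point(x, y) return ("point[%d,%d]"):format(x, y) end
function shapes.rectangle(t)
  local p1, p2 = assert(t[1]), assert(t[2])
  return "rectangle[" .. p1 .. "," .. p2 .. "]"
end

-- pwith_string(namespace) [[ expression ]] { params }
local function pwith_string(namespace)
  return function(datastring)
    return function(params)
      -- Gather names and values in matching order; params are added
      -- last, so they shadow namespace entries of the same name.
      local names, values, n = {}, {}, 0
      for _, source in ipairs { namespace, params or {} } do
        for k, v in pairs(source) do
          n = n + 1
          names[n], values[n] = k, v
        end
      end
      local prefix = ""
      if n > 0 then
        prefix = "local " .. table.concat(names, ", ") .. " = ...\n"
      end
      local f = assert(loadstring(prefix .. "return " .. datastring))
      return f(unpack(values, 1, n))
    end
  end
end

local o = pwith_string(shapes)[[ rectangle { point(0,0), point(x,y) } ]]{x = 3, y = 4}
print(o)
```

Because every name becomes a local of the loaded chunk, a namespace with more entries than the limit on local variables per function would fail to compile, which is the lexical limit mentioned above.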
Here's an implementation in Metalua:
    -- with.lua

    function with_expr_builder(t)
      local namespace, value = t[1], t[2]
      local tmp = mlp.gensym()
      local code = +{block:
        local namespace = -{namespace}
        local old_env = getfenv(1)
        local env = setmetatable({}, {
          __index = function(t,k)
            local v = namespace[k]; if v == nil then v = old_env[k] end
            return v
          end
        })
        local -{tmp}
        local f = setfenv((|| -{value}), env)
        local function helper(success, ...)
          return {n=select('#',...), success=success, ...}
        end
        let -{tmp} = helper(pcall(f))
        if not -{tmp}.success then error(-{tmp}[1]) end
      }
      -- NOTE: Stat seems to only support returning a single value.
      --       Multiple return values are ignored (even though attempted here)
      return `Stat{code, +{unpack(-{tmp}, 1, -{tmp}.n)}}
    end

    function with_stat_builder(t)
      local namespace, block = t[1], t[2]
      local tmp = mlp.gensym()
      local code = +{block:
        local namespace = -{namespace}
        local old_env = getfenv(1)
        local env = setmetatable({}, {
          __index = function(t,k)
            local v = namespace[k]; if v == nil then v = old_env[k] end
            return v
          end
        })
        local -{tmp}
        local f = setfenv(function() -{block} end, env)
        local success, msg = pcall(f)
        if not success then error(msg) end
      }
      return code
    end

    mlp.lexer.register { "with", "|" }

    mlp.expr.primary.add {
      "with", "|", mlp.expr, "|", mlp.expr,
      builder=with_expr_builder
    }

    mlp.stat.add {
      "with", mlp.expr, "do", mlp.block, "end",
      builder=with_stat_builder
    }
Example usage:
    -{ dofile "with.luac" }

    local shapes = require "shapes"

    rectangle = 123  -- no problem

    local o = with |shapes|
              rectangle { point(0,0), point(3,4) }
    print(o)
    --outputs: rectangle[point[0.000000,0.000000],point[3.000000,4.000000]]

    local o
    with shapes do
      o = rectangle { point(0,0), point(3,4) }
    end
    print(o)
    --outputs: rectangle[point[0.000000,0.000000],point[3.000000,4.000000]]

    local other = {double = function(x) return 2*x end}

    local o = with |other| with |shapes|
              rectangle { point(0,0), point(3,double(4)) }
    print(o)
    --outputs: rectangle[point[0.000000,0.000000],point[3.000000,8.000000]]

    local o
    local success, msg = pcall(function()
      o = with |shapes| rectangle { point(0,0) }
    end)
    assert(not success)
    assert(rectangle == 123) -- original environment
Update (2010-06): The setfenv above can be avoided, as would be necessary in Lua 5.2.0-work3 (see below).
Lua-5.2.0-work3 removes setfenv (though preserves partial support for it in the debug library). This invalidates many of the "Environment Trick" techniques above, although they weren't that good anyway.
In Lua-5.2.0-work3, _ENV would permit
    local o = (function(_ENV) return rectangle { point(0,0), point(3,4) } end)(shapes)
That has similar qualities to the above 5.1 solution "pwith(shapes)(function() return ..... end)" but doesn't need a pcall. The extra function above is only used to introduce a lexical (_ENV) in the middle of an expression, but we could remove that function if we define _ENV in a separate statement:
    local o; do local _ENV = shapes
      o = rectangle { point(0,0), point(3,4) }
    end
That could be cleaner syntactically, but it's not that bad, unless we need to switch environments a lot (e.g. evaluate the arguments of shapes, rectangle, and point each in their own environments).
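In both _ENV forms above, shapes itself becomes the environment, so ordinary globals such as tonumber are not visible inside the data expression. A possible refinement, sketched here under the assumption of Lua 5.2 or later, wraps the namespace in a proxy that falls back to the enclosing environment, much as the 5.1 with function does. The helper name scope and the stand-in shapes table are hypothetical.

```lua
-- Assumes Lua 5.2 or later; in Lua 5.1 a local named _ENV has no special meaning.

-- Stand-in namespace for illustration (simpler than shapes.lua).
local shapes = {}
function shapes.point(x, y) return ("point[%d,%d]"):format(x, y) end
function shapes.rectangle(t)
  local p1, p2 = assert(t[1]), assert(t[2])
  return "rectangle[" .. p1 .. "," .. p2 .. "]"
end

-- Hypothetical helper: names resolve in namespace first, then in outer.
local function scope(namespace, outer)
  return setmetatable({}, {
    __index = function(_, k)
      local v = namespace[k]
      if v == nil then v = outer[k] end
      return v
    end
  })
end

local o
do
  local _ENV = scope(shapes, _ENV)
  -- rectangle and point come from shapes; tonumber comes from the outer environment
  o = rectangle { point(0, 0), point(3, tonumber("4")) }
end
print(o)
```

Passing the current _ENV as the outer table also lets such blocks nest, with each level falling back to the one around it.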