On 11/07/2019 23.15, Egor Skriptunoff wrote:
A new syntax:
    with white-list-of-names do
       ....
    end
This statement makes all names from the outer scope (with the exception of white-listed names) invisible in the inner block.

Interesting idea, might be fun.

But these two things in combination create a problem:

The statement works for all names independently of what a name is: local/upvalue/global.

The white-list must contain only visible names, otherwise
a compilation error is raised.

What about to-be-defined "globals"?  (Will you have to "pre-declare" them?)

And what about this?

  Point2D.length =
    function( _ENV )  with x, y do  return (x^2+y^2)^0.5  end  end

(which could (more or less) equivalently be written as

  function Point2D:length( )  return (self.x^2+self.y^2)^0.5  end

but is shorter / more readable this way and already (without with)
gives the guarantee that you'd at most pollute the object you're
working on, not the outer environment.)
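
For reference, the _ENV-parameter form already runs on stock Lua 5.2 or later without any new syntax. A minimal sketch, assuming a Point2D metatable set up as below (the `mark` method and the `seen` field are made up for illustration):

```lua
-- Requires Lua 5.2+ (relies on _ENV); everything except Point2D.length is illustrative.
local Point2D = { }
Point2D.__index = Point2D

-- The object itself is the environment, so x and y resolve to _ENV.x and _ENV.y.
Point2D.length = function( _ENV )  return (x^2+y^2)^0.5  end

-- A stray "global" assignment lands in the object, not in the outer environment.
Point2D.mark = function( _ENV )  seen = true  end

local p = setmetatable( { x = 3, y = 4 }, Point2D )
local len = p:length( )
p:mark( )
```

Here `len` is 5 and `seen` ends up as a field of `p`; nothing is written to the real global table.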

Or even

  do
    local print, _G = print, _G  -- _G is unreachable by name once _ENV is replaced
    local _ENV = { x = 23 }
    function _G.foo( )  with print, x do  print( x )  end  end
    -- up to this point you cannot even tell that _ENV might change
    -- again (so this would break Lua's single-pass compilation)
    function _G.bar( newenv )  _ENV = newenv  end
    -- only now it's clear that _ENV might be changed
  end
  bar { y = 5 }
  foo ()

Because _ENV can change dynamically, you cannot determine at compile
time what names will be valid, so you cannot always raise a compile time
error.
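
The same point can be checked without the proposed statement. A small sketch for Lua 5.2+ (foo and bar are kept as locals here rather than stored in _G, purely to keep the example self-contained):

```lua
local foo, bar
do
  local _ENV = { x = 23 }
  function foo( )  return x  end              -- compiled as _ENV.x
  function bar( newenv )  _ENV = newenv  end  -- rebinds the upvalue shared with foo
end

local before = foo( )  -- looked up in { x = 23 }
bar { y = 5 }
local after = foo( )   -- looked up in { y = 5 }
```

`before` is 23 and `after` is nil: the name x resolved when foo was compiled and stopped resolving only at run time.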

Sergey Kovalev complained recently about the lack of syntax for defining "pure functions" in Lua.
With the "with" statement we could define them easily (3 variants):

    Variant #1

       local function pure_func(x, y)
          with x, y do    -- if the function must be recursive, add its name to this white-list
             -- all upvalues and globals are inaccessible here
             ...
          end
       end

This isn't ideal for "syntactical sandboxing" (you have to look inside
the function to check that it's safe) and you have to write all
arguments twice…

    Variant #2

       local pure_func
       with pure_func do  -- we need to include something in the white-list to be able to pass the constructed function value outside the block
          function pure_func(x, y)
             -- all upvalues and globals are inaccessible here
             ...
          end
       end

This is awfully verbose.  (You have to give the name THREE times!?!)

    Variant #3

       with do      -- white-list is empty, the function value is passed outside the block by "return" statement
          local function pure_func(x, y)
             -- all upvalues and globals are inaccessible here
             ...
          end
          return pure_func
       end

This needs an outer (function() … end)() wrapper or has to live in its
own file, again awfully clunky.

I'd prefer that to work like

  local x, y, z with a, b, c do
    … (both x,y,z AND a,b,c are visible here) …
  end

(but I already have local x,y,z do … end blocks in my code.)  Because it
wouldn't be possible to disambiguate that from

  local x, y, z;
  with a, b, c do … end

without an explicit semicolon, I'd restrict it to always have the
`'local' <NAMELIST> 'with' <NAMELIST> 'do' <BLOCK> 'end'` form.  That
would force blocks with purely external side-effects to take the form

  local _ with io, self, … do  …block…  end

which I personally wouldn't mind… but at that point I'm left wondering:
What exactly is gained over the current possibilities of _ENV?

I start my modules (whether they're files or blocks) with something like

  -- (_M is the exposed module, _MENV / _ENV is the environment)
  local _ENV = setmetatable( { _M = { } }, { __index = _ENV or _G } )
  if setfenv then  setfenv( 1, _ENV )  end
  _M._MENV = _ENV          -- be REPL-/testing-/monkey-patching-friendly
  package.loaded[...] = _M -- no `return _M` needed

which is 5.1-thru-(probably)-5.4-compatible.  Because everything has
_ENV, there's essentially no place where actual global variables can
accidentally be created – everything ends up in a "local environment" at
worst.  Also, (unless I need that tiny bit of extra speed for
recursion,) I don't `local` my internal functions, I intentionally let
them go to the module's environment (for easier monkey patching,
testing, etc.).  Tests can easily check whether any "local global" was
accidentally changed (per module, snapshot the initial foo._MENV after
loading, run all the tests, compare with final state & flag any changes
not explicitly whitelisted – this also catches updates to existing
fields, unlike __newindex-based stuff.)
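
A rough sketch of that snapshot check, assuming a shallow comparison is enough and that all keys are strings (snapshot, changed_keys and the sample env table are hypothetical names, not part of any module shown here):

```lua
-- Shallow copy of an environment table.
local function snapshot( env )
  local copy = { }
  for k, v in pairs( env ) do  copy[k] = v  end
  return copy
end

-- Sorted list of keys that were added, removed or reassigned and are not in `allowed`.
local function changed_keys( before, after, allowed )
  local changed = { }
  for k, v in pairs( after ) do
    if before[k] ~= v and not allowed[k] then  changed[#changed+1] = k  end
  end
  for k in pairs( before ) do
    if after[k] == nil and not allowed[k] then  changed[#changed+1] = k  end
  end
  table.sort( changed )
  return changed
end

-- Stand-in for some module's _MENV.
local env = { counter = 0, limit = 10, helper = function( ) end }
local initial = snapshot( env )

-- Stand-in for a test run: one expected change, one overwrite, one leak.
env.counter = 1
env.limit = 20
env.leak = true

local changed = changed_keys( initial, env, { counter = true } )
```

`changed` comes out as { "leak", "limit" }: the overwrite of the existing field `limit` is flagged along with the new field, which a __newindex hook alone would not see.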

Instead of going { __index = _ENV or _G } for a new _ENV, I could be
more specific, which would get me the white-listing of "globals".  And
because I don't have to `local` the module-internal state (as I have an
actual environment), I actually don't have that many local variables
outside of functions (and those that exist are `do`…`end`-scoped), so I
don't need the hiding of locals at all.  (Stuff that I want to sandbox
goes into its own file anyway and gets its _ENV set via load(), so I
have the guarantee that there won't be any accidental upvalues / locals.)
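
As an illustration of the load() route, here is a sketch for Lua 5.2+ in which the chunk only sees explicitly white-listed globals. The helper whitelist_env, the chunk name and the source string are invented for the example:

```lua
-- Build an environment containing only the named entries of `source`.
local function whitelist_env( names, source )
  local env = { }
  for _, name in ipairs( names ) do  env[name] = source[name]  end
  return env
end

local src = [[
  local function hyp( x, y )  return math.sqrt( x^2 + y^2 )  end
  return hyp, print
]]

-- Text-only chunk whose environment contains nothing but `math`.
-- A loaded chunk has no access to the locals of the file that loads it.
local chunk = assert( load( src, "=sandboxed", "t", whitelist_env( { "math" }, _G ) ) )
local hyp, leaked_print = chunk( )
```

`hyp( 3, 4 )` gives 5, while `leaked_print` is nil because print was not white-listed. Note that the math table is shared by reference, so this restricts which names are visible, not what can be done to the tables behind them.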

So… I guess… have you tried working with _ENV before?  Why was it
insufficient for your purposes?



How the "with" statement should work with the "_ENV" name:
1) If "_ENV" is written explicitly (user wants to directly access the upvalue "_ENV") then such "_ENV" is visible only if it's in the white-list.
2) If "_ENV" is used implicitly (user accesses global variable "var", and the preprocessor replaces "var" with "_ENV.var") then this implicit "_ENV" is always accessible, even when "_ENV" is not white-listed.

This just sounds scary.

I believe the identifier "with" is not used very frequently in Lua programs because it is not a noun/adjective/verb, so not much old Lua code would be broken by introducing this new keyword.

Seems correct, I have an awful lot of zipWith, withOpenFile and other
withFOOs… but only two places where a function parameter is called
'with'… (not related to the above functions).

-- nobody