- Subject: Re: hpairs?
- From: Richard Hundt <richardhundt@...>
- Date: Wed, 29 Dec 2010 17:26:37 +0100
On 12/29/2010 04:28 PM, Luiz Henrique de Figueiredo wrote:
>> I spend most of my Lua coding time writing compilers which produce
>> PUC-Rio Lua VM bytecode
>
> Could you please share some details about this or show examples of the
> input languages?

I've got two variations: one compiler (on hold at the moment) has a
gradual type system (i.e. type annotations are optional, but instead of
type inference, run-time type checks are inserted where typed and untyped
code mix). The input language for this variation looks like this:
https://github.com/richardhundt/gaia/blob/master/lang/kudu/test/test.js
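
To illustrate what inserting run-time checks at the boundary between typed
and untyped code could look like, here is a minimal sketch written as Lua
source. The helper names (check, typed_add, typed_add_from_untyped) are
hypothetical and not taken from the gaia sources, and the actual compiler
emits bytecode rather than Lua source.

```lua
-- Hypothetical helper: verify a value's dynamic type at a typed/untyped boundary.
local function check(value, expected, name)
  local actual = type(value)
  if actual ~= expected then
    error(("%s: expected %s, got %s"):format(name, expected, actual), 2)
  end
  return value
end

-- A function whose parameters were annotated as numbers in the input language.
-- Inside the typed body no further checks are needed.
local function typed_add(a, b)
  return a + b
end

-- What a call from untyped code might be lowered to: the arguments are checked
-- on the way in, because the untyped caller gives no static guarantees.
local function typed_add_from_untyped(a, b)
  return typed_add(check(a, "number", "a"), check(b, "number", "b"))
end
```

Calls between two typed functions would skip the check entirely, which is
where the annotations pay off.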
A more dynamic version has a nominal type system but no static type
checking and looks like this (it's simpler so I'm pushing this one first
until I've got modules and import/export done):
https://github.com/richardhundt/gaia/blob/master/lang/kudu/test/class.js
The syntax is heavily based on the abandoned ECMAScript 4 draft specification
(hence the .js extension), but since it uses LPeg it's pretty easy to
change.
The interesting stuff for you is probably the code generator:
https://github.com/richardhundt/gaia/blob/master/src/gaia/codegen.lua
Register allocation is still a bit sketchy and there are a few other
TODOs, so it's not mature. You can see the code generator being used by
the compiler:
https://github.com/richardhundt/gaia/blob/master/lang/kudu/src/compiler.lua
It does this sort of thing:
Ops{
    Call{ Id"require", String"kudu.runtime" };
    Call{ Index{ Id"kudu", String"load" } };
    Local{ { Id"this" }; Index{ Id"kudu", String"null" } };
    ...
    Call{ Id"(init)" };
    Return{ Id"__package__" };
}
This is a sort of table-based op tree, inspired by metalua syntax, which
the code generator converts to bytecode.
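
For readers unfamiliar with this style, here is a minimal sketch of how
such constructors can be written in plain Lua. It assumes a metalua-style
"tag" field on each node; the node() factory and the dump() walker are
hypothetical stand-ins, not the gaia code generator.

```lua
-- Hypothetical node constructor factory: each constructor tags its argument.
local function node(tag)
  return function(arg)
    if type(arg) ~= "table" then
      arg = { arg }        -- Id"require" passes a bare string
    end
    arg.tag = tag
    return arg
  end
end

local Ops, Call, Id, String = node"Ops", node"Call", node"Id", node"String"

-- Lua's call-without-parentheses syntax for string and table literals
-- is what makes the tree read like a small DSL.
local tree = Ops{
  Call{ Id"require", String"kudu.runtime" };
  Call{ Id"(init)" };
}

-- A trivial tree walker, standing in for a real code generator.
local function dump(n)
  if type(n) ~= "table" then return tostring(n) end
  local parts = {}
  for i, child in ipairs(n) do parts[i] = dump(child) end
  return n.tag .. "(" .. table.concat(parts, ", ") .. ")"
end
```

A real back-end would walk the same structure but emit instructions and
allocate registers instead of building a string.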
Also of interest might be the grammar (minus the type expressions, which
have been factored out for now):
https://github.com/richardhundt/gaia/blob/master/lang/kudu/src/grammar.lua
It uses an LPeg wrapper which is specialized for (procedural)
programming languages. It has some shortcuts for creating commonly
used expression-matching patterns, so you can say this sort of thing
(where "p" is the parser object):
local expr_base = p:express"expr_base" :primary"term"
expr_base:op_infix("**"):prec(35)
expr_base:op_prefix"new":prec(40)
expr_base:op_postfix("++", "--"):prec(35)
expr_base:op_ternary"?:":prec(2)
expr_base:op_circumfix"()":prec(50)
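
The chained calls above rely on each op_* method returning something with
a prec() method. A minimal sketch of that pattern follows; the Expr table,
new_expr() and op_method() are hypothetical and only record the operators,
whereas the real library builds LPeg patterns from them.

```lua
-- Hypothetical expression-builder object; not the gaia parser library.
local Expr = {}
Expr.__index = Expr

local function new_expr(name)
  return setmetatable({ name = name, ops = {} }, Expr)
end

-- Each op_* method registers one rule per symbol and returns a handle whose
-- prec() sets the precedence of those rules and returns the builder.
local function op_method(kind)
  return function(self, ...)
    local added = {}
    for _, symbol in ipairs({ ... }) do
      local rule = { kind = kind, symbol = symbol, prec = 0 }
      self.ops[#self.ops + 1] = rule
      added[#added + 1] = rule
    end
    local builder = self
    return {
      prec = function(_, level)
        for _, rule in ipairs(added) do rule.prec = level end
        return builder
      end,
    }
  end
end

Expr.op_infix   = op_method"infix"
Expr.op_prefix  = op_method"prefix"
Expr.op_postfix = op_method"postfix"

local expr = new_expr"expr_base"
expr:op_infix("**"):prec(35)
expr:op_prefix"new":prec(40)
expr:op_postfix("++", "--"):prec(35)
```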
It's all pretty rough at the moment (and the parser library is arguably
overcooked), but this whole project is something of an experiment to see
how far the goals of the Parrot VM can be achieved on the Lua VM.
It's interesting for me because Lua has gone in the exact opposite
direction to Parrot. Whereas Parrot tries to cram in every conceivable
feature for every conceivable language, Lua has reduced everything to
the minimum and kept performance high enough that the object model,
native types, etc. can be implemented using tables and userdata, and
still perform well.

> Also, can't you emit Lua source code instead of bytecode? As you
> probably know, bytecode is not portable across versions of Lua;
> e.g., 5.1 bytecode does not run in 5.2 and vice-versa.

There are two reasons for going to bytecode. The first is that Lua the
language doesn't let you do arbitrary branching, so a language with a
'continue label' construct is hard to transform to Lua source. The other
reason is that when producing bytecode you've got control over the debug
data (line numbering, variable names etc.) which you see in stack traces.
If Lua had goto label and line number hinting (like Perl's during string
eval), then I'd have stuck to generating Lua source.
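
To make the branching problem concrete, here is a sketch of the usual
workaround for a plain 'continue' in Lua 5.1 source. The function name
odds() is just an example. The loop body is wrapped in a one-shot
repeat ... until true, so that 'break' leaves only the wrapper:

```lua
-- Collect the odd numbers in 1..n, skipping evens with an emulated `continue`.
local function odds(n)
  local out = {}
  for i = 1, n do
    repeat                            -- one-shot wrapper loop
      if i % 2 == 0 then break end    -- acts as `continue`
      out[#out + 1] = i
    until true
  end
  return out
end
```

The trick falls apart quickly: a genuine 'break' of the outer loop now
needs a flag, and a 'continue label' that targets an enclosing loop needs
a flag per nesting level. With bytecode it is a single jump.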
With the Lua AST layer in between, it's possible to swap out back-ends
to produce different bytecode. Supporting standard 5.2 bytecode is next
and pretty easy, since the differences aren't huge. LJ2 would be nice,
but more challenging, and I'm not sure if this project has the legs to
get that far. I'd need a killer language for it, and I've discovered
that designing the language is harder than implementing it.
Other potential avenues to explore would be Linear Genetic Programming
(LGP) [1] with the Lua VM, where one would use the bytecode generator to
create and mutate the program population. I think the Lua VM has real
potential here because of the small instruction set and limited number
of types, and its footprint is small enough to run tournaments over
pretty large populations.
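
As a rough sketch of the LGP idea, here is a toy version in plain Lua. It
assumes a made-up three-operation register machine rather than real Lua VM
bytecode, and every name in it (OPS, run, mutate, fitness, tournament) is
hypothetical. The random source is passed in as 'rand' so that math.random
or a seeded replacement can be used.

```lua
-- Toy linear program: a list of { op, dst, a, b } over a small register file.
-- This is an illustrative instruction set, not Lua VM bytecode.
local OPS = {
  add = function(x, y) return x + y end,
  sub = function(x, y) return x - y end,
  mul = function(x, y) return x * y end,
}
local OP_NAMES = { "add", "sub", "mul" }
local NREGS = 4

local function random_instr(rand)
  return { OP_NAMES[rand(#OP_NAMES)], rand(NREGS), rand(NREGS), rand(NREGS) }
end

local function random_program(len, rand)
  local prog = {}
  for i = 1, len do prog[i] = random_instr(rand) end
  return prog
end

-- Run a program with r1 = input and r2 = 1; the result is read from r1.
local function run(prog, input)
  local r = { input, 1, 0, 0 }
  for _, ins in ipairs(prog) do
    r[ins[2]] = OPS[ins[1]](r[ins[3]], r[ins[4]])
  end
  return r[1]
end

-- Point mutation: copy the program and replace one instruction.
local function mutate(prog, rand)
  local child = {}
  for i, ins in ipairs(prog) do child[i] = ins end
  child[rand(#child)] = random_instr(rand)
  return child
end

-- Fitness: squared error against a target function on a few inputs
-- (lower is better).
local function fitness(prog, target)
  local err = 0
  for x = -2, 2 do
    local d = run(prog, x) - target(x)
    err = err + d * d
  end
  return err
end

-- Tournament selection: sample k individuals, return the fittest.
local function tournament(pop, k, target, rand)
  local best, best_fit
  for _ = 1, k do
    local cand = pop[rand(#pop)]
    local f = fitness(cand, target)
    if not best or f < best_fit then best, best_fit = cand, f end
  end
  return best, best_fit
end
```

With the real thing, random_instr() and mutate() would go through the
bytecode generator, and run() would be the Lua VM itself.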
Anyway, that's my happy relationship with Lua as it stands, and if any
of this interests anyone besides me, I'd be happy to elaborate.
Cheers,
Richard
[1] https://eldorado.tu-dortmund.de/handle/2003/20098