- Subject: Re: Drawing the line between speed and simplicity/elegance
- From: "Soni L." <fakedme@...>
- Date: Thu, 07 May 2015 19:46:25 -0300
On 07/05/15 12:38 AM, Brigham Toskin wrote:
Greetings.
I apologize if this is kinda long; I'll try to compress. First, my
situation:
I've been working on a stack-based language, implemented in Lua. Being
a C++ programmer, my first impulse was to wrap an array table with
some ADT metamethods. After some futzing, my stack was relatively fast,
but fairly complicated for what it ultimately is: a LIFO.
Recently, someone pointed out that you could do it with much less code
by wrapping Lua's call stack in a coroutine, manipulating and
returning values in response to external inputs. The simplicity of the
design and the realization that the Lua devs must have implemented a
much faster stack than I ever could led me to explore this space. To
my surprise, my first prototype was actually 50% slower than the
original, under Lua 5.2.3. After thinking about what was going on and
profiling several iterations, I have what is still a simple and (I
think) very clean and elegant solution, utilizing a handful of
mutually-tail-recursive continuations inside a coroutine, and it's
about 13% faster than the ADT!
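For readers who have not seen the idea, here is a minimal sketch of a stack kept in a coroutine's varargs and driven by tail calls. It is only an illustration of the general technique described above, not the actual prototype: the names `loop` and `drop` and the "push"/"pop" protocol are assumptions made up for this example.

```lua
-- Hypothetical sketch: the stack lives in the varargs of `loop` (top first).
-- Each resume passes an operation; each yield reports the current top.
local loop

-- Continuation that discards the top element, then tail-calls back into loop.
local function drop(_, ...)
  return loop(...)
end

function loop(...)
  -- (...) truncates the varargs to the first value, i.e. the top of the stack.
  local op, v = coroutine.yield((...))
  if op == "push" then return loop(v, ...) end
  if op == "pop" then return drop(...) end
  return ...  -- any other op ends the coroutine and returns the stack
end

local stack = coroutine.wrap(loop)
stack()           -- start the coroutine; the stack is empty, so it yields nil
stack("push", 1)  -- yields 1
stack("push", 2)  -- yields 2
stack("pop")      -- yields 1
```

Every operation costs one resume and one yield, which is the context switching suspected above of getting in the way of LuaJIT's tracing.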
Sounds like a win, right? If I run the tests in LuaJIT (2.0.2 or
2.0.3), the newest prototype is even five times faster than under
vanilla Lua. But, it's two orders of magnitude slower than the jit'ed
ADT-style code. Now, this is still an improvement over the first
prototype, which was *three* orders of magnitude slower than the
jit'ed ADT code, but it still ain't great. I very strongly suspect
(after looking at the -jdump) that the heavy use of switching
coroutine contexts is foiling the compiler's ability to trace (and
thus, optimize) the code, and I don't see a fix.
The very specific question: Do we see a workaround, optimization, or
perhaps an alternative implementation, which circumvents what I think
is a limitation of how LuaJIT analyzes Lua code? I can provide github
links to different versions of my code, if anyone thinks it will help,
but I'm pretty sure "it's a coroutine" is a good starting place.
The more general question: Where do we draw the line between writing
simple code, and performance? Or phrased another way, how slow is too
slow, for the sake of an elegant design? When I optimized the ADT
code, it got uglier and more complex. When I optimized the coroutine
prototype, it got simpler and more elegant.
--
Brigham Toskin
Looks like you're talking about me...
First of all, you should AVOID AT ALL COSTS using coroutines in Lua(JIT):
they're slow. As can be seen here[1], I don't use them, so you can
avoid them too.
Second, there's no arguing against KISS. This is how I call Lua
functions[2]. To make a wrapper around a Lua function, you just write
`function(word, i, arg1, arg2, arg3, etc, ...) local v1, v2, v3 =
f(arg1, arg2, arg3, etc) return word, i, v1, v2, v3, ... end`, and
that's it!
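Spelled out for one fixed arity, the wrapper pattern above could look like the sketch below. The helper name `wrap2to1` is made up for this example, and the assumption is that each word receives `word`, `i` and then the stack values (top first) and returns them in the same shape.

```lua
-- Hypothetical helper: wrap a plain Lua function that takes two arguments
-- and returns one result, so it fits the (word, i, stack...) convention.
local function wrap2to1(f)
  return function(word, i, a, b, ...)
    -- f(a, b) is not the last expression in the list, so it is truncated
    -- to one value; the trailing ... passes the rest of the stack through.
    return word, i, f(a, b), ...
  end
end

-- Usage: the stack holds 2 and 3 on top of 10.
local add = wrap2to1(function(a, b) return a + b end)
local word, i, top, below = add("w", 1, 2, 3, 10)
-- top is 5 and below is 10; word and i come back unchanged
```

Other arities follow the same shape: name as many parameters as the function consumes and leave the rest in `...`.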
Third: avoid `select(var, ...)`; LuaJIT doesn't like it.
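One way to stay away from `select` with a variable index is to give each word a fixed arity and bind the stack values by name. This is a sketch with made-up Forth-style words (`dup`, `swap`, `over`), assuming the same `(word, i, stack...)` convention with the top of the stack first. It is not taken from the linked VM.

```lua
-- Pattern to avoid: indexing into the varargs with a runtime value.
local function pick(n, ...)
  return (select(n, ...))
end

-- Alternative: each word names exactly the values it touches.
local function dup(word, i, a, ...)
  return word, i, a, a, ...
end

local function swap(word, i, a, b, ...)
  return word, i, b, a, ...
end

-- over ( x1 x2 -- x1 x2 x1 ): with the top first, a is x2 and b is x1.
local function over(word, i, a, b, ...)
  return word, i, b, a, b, ...
end
```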
[1]: https://github.com/SoniEx2/Stuff/blob/master/lua/Forth/VM.lua#L3-L21
[2]: https://github.com/SoniEx2/Stuff/blob/master/lua/Forth/VM.lua#L13
--
Disclaimer: these emails are public and can be accessed from <TODO: get a non-DHCP IP and put it here>. If you do not agree with this, DO NOT REPLY.