Thank you, Bogdan. I won't touch the patch on non-x86 platforms unless bugs are found.
What I plan to try on x86, when configured for double+int32, is to let any arithmetic performed on integers "drop" to FP immediately, since the values will not lose accuracy doing so. Ints would still be there; some places, such as counters and table indices, might benefit from having them.
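As a rough illustration of why the drop to FP is lossless: an IEEE 754 double has a 53-bit significand, so every int32 converts exactly, and the sum of two int32 values is exact as well. The helper names below are hypothetical and are not taken from the patch. Note that a product of two large int32 values can exceed 53 bits and would then be rounded.

```c
#include <stdint.h>

/* Hypothetical helper, not from the LNUM patch: add two int32 values
 * by converting both to double first. The result is exact because the
 * sum needs at most 33 bits and a double carries 53. */
static double add_via_fp(int32_t a, int32_t b) {
    return (double)a + (double)b;
}

/* Hypothetical helper: returns 1 if v survives an
 * int32 -> double -> int32 round trip unchanged. */
static int roundtrips_exactly(int32_t v) {
    double d = (double)v;
    return (int32_t)d == v;
}
```

For example, `add_via_fp(INT32_MAX, 1)` yields 2147483648.0, a value that an int32 addition could not hold.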
On the one hand, it feels funny to optimize for one platform so much. On the other, >90% of users are on x86, and if the patch can be adequate there as well, that would be nice. The goal is to avoid two Lua code bases.
-asko

On Thu, 27 Mar 2008 12:07:26 +0200 "Bogdan Marinescu" <bogdan.marinescu@gmail.com> wrote:
You think it's possible to avoid the checks? And even if you could ... don't get me wrong, I know it's your patch :), but I see it much more applicable to "regular" (non-FPU) embedded systems than to desktops. This is where it really shines.

And I intend to do something to prove this, but it will take me a while. Truth be told, I already did something to prove it, but it was just a proof of concept. And of course, when I did that, I didn't have "LNUM", I just set both number types to "int" and got me a fixed-point only Lua. Not exactly a tragedy in most microcontroller-based systems, but still, LNUM is so much better. This is why I intend to expand the proof of concept, and this is why I'm so interested in this patch. I think it's THE way for Lua to go embedded. And I don't mean high-end-OS-based-systems embedded (like plua), I mean low-end-microcontroller embedded.

On Thu, Mar 27, 2008 at 11:49 AM, <askok@dnainternet.net> wrote:

Thanks for the clarification.

The reason for LNUM slowing modern x86 is maybe just that integer calculations (such as a simple increment) need to be range checked to detect a potential fall into the FP realm. For FP, any operation can just be done, without checks.

I'll look for a neato way to bypass this, so x86 double+int32 users won't be hit by the patch (if it gets in some day).

-asko

On Thu, 27 Mar 2008 10:34:54 +0100 David Kastrup <dak@gnu.org> wrote:

> Asko Kauppi <askok@dnainternet.net> writes:
>
>> In my view, Mike's results are in line with the Core Duo results I had
>> measured.
>>
>> What cannot be done is treat x86 as a single optimization target. It
>> is not.
>>
>> Then again, modern x86's are utterly fast on FP, which they calculate
>> as fast as integers.
>
> No, they don't. Integer arithmetic can be parallelized much more
> thoroughly. But when we are working with a byte-code interpreter, the
> parallelization will speed up the byte code interpreter itself, but
> hardly the arithmetic fed into it: the interpreter will not issue
> arithmetic fast enough to make a difference.
>
> --
> David Kastrup
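The range check mentioned in the quoted 11:49 message can be sketched as follows. This is only an assumed illustration: the `Num` type and both function names are hypothetical and do not reflect how the LNUM patch actually represents or tags values. It shows the extra branch an integer add needs that a plain FP add does not.

```c
#include <stdint.h>

/* Hypothetical tagged number; the real patch's layout may differ. */
typedef struct {
    int is_int;                      /* 1: v.i is valid, 0: v.d is valid */
    union { int32_t i; double d; } v;
} Num;

/* Integer add with a range check: compute in 64 bits (which cannot
 * overflow for two int32 inputs), then fall to FP if the result does
 * not fit back into int32. */
static Num num_add_int(int32_t a, int32_t b) {
    Num r;
    int64_t wide = (int64_t)a + (int64_t)b;
    if (wide >= INT32_MIN && wide <= INT32_MAX) {
        r.is_int = 1;
        r.v.i = (int32_t)wide;
    } else {
        r.is_int = 0;
        r.v.d = (double)wide;        /* exact: the value needs at most 33 bits */
    }
    return r;
}

/* FP add: no check needed, the operation is simply performed. */
static double num_add_fp(double a, double b) {
    return a + b;
}
```

In this sketch `num_add_int(INT32_MAX, 1)` returns an FP result of 2147483648.0 instead of wrapping around, which is the behaviour the check is there to guarantee.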