[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: integer implementation of mod
- From: Rici Lake <lua@...>
- Date: Thu, 19 May 2005 11:01:32 -0500
The following code is hereby released into the public domain (although
I'm never adverse to getting credit):
static inline lua_Number imod_aux(lua_Number n, lua_UNumber absm) {
lua_UNumber mask;
lua_Number res;
mask = absm - 1;
if ((absm & (absm ^ mask)) == absm) {
res = n & mask;
if (absm == 0) overflow_error();
}
else if (n < 0)
res = mask - (lua_UNumber)(~n) % absm;
else
res = (lua_UNumber)n % absm;
return res;
}
lua_Number imod(lua_Number n, lua_Number m) {
if (m < 0) {
lua_Number r = imod_aux(n, (lua_UNumber)(-m));
return r ? m + r : r;
}
else
return imod_aux(n, (lua_UNumber)m);
}
------------
This has the same behaviour with respect to signs as the work6
implementation of %; specifically, p % q is either in the range [q+1,
0] or [0, q-1] depending on whether q is negative or positive. I'm not
standing up for this particular definition, by the way, just trying to
be consistent.
The code has been carefully optimised for the case of p % 2^k, but a
check against an implementation based on the Python implementation
(which uses / and *) reveals that it is still faster on both x86 and
ppc32 architectures, significantly faster in the former case.
Some explanation is possibly in order:
m xor m-1 effectively turns all trailing 0's in the binary
representation of m into 1's, and turns all bits to the left of the
rightmost 1 into 0's. Consequently, m bitand (m xor m-1) is exactly
equal to m iff m is a power of 2. If m is a power of 2, then n mod m
can be computed by bitand'ing n with m-1.
The ~n in line 10 is simply based on: ~n === -(n + 1). The version of
gcc I used didn't find this optimisation, for some reason. The point of
line 10 was to avoid an extra test for a zero result; I couldn't come
up with a workable formulation of the same optimisation in line 18, and
the result is evident from timing tests.
I'm not sure that the use of unsigned % is actually necessary; in fact,
both arguments to the % operator in both lines 10 and 12 are guaranteed
to be positive. However, I had at the back of my mind the thought that
unsigned mod might produce better compiled output on some
architectures. That might be completely wrong, though.