- Subject: Re: Understanding 'volatile' in lstrlib.cpp (Lua 5.3)
- From: Andrew Gierth <andrew@...>
- Date: Thu, 01 Oct 2020 07:48:59 +0100
>>>>> "Roberto" == Roberto Ierusalimschy <roberto@inf.puc-rio.br> writes:
Roberto> In ANSI C, you cannot assign to a field in an union and read
Roberto> from another field, which is exactly the purpose of Ftypes.
Roberto> The "volatile" avoids optimizations based on that knowledge,
Roberto> ensuring that the code executes "as expected"
We don't find this qualifier to be necessary in PostgreSQL, which uses
exactly this trick when storing float values into the generic Datum type
(which is declared as uintptr_t) and vice-versa:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/include/postgres.h;h=c48f47e930ad05d3ae2e24b0c0c85662cd058a7b;hb=HEAD#l693
static inline float8
DatumGetFloat8(Datum X)
{
    union
    {
        int64       value;
        float8      retval;
    }           myunion;

    myunion.value = DatumGetInt64(X);
    return myunion.retval;
}

static inline Datum
Float8GetDatum(float8 X)
{
    union
    {
        float8      value;
        int64       retval;
    }           myunion;

    myunion.value = X;
    return Int64GetDatum(myunion.retval);
}
Both of these compile (on amd64 with any modern compiler) to simple move
instructions. To be stricter we could have used uint64 instead of int64,
but in practice we don't support platforms where integers are not
two's-complement, so the point is moot.
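For readers who want to try the same trick outside the PostgreSQL tree, here
is a minimal self-contained sketch using uint64_t rather than int64. The
names double_to_bits and bits_to_double are made up for this illustration
and are not PostgreSQL functions; the sketch assumes a C11 compiler (for
static_assert) and a 64-bit double.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption of this sketch: double and uint64_t have the same size. */
static_assert(sizeof(double) == sizeof(uint64_t),
              "sketch assumes a 64-bit double");

/* Hypothetical helper: reinterpret the bits of a double as a uint64_t
 * by writing one union member and reading the other. */
static inline uint64_t
double_to_bits(double x)
{
    union
    {
        double      value;
        uint64_t    retval;
    }           u;

    u.value = x;
    return u.retval;
}

/* Hypothetical helper: the inverse reinterpretation. */
static inline double
bits_to_double(uint64_t bits)
{
    union
    {
        uint64_t    value;
        double      retval;
    }           u;

    u.value = bits;
    return u.retval;
}
```

A round trip such as bits_to_double(double_to_bits(x)) should give back x
unchanged, and no volatile qualifier appears anywhere.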
The spec allows for the possibility of such conversions to generate
"trap representations" of the destination type (for example
reinterpreting an integer as a float might result in a signalling NaN,
with an exception raised if the value is ever used). But it explicitly
_does not_ declare this to be undefined behavior; what it says is that
the value of the stored type is reinterpreted as being of the accessed
type. Note especially that the type 'unsigned char' is not allowed to
have trap representations and that accessing a trap representation of
any type via lvalues of character type is _not_ undefined behavior.
Incidentally, the spec also explicitly makes it legal to access the
representation of any object by aliasing it as a character type. So it
is legal for example to use memcpy(charptr, &floatvar, sizeof(floatvar))
to copy the representation (and modern compilers will generally optimize
this to a single move instruction too).
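As a rough sketch of the memcpy route just described: the helper names
store_double_bytes and load_double_bytes are invented for this example, and
the static_assert assumes C11 and a 64-bit double.

```c
#include <assert.h>
#include <string.h>

/* Assumption of this sketch: double occupies exactly 8 bytes. */
static_assert(sizeof(double) == 8, "sketch assumes a 64-bit double");

/* Hypothetical helper: copy the object representation of a double into
 * a byte buffer. Accessing any object through character-type lvalues is
 * permitted, so no union and no volatile are needed. */
static inline void
store_double_bytes(unsigned char *out, double x)
{
    memcpy(out, &x, sizeof(x));
}

/* Hypothetical helper: rebuild a double from bytes previously written
 * by store_double_bytes on the same platform. */
static inline double
load_double_bytes(const unsigned char *in)
{
    double      x;

    memcpy(&x, in, sizeof(x));
    return x;
}
```

Storing a value and loading it back from the same buffer should return the
original double; the byte order inside the buffer is whatever the platform
uses natively.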
In short, the qualifier is not required either by the spec or (as far as
I can determine) actual compilers; if you have an example where a
problem actually occurred with this I would be interested to know
details.
The volatile qualifier likely heavily pessimizes the code here, because
it forces the copywithendian loop to actually go byte-by-byte to memory,
which will cause a delay due to failure of store-forwarding; whereas
without a volatile qualifier the compiler will usually optimize it to a
byteswap instruction (where available) or a vectorized shuffle.
For example, on clang 10, amd64, -O2, this code:
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static void copy_endian(unsigned char *dst,
                        unsigned char *src,
                        size_t sz,
                        bool swap)
{
    if (swap)
        for (dst += sz; sz > 0; --sz)
            *--dst = *src++;
    else
        memcpy(dst, src, sz);
}

void emit_dbl(unsigned char *ptr, double d, bool swap)
{
    copy_endian(ptr, (unsigned char *) &d, sizeof(d), swap);
}
compiles to this:
emit_dbl:                               # @emit_dbl
        test    sil, sil
        je      .LBB1_2
        movq    rax, xmm0
        bswap   rax
        mov     qword ptr [rdi], rax
        ret
.LBB1_2:
        movsd   qword ptr [rdi], xmm0
        ret
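For comparison, here is a sketch of a volatile-qualified variant that can be
compiled next to emit_dbl above. It is a hypothetical reduction written for
this message, not the actual Lua source; the names copy_endian_volatile and
emit_dbl_volatile are invented, and the static_assert assumes C11 and a
64-bit double.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption of this sketch: double occupies exactly 8 bytes. */
static_assert(sizeof(double) == 8, "sketch assumes a 64-bit double");

/* Hypothetical variant of copy_endian: the volatile qualifiers oblige
 * the compiler to perform every byte load and store individually. */
static void
copy_endian_volatile(volatile unsigned char *dst,
                     const volatile unsigned char *src,
                     size_t sz,
                     bool swap)
{
    if (swap)
        for (dst += sz; sz > 0; --sz)
            *--dst = *src++;        /* reversed byte order */
    else
        while (sz-- > 0)
            *dst++ = *src++;        /* same byte order */
}

void
emit_dbl_volatile(unsigned char *ptr, double d, bool swap)
{
    /* volatile union: store as a double, read back as bytes */
    volatile union
    {
        double          d;
        unsigned char   buff[sizeof(double)];
    }           u;

    u.d = d;
    copy_endian_volatile(ptr, u.buff, sizeof(u.buff), swap);
}
```

The result in memory should be the same as for emit_dbl; the difference to
look for is in the generated code, where the volatile accesses are expected
to stay as individual byte operations rather than collapsing into a bswap.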
--
Andrew.