Well, here is one figure I found on the decrease in Direct3D's speed with
and without the FPU preserve flag:
http://discuss.microsoft.com/SCRIPTS/WA-MSD.EXE?A2=ind0504b&L=directxdev&D=1&P=3121
with: 560 fps
without: 580 fps
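For reference, the flag in question is D3DCREATE_FPU_PRESERVE, passed
when creating the device. A minimal Direct3D 9 sketch (my own, not taken
from the linked thread; error handling omitted):

#include <d3d9.h>

IDirect3DDevice9* CreateDeviceWithFpuOption(IDirect3D9* d3d, HWND hwnd,
                                            D3DPRESENT_PARAMETERS* pp,
                                            bool preserveFpu)
{
    DWORD flags = D3DCREATE_HARDWARE_VERTEXPROCESSING;
    if (preserveFpu)
        flags |= D3DCREATE_FPU_PRESERVE; // keep the FPU in its current
                                         // precision mode instead of letting
                                         // D3D switch it to single precision

    IDirect3DDevice9* device = NULL;
    d3d->CreateDevice(D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, hwnd,
                      flags, pp, &device);
    return device;
}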
However, I think it is a bit beside the point to 'prove' this with
numbers, since DirectX more or less already chose single precision for us
(for a good reason, I trust). It also seems logical for a 3D API to be
faster when using floats instead of doubles, because twice the data can
be pushed to the GPU with the same bandwidth / stored in VRAM. Isn't
this the same for OpenGL?
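Just to put numbers on the bandwidth argument, a toy example (the vertex
layouts are made up for illustration, not from any real API): the same
mesh simply costs twice the memory and bus traffic in double precision.

#include <cstdio>

struct VertexF { float  x, y, z, nx, ny, nz; };  // 24 bytes per vertex
struct VertexD { double x, y, z, nx, ny, nz; };  // 48 bytes per vertex

int main()
{
    const size_t count = 1000000;  // a million-vertex mesh
    std::printf("float  vertices: %zu bytes\n", count * sizeof(VertexF));
    std::printf("double vertices: %zu bytes\n", count * sizeof(VertexD));
    return 0;
}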
Looking at the performance of double vs float on modern CPUs should be
interesting, though. Are doubles faster, slower, or the same compared to
floats on 32-bit and 64-bit CPU architectures? And what about the CPUs
people are actually using on average at the moment? (To sell games we
need to look at what is average on the market, not only at what is
top-notch. :)
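If anyone wants to try it on their own machine, here is a crude sketch of
a micro-benchmark (my own, and obviously not a rigorous methodology; the
results will vary a lot with compiler flags, x87 vs SSE2 code generation,
and cache effects):

#include <chrono>
#include <cstdio>
#include <vector>

// Sums a large array in the given precision and returns the elapsed time.
template <typename T>
double TimeSum(size_t n)
{
    std::vector<T> data(n, static_cast<T>(1.000001));
    T sum = 0;
    auto t0 = std::chrono::steady_clock::now();
    for (size_t i = 0; i < n; ++i)
        sum += data[i];
    auto t1 = std::chrono::steady_clock::now();
    // Print the sum so the compiler cannot optimize the loop away.
    std::printf("sum = %f -> ", static_cast<double>(sum));
    return std::chrono::duration<double>(t1 - t0).count();
}

int main()
{
    const size_t n = 50000000;
    std::printf("float:  %.3f s\n", TimeSum<float>(n));
    std::printf("double: %.3f s\n", TimeSum<double>(n));
    return 0;
}

Note that this mostly measures memory bandwidth rather than raw FPU
throughput (the double array is twice as large), which is arguably the
effect that matters for game data anyway.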
Cheers,
Hugo