On 25/06/2012 18.46, KHMan wrote:
It probably arrived at similar sizes due to different mechanisms.
File header overhead (and deflate method overheads) is also
significant for small files. I meant that it should not be any more
compressible than an equivalent string of digits output by a strong
encryption method. sieve.number is compressed mainly by reducing the
length of codewords to under 4 bits per symbol due to 10 symbols
(digits), while the stream of digits itself has no real pattern and
cannot be compressed via LZ coding.

I looked at the sizes inside a single zip file to reduce header
overhead, but I agree it's probably just a nice coincidence. It
would be interesting to confirm this by repeating the test on
large files.
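
For anyone who wants to try the large-file test, here is a rough sketch in Python. It is only an illustration under assumptions: the input is a stand-in stream of pseudo-random digits (not sieve.number itself), the names bits_per_digit and digits are made up for this example, and it measures a raw deflate stream via the standard zlib module so that zip/gzip headers are not counted. The floor for ten equally likely symbols is log2(10), about 3.32 bits per digit.

```python
import math
import random
import zlib


def bits_per_digit(data: bytes, strategy: int = zlib.Z_DEFAULT_STRATEGY) -> float:
    """Deflate `data` and return compressed bits per input byte."""
    # wbits=-15 requests a raw deflate stream (no zlib or gzip header),
    # so container overhead does not distort the measurement.
    comp = zlib.compressobj(9, zlib.DEFLATED, -15, zlib.DEF_MEM_LEVEL, strategy)
    out = comp.compress(data) + comp.flush()
    return 8 * len(out) / len(data)


# Assumption: pseudo-random digits stand in for a patternless digit stream.
rng = random.Random(2012)  # fixed seed so the run is repeatable
digits = "".join(rng.choices("0123456789", k=200_000)).encode("ascii")

entropy = math.log2(10)  # theoretical floor for 10 equiprobable symbols
full = bits_per_digit(digits)  # LZ77 matching plus Huffman coding
huffman_only = bits_per_digit(digits, zlib.Z_HUFFMAN_ONLY)  # no LZ77 matching

print(f"entropy floor : {entropy:.3f} bits/digit")
print(f"deflate       : {full:.3f} bits/digit")
print(f"Huffman only  : {huffman_only:.3f} bits/digit")
```

If the quoted explanation is right, the Huffman-only figure should sit at roughly 3.4 bits per digit (six 3-bit and four 4-bit codewords), and enabling LZ matching should not push the result below the entropy floor. Pointing the same function at the bytes of the real files would show whether the similar sizes hold up at scale.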