- Subject: Re: [ANN] MD5 1.0.2 Released (OT)
- From: Steve Heller <steve@...>
- Date: Thu, 10 May 2007 07:53:52 -0500
On Thu, 10 May 2007 02:02:04 -0400, Dave Dodge <dododge@dododge.net>
wrote:
>On Wed, May 09, 2007 at 04:19:17PM +0200, Philippe Lhoste wrote:
>> One (frequent?) use of MD5 is to compute a hash value for files and use
>> it for fast comparison (duplicates, is this file changed?, and so on).
>> I suppose that for this use, it is still OK, at worst involving a binary
>> comparison to be sure (for some uses).
>
>Aside: The Plan9 OS has a filesystem called "Venti", based on the idea
>that each block of data in a stored file can be indexed by its hash
>value. This allows it to store only one copy of the block's data, for
>any number of files or copies of files that contain that data. It's
>intended for archival storage. This design requires that every unique
>block of data, in every file in the filesystem, has a unique hash
>value. A collision results in data loss.
>
> "Using the Sha1 hash function, the probability of a collision is
> less than 10^-20. Such a scenario seems sufficiently unlikely that
> we ignore it [...]"
>
>http://plan9.bell-labs.com/sys/doc/venti.html
For how big a filesystem? The probability of a collision grows with
the number of blocks stored; with enough blocks, it approaches 1.
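
To put rough numbers on that question, here is a minimal sketch using the
standard birthday-bound approximation p ~= 1 - exp(-n(n-1) / (2 * 2^b)) for
n blocks and a b-bit hash. It assumes the hash output is uniformly random
(which ignores any weakness in the hash itself), and the function name and
the block counts are made up for illustration; they are not taken from the
Venti paper.

```python
import math

def collision_probability(num_blocks: int, hash_bits: int = 160) -> float:
    """Approximate chance that at least two of num_blocks distinct blocks
    share a hash value, assuming uniformly random hash_bits-bit outputs
    (birthday-bound approximation). 160 bits is the SHA-1 output size."""
    # Expected number of colliding pairs: n(n-1)/2 pairs, each colliding
    # with probability 2**-hash_bits. Python ints keep this exact until
    # the final division.
    expected_pairs = (num_blocks * (num_blocks - 1)) / (2 * 2**hash_bits)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-expected_pairs)

if __name__ == "__main__":
    # Illustrative block counts only.
    for exponent in (30, 47, 70, 80, 90):
        n = 2**exponent
        print(f"2^{exponent} blocks: p ~= {collision_probability(n):.3g}")
```

Under these assumptions the probability stays negligible until the block
count gets near 2^80, which is far beyond any realistic amount of storage,
and only then climbs toward 1.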
Steve