[Omake] Lm_hash: hash_length is different from the length of hash_data?
Jason Hickey
jyh at cs.caltech.edu
Tue Mar 20 10:50:02 PDT 2007
On Mar 20, 2007, at 10:06 AM, Aleksey Nogin wrote:
> No, not quite.
>
> The relevant code is:
>
> let code = (...) mod hash_length in
> for i = 0 to digest_length - 1 do
>   digest.(i) <- ... lxor (Array.unsafe_get hash_data (code + i))
> done;
>
> So unless I am missing something again, the hash_data needs to have
> the length of hash_length + digest_length - 1, which is 6232, and
> the last 17 entries are unused.
That's right. Basically it means you have some space to tweak
digest_length.
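
To make the bound concrete, here is a minimal sketch of the arithmetic.
The value digest_length = 16 is an assumption (taken from the 16-int
hash_digest array mentioned later in this thread), and hash_length is
then derived from the 6232 figure quoted above. Neither constant is
copied from Lm_hash.

```ocaml
(* Assumed values, for illustration only; not copied from Lm_hash. *)
let digest_length = 16
let hash_length = 6217  (* so that hash_length + digest_length - 1 = 6232 *)

(* Largest index touched by hash_data.(code + i):
   code is at most hash_length - 1, i is at most digest_length - 1. *)
let max_index = (hash_length - 1) + (digest_length - 1)

(* Minimum table size that keeps Array.unsafe_get in bounds. *)
let required_length = max_index + 1

(* Table size implied by "the last 17 entries are unused". *)
let actual_length = required_length + 17

(* Largest digest_length the same table could support. *)
let max_digest_length = actual_length - hash_length + 1

let () =
  Printf.printf "required=%d actual=%d max digest_length=%d\n"
    required_length actual_length max_digest_length
```

Under these assumptions the slack of 17 entries would let digest_length
grow from 16 to 33 before the table itself has to be extended.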
>
> ---
>
> Another question - in the two versions (Code, Digest) of the
> add_bits function we start with
>
> ... (code + i + 1) land 0x3fffffff ... or
> ... (code + i + digest_length) land 0x3fffffff ...
>
> Given that code is between 0 and hash_length - 1, and i is a byte,
> the sum will always be small - so is there any purpose to having
> the land operation here? Do you anticipate calling add_bits from
> somewhere else (outside of the Lm_hash) with larger inputs?
>
It is just defensive programming; there is no reason to prevent it
from being called with a large i.
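
As a rough illustration of what the mask buys, here is a small sketch.
The helper name mask30 is hypothetical and is not part of Lm_hash. Note
that 0x3fffffff is the largest 30-bit value, which is also max_int on
32-bit OCaml, so the masked result is a non-negative int on any
platform.

```ocaml
(* Hypothetical helper: the mask from add_bits, isolated. *)
let mask30 x = x land 0x3fffffff

let () =
  (* A small sum, like code + i + 1, passes through unchanged.
     The value 6000 is arbitrary. *)
  let small = 6000 + 255 + 1 in
  assert (mask30 small = small);
  (* Any input, even a negative one, lands in 0 .. 0x3fffffff. *)
  assert (mask30 (-1) = 0x3fffffff);
  assert (mask30 max_int >= 0 && mask30 max_int <= 0x3fffffff);
  assert (mask30 min_int >= 0 && mask30 min_int <= 0x3fffffff)
```

So for the inputs used inside Lm_hash the operation is a no-op, and it
only matters if a caller passes something much larger.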
> ---
>
> Also - is there any reason why in HashDigest we maintain a
> 16-ints-long hash_digest array and then only use the last byte of
> each int? As opposed to, say, using an 8-ints-long one and using the
> last 2 bytes? Wouldn't that be faster?
There isn't a strong reason. The 16-int version is probably more
random, but the difference may be insignificant.
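
For comparison, the two layouts could be packed into a 16-byte string
roughly as follows. This is only a sketch: pack_low_bytes and
pack_low_words are hypothetical names, the byte order within each int
is an arbitrary choice, and none of this is the actual HashDigest code.

```ocaml
(* Hypothetical helpers; not the HashDigest code. *)

(* 16 ints, low byte of each int -> 16-byte string. *)
let pack_low_bytes (a : int array) : string =
  String.init (Array.length a) (fun i -> Char.chr (a.(i) land 0xff))

(* 8 ints, low two bytes of each int -> 16-byte string. *)
let pack_low_words (a : int array) : string =
  String.init (2 * Array.length a) (fun i ->
    let x = a.(i / 2) in
    (* Even positions take the upper byte of the low 16 bits,
       odd positions take the lowest byte. *)
    let shift = if i land 1 = 0 then 8 else 0 in
    Char.chr ((x lsr shift) land 0xff))

let () =
  let a16 = Array.init 16 (fun i -> 0x1200 + i) in
  let a8 = Array.init 8 (fun i -> 0x4142 + i) in
  assert (String.length (pack_low_bytes a16) = 16);
  assert (String.length (pack_low_words a8) = 16)
```

Both produce the same output size; the 8-int variant updates half as
many array cells per step, which is where any speedup would come from.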
>
> ---
>
> Finally, is the HashDigest actually used somewhere? As far as I can
> tell, it is not used on the current 0.9.8.x branch, perhaps it's
> used somewhere in the 0.9.9.x stuff?
It is there as a possible replacement for Digest. HashDigest may be
faster than Digest, but I haven't done any benchmarking.
Jason
--
Jason Hickey http://www.cs.caltech.edu/~jyh
Caltech Computer Science Tel: 626-395-6568 FAX: 626-792-4257