hash algorithm
Jon Zeeff
zeeff at b-tech.UUCP
Fri Aug 19 10:00:26 AEST 1988
In article <2550078 at hpisod2.HP.COM> decot at hpisod2.HP.COM (Dave Decot) writes:
>> could anyone recommend a hashing algorithm to store a list of 10
>> digit phone numbers (area code plus 7 digits).
>Treat the ten digits as an integer, n, and compute the index
>as (n % HASH_SIZE), where HASH_SIZE is the size of your hash table.
>
I don't second Dave's recommendation. What seems like a fairly random
hash function usually doesn't work very well when you actually measure
it's performance. Here is one stolen from pathalias that works well.
/* This is a simplified version of the pathalias hashing function.
* Thanks to Steve Belovin and Peter Honeyman
*
* hash a string into a long int. 31 bit crc (from andrew appel).
* the crc table is computed at run time by crcinit() -- we could
* precompute, but it takes 1 clock tick on a 750.
*
* This fast table calculation works only if POLY is a prime polynomial
* in the field of integers modulo 2. Since the coefficients of a
* 32-bit polynomial won't fit in a 32-bit word, the high-order bit is
* implicit. IT MUST ALSO BE THE CASE that the coefficients of orders
* 31 down to 25 are zero. Happily, we have candidates, from
* E. J. Watson, "Primitive Polynomials (Mod 2)", Math. Comp. 16 (1962):
* x^32 + x^7 + x^5 + x^3 + x^2 + x^1 + x^0
* x^31 + x^3 + x^0
*
* We reverse the bits to get:
* 111101010000000000000000000000001 but drop the last 1
* f 5 0 0 0 0 0 0
* 010010000000000000000000000000001 ditto, for 31-bit crc
* 4 8 0 0 0 0 0 0
*/
#define POLY 0x48000000L /* 31-bit polynomial (avoids sign problems) */
static long CrcTable[128];
static void
crcinit()
{ register int i, j;
register long sum;
for (i = 0; i < 128; ++i) {
sum = 0L;
for (j = 7 - 1; j >= 0; --j)
if (i & (1 << j))
sum ^= POLY >> j;
CrcTable[i] = sum;
}
DEBUG("crcinit: done\n");
}
static long
hash(name, size)
register char *name;
register int size;
{
register long sum = 0L;
while (size--) {
sum = (sum >> 7) ^ CrcTable[(sum ^ (*name++)) & 0x7f];
}
DEBUG("hash: returns (%ld)\n", sum % INDEX_SIZE);
return(sum % INDEX_SIZE);
}
--
Jon Zeeff Branch Technology,
uunet!umix!b-tech!zeeff zeeff%b-tech.uucp at umix.cc.umich.edu
More information about the Comp.lang.c
mailing list