hash function for text in C
Ron Irvine
ron at sco.COM
Sat Oct 20 06:54:08 AEST 1990
I have used many hash functions. Each one has advantages
and disadvantages. Here is a powerful but simple one.
It simply does a crc16 on the character string. The crc16 is
a great hash function since it is designed to produce a
unique code for different character strings. So for "reasonable"
text it gives a good distribution of hash numbers (a big problem
to solve in general). The crc16 function I list below uses nibbles
to build the crc from each byte, this is faster than a bit by bit and
less room than a 256 entry table needed for a byte by byte (fits
in cache). So, it is fast (no multiply, just shifts and masks),
well behaved and simple. If you have a VAX you could even use the
"crc" instruction instead of the crc16 function - what a CISC.
To use, do a crc16 on the string and the truncate the 16 bit
result to the size you need (this should be a power of 2).
/* crc16 - crc16 routine
*
* R.K. Irvine
*
* This routine returns the crc16 for the string pointed
* to by "in".
* crc16 is given by: x^16 + x^15 + x^2 + 1
*/
unsigned short
crc16(register char *in) {
register unsigned int n, crc;
static unsigned short crc16l[] = {
0x0000,0xc0c1,0xc181,0x0140,
0xc301,0x03c0,0x0280,0xc241,
0xc601,0x06c0,0x0780,0xc741,
0x0500,0xc5c1,0xc481,0x0440,
};
static unsigned short crc16h[] = {
0x0000,0xcc01,0xd801,0x1400,
0xf001,0x3c00,0x2800,0xe401,
0xa001,0x6c00,0x7800,0xb401,
0x5000,0x9c01,0x8801,0x4400,
};
crc = 0;
while (n = (unsigned char)(*in++)) {
n ^= crc;
crc = crc16l[n&0x0f] ^ crc16h[(n>>4)&0x0f] ^ (crc>>8);
}
return(crc);
}
More information about the Comp.lang.c
mailing list