soundex algorithm wanted
BALDWIN
mike at whuxl.UUCP
Thu Sep 4 08:06:54 AEST 1986
> > I would like any info pertaining to soundex search algorithms
> > (phonetic grep). Source to a nifty, efficient algorithm would
> > be great, but I'll take anything. Thanx in advance.
> >
>
> /*********************************************************\
> * This program exemplifies the soundex algorithm. *
> * *
> * You type in a word and it spits out the soundex string *
> * that was produced for that word. *
> \*********************************************************/
Unfortunately, it doesn't generate correct Soundex codes.
The algorithm is actually pretty tricky, and I've seen
lots of versions that don't handle names like Lloyd (doubled
first letter) and Manning (doubled letter mid-name)
properly. Here's one that I believe is correct:
-----
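#include <ctype.h>
#include <string.h>

#define SDXLEN 4

/*
 * Return the Soundex code for name: its first letter (lowercased)
 * followed by up to three digits, padded with zeros.  The result
 * lives in a static buffer that the next call overwrites.
 */
char *
soundex(char *name)
{
    static char buf[SDXLEN+1];
    register char c, lc, prev = '0';
    register int i;

    strcpy(buf, "a000");    /* default; also supplies the zero padding */
    for (i = 0; *name && i < SDXLEN; name++)
        if (isalpha((unsigned char)*name)) {
            lc = tolower((unsigned char)*name);
            /* digit for each letter a-z; '0' means "not coded" */
            c = "01230120022455012623010202"[lc-'a'];
            /* always keep the first letter; after that, skip uncoded
               letters and repeats of the previous letter's digit */
            if (i == 0 || (c != '0' && c != prev)) {
                buf[i] = i ? c : lc;
                i++;
            }
            prev = c;       /* an uncoded letter resets prev to '0' */
        }
    return buf;
}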
-----
And a little driver for it:
-----
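#include <stdio.h>

char *soundex(char *name);

int
main(void)
{
    char line[64];
    /* fgets() bounds the read, unlike gets(); soundex() skips the '\n' */
    while (fgets(line, sizeof line, stdin))
        puts(soundex(line));
    return 0;
}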
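
The routine above hands back a pointer to a static buffer, so a second
call overwrites the first result, and two codes cannot be compared
without copying one of them first. Below is a minimal sketch of a
variant that fills a caller-supplied buffer instead. The names
soundex_r and SDX_LEN and the out parameter are assumptions made for
illustration, not part of the posted code; the coding rules are meant
to be the same, and the letter test assumes a-z are contiguous as in
ASCII.

```c
#include <ctype.h>
#include <string.h>

#define SDX_LEN 4   /* one letter plus three digits */

/*
 * soundex_r: hypothetical re-entrant variant of the posted routine.
 * Writes the code into out[], which must hold at least SDX_LEN + 1
 * chars, and returns out.
 */
char *soundex_r(const char *name, char *out)
{
    /* digit for each letter a..z; '0' marks letters that are not coded */
    static const char code[] = "01230120022455012623010202";
    char prev = '0';
    int i = 0;

    strcpy(out, "a000");            /* same default and padding as above */
    for (; *name != '\0' && i < SDX_LEN; name++) {
        int lc = tolower((unsigned char)*name);
        char c;

        if (lc < 'a' || lc > 'z')   /* assumes contiguous a..z (ASCII) */
            continue;
        c = code[lc - 'a'];
        if (i == 0)
            out[i++] = (char)lc;    /* first letter is kept as a letter */
        else if (c != '0' && c != prev)
            out[i++] = c;           /* new digit, not a repeat */
        prev = c;                   /* uncoded letters reset prev to '0' */
    }
    return out;
}
```

With two separate buffers a and b, soundex_r("Lloyd", a) gives "l300"
and soundex_r("Manning", b) gives "m552", and both results stay valid,
so they can be compared with strcmp().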
--
Michael Baldwin
(not the opinions of) AT&T Bell Laboratories
{at&t}!whuxl!mike