Soundex spelling corrector
Frederic W. Brehm
fwb at demon.siemens.com
Wed Jan 18 05:41:29 AEST 1989
The Soundex routine <19090 at agate.BERKELEY.EDU> posted by Dean Pentcheff
(dean at violet.berkeley.edu) can be made into a simple spelling corrector. I
did this to play around with his routine, and thought others might find it
amusing. There is probably a better method for making a spelling
corrector, but this was fun.
Copyright? Nah.
Step 1. Modify Dean's program to print the soundex code along with the
input string. Replace the printf statement with:
    printf("%s %s", soundex(instring), instring);
Remember to remove the \n in the original format string!
Step 2. Compile soundex.c
    cc -O -DTESTPROG soundex.c -o soundex
Step 3. [Optional] Compute and store soundex values for all words in
/usr/dict/words
    soundex < /usr/dict/words | sort > words.soundex
Step 4. Cut out the following shell script. Make it executable, and have
fun.
#------------- cut here for mispel -----------------
#! /bin/sh
# A simple spelling corrector.
# Usage:
# mispel word
#
# All words from the system dictionary with the same soundex code as word
# are printed to the standard output.
DICT=/WHERE/THIS/LIVES/words.soundex
SOUNDEX=/WHERE/THAT/LIVES/soundex
# calculate the soundex value for the input word and put it in $1
# (the backquotes are deliberately unquoted: the shell splits the output
# into the soundex code, $1, and the word itself, $2)
set -- `echo "$1" | "$SOUNDEX"`
# did you cache the soundex dictionary?
if [ -f "$DICT" ]; then
    # look up the word in the cached dictionary
    look -d "$1" "$DICT" | awk '{ print $2 }'
else
    # calculate the soundex value for the system dictionary and print
    # all words which match the input word
    "$SOUNDEX" < /usr/dict/words | fgrep "$1" | awk '{ print $2 }'
fi
#------------- end of mispel -----------------
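
Dean's soundex.c is not reproduced in this post, so here is a rough sketch
of what the modified file from Steps 1 and 2 might look like. The soundex()
body below is a stand-in written for illustration (a plain four-character
Soundex), not Dean's code, and its signature and static return buffer are
assumptions. The driver also assumes the input is read with fgets(), which
would explain why the \n has to go from the format string.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for Dean's routine (NOT his code): classic 4-character Soundex.
 * Returns a pointer to a static buffer that is overwritten on each call.
 * Assumes an ASCII character set. */
const char *soundex(const char *in)
{
    /* Digit for each letter A..Z; '0' marks letters that are not coded. */
    static const char codes[] = "01230120022455012623010202";
    static char out[5];
    char prev = '0';    /* code of the previous letter, for duplicate checks */
    int n = 0;          /* number of output characters filled so far */

    strcpy(out, "0000");                /* short words are zero-padded */
    for (; *in != '\0' && n < 4; in++) {
        int c = (unsigned char)*in;
        char code;

        if (c >= 'a' && c <= 'z')
            c = c - 'a' + 'A';          /* fold to upper case */
        if (c < 'A' || c > 'Z')
            continue;                   /* skip newline, digits, punctuation */
        code = codes[c - 'A'];
        if (n == 0)
            out[n++] = (char)c;         /* the first letter is kept as-is */
        else if (code != '0' && code != prev)
            out[n++] = code;            /* start of a new consonant group */
        if (c != 'H' && c != 'W')
            prev = code;                /* H and W do not split equal codes */
    }
    return out;
}

#ifdef TESTPROG
/* Test driver as changed in Step 1: print the code, then the input line. */
int main(void)
{
    char instring[256];

    /* fgets() keeps the trailing newline, so the format string has none */
    while (fgets(instring, sizeof instring, stdin) != NULL)
        printf("%s %s", soundex(instring), instring);
    return 0;
}
#endif
```

Built with the cc line from Step 2, "echo Robert | soundex" prints
"R163 Robert", which is the "code word" line format that mispel expects
on its input and in words.soundex.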
--
Frederic W. Brehm Siemens Corporate Research Princeton, NJ
fwb at demon.siemens.com -or- ...!princeton!siemens!demon!fwb