Storing substrings: some real C code in comp.lang.c!
David Hutchens
hutch%citron.cs.clemson.edu at hubcap.clemson.edu
Thu Oct 26 01:55:49 AEST 1989
>From article <1989Oct25.091519.10718 at twwells.com>, by bill at twwells.com (T. William Wells):
> I have been beating my head against the wall, looking for speed,
> *speed*, and MORE SPEED in this program I'm working on. At only
> 800 WPM (on my 16MHz '386) it is not exactly a speed demon.
> Especially since it has to be run on large lists of words, none of
> which are smaller than 80,000 words.
>
> Bill { uunet | novavax | ankh | sunvice } !twwells!bill
> bill at twwells.com
First, forget the tree, and the hash. What you need is a Finite State
Automaton. The basic idea for this can be found in the May 1988
issue of CACM in an article on an implementation of SCRABBLE(tm).
I've implemented a version of this and I can build the FSA in under
a minute (for about 50000 words) on a sun-3 and under 5 minutes on
a 80286. If possible, you should build the FSA for the words in
a separate processing step since it does take a bit of memory.
You'll need about a Meg.
The FSA, by its very nature allows you to get at prefix strings
very easily (that is quite useful in SCRABBLE).
David Hutchens
hutch at hubcap.clemson.edu
PS: Since you are doing something confidential, I assume it is
for *business*. I'll be glad to discuss selling you the right
to incorporate my program.
More information about the Comp.lang.c
mailing list