Non-word "accreditate" in /usr/dict/words
Geoff Kuenning
geoff at desint.UUCP
Fri Mar 25 17:07:44 AEST 1988
In article <7481 at brl-smoke.ARPA> gwyn at brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
> In article <1697 at desint.UUCP> geoff at desint.UUCP (Geoff Kuenning) writes:
> -This is because of the optimistic design of spell(1). Spell has a list of
> -suffix rules, which it applies to all words indiscriminately. A suffix
> -that only makes sense on a verb (e.g., -ment) will be applied to nouns,
> -adverbs, and adjectives as well. Thus, for example, spell accepts
> -"sincerement" as well as "sincerly" (I just checked).
>
> Spell (at least the System V version) has a "stop list" that can be
> tweaked to catch common errors such as "sincerly" that slip through
> the net. Not great, but it works.
The stop list dates back at least to V7 spell. Unfortunately, it's not
a solution to this problem. The difficulty is that there are far more
wrong words than right ones. /usr/dict/words lists only "sincere", but
spell will accept "sincerly", "sincerement", "sincereness", "sincered",
"sinceres", and "sincereless" (again, I checked). I admit that some of
these (notably -ment and -less) are not likely typos. However, the "-d"
and "-s" forms are one-keystroke errors, and the "-ness" form can easily
be generated by a person who has momentarily forgotten the word "sincerity".
(Which worries me, BTW: there aren't many words ending in "e" that can legally
have -ity added to them, but spell takes "sincerity"...)
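
To make the over-acceptance concrete, here is a minimal sketch of indiscriminate suffix stripping. It is not spell's actual rule table or code: the suffix list, the one-word dictionary, and the "restore a final e" step are all assumptions chosen only to reproduce the forms listed above.

```python
# Hypothetical suffix list; spell(1)'s real rules are more elaborate.
SUFFIXES = ["ment", "ness", "less", "ity", "ly", "d", "s"]

# Stand-in for /usr/dict/words, which lists only the base form.
STEMS = {"sincere"}


def accepts(word, stems=STEMS, suffixes=SUFFIXES):
    """Accept a word if it is a stem, or a stem plus any suffix.

    No part-of-speech check is made, so every suffix is tried
    against every stem.
    """
    if word in stems:
        return True
    for suf in suffixes:
        if word.endswith(suf):
            base = word[:-len(suf)]
            # Try the base as-is, then with a dropped final "e" restored
            # (assumed rule, needed for "sincerly" and "sincerity").
            if base in stems or base + "e" in stems:
                return True
    return False


if __name__ == "__main__":
    for w in ["sincerly", "sincerement", "sincereness",
              "sincered", "sinceres", "sincereless", "sincerity"]:
        print(w, accepts(w))
```

Under these assumed rules one dictionary entry admits every form in the list, which is the sense in which wrong words outnumber right ones.
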
The end result is that, even if every plausible typo could be placed in the
stop list (a nontrivial task), the list would grow so large that the hashing
scheme would probably begin to break down (though I haven't calculated the
probability of this).
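
For a rough feel of the numbers, here is a sketch of the standard estimate, assuming each stop-list entry is stored only as a uniformly distributed hash code of a fixed width. The function name, the 27-bit width, and the entry counts are illustrative assumptions, not figures measured from any spell implementation.

```python
import math


def false_hit_probability(n_entries, hash_bits):
    """Chance that one word's hash matches at least one of n_entries
    hashed entries, assuming uniform independent hash codes."""
    space = 2 ** hash_bits
    # 1 - (1 - 1/space) ** n_entries, computed stably for a large space.
    return -math.expm1(n_entries * math.log1p(-1.0 / space))


if __name__ == "__main__":
    # Assumed 27-bit codes; the entry counts are arbitrary examples.
    for n in (1_000, 100_000, 10_000_000):
        print(n, false_hit_probability(n, 27))
```

While the list is small relative to the hash space, the result is close to n_entries / 2**hash_bits, so the chance of wrongly flagging a good word grows roughly in proportion to the size of the stop list.
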
--
Geoff Kuenning geoff at ITcorp.com {uunet,trwrb}!desint!geoff