Efficient STRing CoMPares?
Chris Torek
torek at elf.ee.lbl.gov
Fri Mar 22 10:45:12 AEST 1991
[This is a correction to a previous article, which should have been
stamped out before you read it, because of the Supersedes header, but I
suspect there are news sites out there that do not implement
Supersedes. The correction is just an `='-for-`-' typo caused by
runaway fingers. Thanks to David Barnett for catching it early.]
In article <1151 at gistdev.gist.com> flint at gistdev.gist.com
(Flint Pellett) writes:
>... I'm curious if anyone has a method that actually speeds up strcmp()
>in the situtation where you have to get the lexicographic ordering
>information back.
For a particular application (my own personal version of Emacs, actually),
I found that comparing the first two characters `inline' gave the best
results.
/*
* Find an entry in a table. Create should be true iff we should create
* a new entry if the name is not in the table. The names of any new
* entries are saved with savestr().
*/
struct tentry *
FindEntry(t, name, create)
struct table *t;
register char *name;
int create;
{
register struct tentry **ents, *te;
register int i, h, m, l;
if (name == NULL)
return (NULL);
if (t->t_sorted <= 0)
SortEntries(t);
ents = t->t_ents;
/* binary search */
h = t->t_size - 1;
l = 0;
while (h >= l) {
te = ents[m = (h + l) >> 1];
/* first two characters done inline for speed */
if ((i = te->te_name[0] - name[0]) == 0 && name[0] != 0) {
if ((i = te->te_name[1] - name[1]) == 0)
i = strcmp(te->te_name + 1, name + 1);
}
if (i > 0)
h = m - 1;
else if (i < 0)
l = m + 1;
else /* found */
return (te);
}
if (create) {
if ((name = savestr(name)) != NULL &&
(te = insert(t, name, &ents[l])) != NULL)
return (te);
if (name != NULL)
free(name);
error("out of memory allocating entry in %s", t->t_name);
}
return (NULL);
}
SortEntries uses a clever trick: it does an insertion sort if the
table is `almost' sorted, otherwise it does a quicksort.
Much of this speed-hacking is due to *measured* results showing that my
Emacs spent most of its `startup' time building tables and reading the
current directory.
--
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA Domain: torek at ee.lbl.gov
More information about the Comp.lang.c
mailing list