towards a faster isdigit()
Tom Wicklund
wicklund at intellistor.com
Thu May 9 01:44:53 AEST 1991
In <1991May8.030515.7004 at twinsun.com> eggert at twinsun.com (Paul Eggert) writes:
>The traditional implementation of isdigit() in <ctype.h> is typically
>something like this:
> #define isdigit(c) ((_ctype_+1)[c] & 4)
>which requires indexing through a global array followed by a masking
>operation. Why not use the following implementation instead?
> #define isdigit(c) ((unsigned)((c)-'0') < 10)
>This needs just a subtraction followed by a comparison. It's faster on
>all the systems I've tried it on, and is strictly conforming ANSI C.
Two possible portability problems:
1) Numeric digits might not be adjacent character codes in every
character set (though offhand I can't think of one where they
aren't).
2) The (unsigned) cast may not work as intended. I think an
implementation could return abs(x) as the result of an (unsigned)
cast, but I'm not sure. If so, (unsigned)((c)-'0') < 10 would be
TRUE for "c" anywhere in the range '0'-9 .. '0'+9.
Note that the "traditional" implementation isn't completely portable
either, since it assumes that EOF is -1, which isn't necessarily true.
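
Both portability concerns can be checked mechanically on a given
compiler. The sketch below is only an illustration: FAST_ISDIGIT,
wrapped_offset, count_disagreements and self_check are names invented
here, not part of any library. For reference, ANSI C defines the
conversion of a negative int to unsigned as reduction modulo
UINT_MAX+1 (not abs()), and requires the codes for '0' through '9' to
be consecutive, so a conforming implementation should pass these
checks.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

/* The subtraction-based test from the quoted article, renamed
   (hypothetical name) so it does not clash with <ctype.h>. */
#define FAST_ISDIGIT(c) ((unsigned)((c) - '0') < 10)

/* Shows what the cast does to the difference: for c below '0' the
   negative result wraps around to a large unsigned value. */
unsigned wrapped_offset(int c)
{
    return (unsigned)(c - '0');
}

/* Counts the arguments in 0..UCHAR_MAX, plus EOF, on which
   FAST_ISDIGIT and the library isdigit() disagree. */
int count_disagreements(void)
{
    int c, bad = 0;

    for (c = 0; c <= UCHAR_MAX; c++)
        if (FAST_ISDIGIT(c) != (isdigit(c) != 0))
            bad++;
    if (FAST_ISDIGIT(EOF) != (isdigit(EOF) != 0))
        bad++;
    return bad;
}

/* Aborts if either concern shows up on this implementation. */
void self_check(void)
{
    assert(count_disagreements() == 0);
    assert(wrapped_offset('0' - 1) == UINT_MAX);
}
```

Calling self_check() once at startup is enough. If the cast behaved
like abs(), the second assertion would fail, because the offset for
'0'-1 would be 1 rather than UINT_MAX.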
Two possible performance problems:
1) Your example is probably faster in an -if- statement, but slower
in a statement such as:
x = isdigit(y);
2) On machines with complex addressing modes (e.g. 80x86 and 68xxx)
it should be possible to compile the table driven isdigit into a
single instruction which doesn't use an extra register. Your example
is a single instruction but requires a register to hold (c - '0'),
which could mean other code won't optimize as well.
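
Which form wins depends on the compiler and machine, so it is worth
looking at the generated code (many compilers will emit assembly
with the -S option). The sketch below puts both forms into an -if-
context and a value context so they can be compared side by side.
The table, the mask value and all names (class_table, DIGIT_BIT,
TABLE_ISDIGIT, RANGE_ISDIGIT, and the functions) are assumptions made
for illustration, not any vendor's <ctype.h>.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define DIGIT_BIT 4   /* hypothetical class bit for "is a digit" */

/* One slot per unsigned char value, plus slot 0 for an argument of
   -1. This layout assumes EOF is -1, as the traditional macro does. */
static unsigned char class_table[UCHAR_MAX + 2];

/* Must be called once before TABLE_ISDIGIT is used. */
void init_class_table(void)
{
    int c;

    for (c = '0'; c <= '9'; c++)
        class_table[c + 1] |= DIGIT_BIT;
}

#define TABLE_ISDIGIT(c) ((class_table + 1)[c] & DIGIT_BIT)
#define RANGE_ISDIGIT(c) ((unsigned)((c) - '0') < 10)

/* -if- context, table form. */
int count_digits_table(const char *s)
{
    int n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (TABLE_ISDIGIT((unsigned char)*s))
            n++;
    return n;
}

/* -if- context, subtract-and-compare form. */
int count_digits_range(const char *s)
{
    int n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (RANGE_ISDIGIT((unsigned char)*s))
            n++;
    return n;
}

/* Value context, as in "x = isdigit(y);". The table form yields
   0 or DIGIT_BIT, the range form yields 0 or 1. */
int value_table(int y) { int x = TABLE_ISDIGIT(y); return x; }
int value_range(int y) { int x = RANGE_ISDIGIT(y); return x; }
```

Note that the two forms return different nonzero values for a digit,
which is fine for isdigit() but matters to any caller that compares
the result against 1.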
I think the "traditional" implementation is done to be consistent with
macros such as isalpha, which is much more efficient as a table (and
letters aren't adjacent codes in EBCDIC or foreign language ISO/ASCII
based character sets).
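
To make the EBCDIC point concrete: there the lower-case letters fall
in three separate runs (a-i at 0x81-0x89, j-r at 0x91-0x99, s-z at
0xA2-0xA9), so a single subtract-and-compare over 'a'..'z' would also
accept the codes in the gaps. The sketch below counts those false
positives. The table is typed in by hand for illustration and the
function names are hypothetical.

```c
#include <assert.h>

/* EBCDIC codes for the lower-case letters a..z. */
static const unsigned char ebcdic_lower[26] = {
    0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, 0x88, 0x89,  /* a-i */
    0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, 0x98, 0x99,  /* j-r */
    0xA2, 0xA3, 0xA4, 0xA5, 0xA6, 0xA7, 0xA8, 0xA9         /* s-z */
};

/* Returns 1 if code is the EBCDIC code of a lower-case letter. */
int is_ebcdic_lower(int code)
{
    int i;

    for (i = 0; i < 26; i++)
        if (ebcdic_lower[i] == code)
            return 1;
    return 0;
}

/* Counts the codes between EBCDIC 'a' and 'z' that are not letters,
   i.e. the values a single range test would wrongly accept. */
int range_false_positives(void)
{
    int lo = ebcdic_lower[0];
    int hi = ebcdic_lower[25];
    int code, wrong = 0;

    assert(lo < hi);
    for (code = lo; code <= hi; code++)
        if (!is_ebcdic_lower(code))
            wrong++;
    return wrong;
}
```

A range-based isalpha() would need several comparisons to exclude
the gaps, whereas the table lookup costs the same for any character
set.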
Both approaches should work on most modern machines and compilers,
so there's nothing wrong with implementing either.