Signed char - What Foolishness Is This!
guy at sun.UUCP
Sat Oct 18 05:37:49 AEST 1986
> 1) Do other C compilers make 'char' a signed quantity by default?
Yes. Lots and lots of them, including the very first C compiler ever
written (if there was an earlier one, Dennis, let me know...) - the PDP-11 C
compiler.
> 2) What possible justification is there for this default?
1) When the PDP-11 C compiler was written, ASCII characters *were* 7-bit
characters, and there was no general use of 8-bit characters, and 2) the
PDP-11 treated bytes as signed, rather than unsigned, so treating ASCII
characters as unsigned rather than signed cost some time and bought you
nothing. I suspect Microsoft did this to make less-than-portable code
written for PDP-11s and VAXes work on 8086-family machines without change.
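
A minimal sketch of the behavior in question, assuming 8-bit bytes and a
two's-complement machine. The function names are hypothetical and chosen
only for illustration; "signed char" is the proposed keyword discussed
further down, and CHAR_MIN comes from <limits.h>.

```c
#include <limits.h>

/* Widen a byte to int the way plain "char" does on this implementation.
   Whether the result can be negative is up to the compiler. */
int widen_plain(char c)
{
    return c;
}

/* Widen a byte that is explicitly signed: the value is sign-extended. */
int widen_signed(signed char c)
{
    return c;
}

/* Widen a byte that is explicitly unsigned: the value is zero-extended. */
int widen_unsigned(unsigned char c)
{
    return c;
}

/* Returns nonzero if plain "char" is signed on this implementation. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

For 7-bit ASCII the three widening functions agree, which is why the signed
default cost nothing at the time. They diverge only for bytes with the high
bit set: under the assumptions above, a byte with the bit pattern 0xFC
widens to 252 through widen_unsigned() and to a negative value through
widen_plain() on a compiler where plain "char" is signed.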
> Is not 'char' primarily a logical (as opposed to mathematical) quantity?
Yes, but the people to complain to here are ultimately the designers of the
PDP-11 (although a lot of string manipulation on PDP-11s could be done using
unsigned characters without much penalty).
> I can understand the desirability of allowing 'signed char' for gonzo
> programmers who won't use 'short',
It's not a question of "gonzo programmers who won't use 'short'". There are
times where you absolutely *must* have a one-byte number in a structure;
"short" just won't cut it here. (Bit fields would, perhaps, except that you
can't take the address of a bit field.) Structures representing device
registers, or representing fields in other externally-specified data, are an
example of this. Also, if you have a *huge* array of integers in the range
-127 to 128, you may take a significant performance hit by using "short"
rather than "char" (remember, "short" takes twice the amount of memory that
"char" does on most implementations).
> or who want to risk future compatibility of their code on the bet that
> useful characters will always remain 7-bit entities.
They're risking nothing. "signed char" is a gross way of saying "short
short int", not a way of saying "signed character" (which, as you say, is
meaningless). Unfortunately, C originally didn't have "short" or "long",
and when they were added they did not cascade.
I presume, by the way, that "isupper(<u-umlaut>)" is intended to return 0
and "isupper(<U-umlaut>)" is intended to return 1. If Microsoft didn't put
the extended character set into the "ctype" tables, the way that the
indexing is done is irrelevant.
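
The indexing issue can be sketched like this, assuming a compiler where
plain "char" may be signed. The wrapper names are hypothetical. Converting
to "unsigned char" before calling isupper() keeps the argument non-negative,
so it is a valid index; whether an extended character such as U-umlaut is
then classified as upper case still depends on what the implementation put
in its "ctype" tables.

```c
#include <ctype.h>
#include <stddef.h>

/* Classify a plain char without ever handing isupper() a negative value. */
int safe_isupper(char c)
{
    return isupper((unsigned char)c) != 0;
}

/* Count the upper-case characters in a NUL-terminated string. */
size_t count_upper(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++)
        if (safe_isupper(*s))
            n++;
    return n;
}
```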
--
Guy Harris
{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
guy at sun.com (or guy at sun.arpa)