Conformant Arrays in C
Richard A. O'Keefe
ok at quintus.UUCP
Mon Feb 29 17:09:52 AEST 1988
In article <7715 at alice.UUCP>, ark at alice.UUCP writes:
> In article <694 at cresswell.quintus.UUCP>, ok at quintus.UUCP writes:
> > The things that Stroustrup
> > added to C are in themselves fine things, the trouble was that more things
> > should have been left out.
>
> A challenge: name five things that ``should have been left out.''
The challenge is easily met. But we have to nail down an ambiguity
in "should have been left out" first.
Given that Stroustrup *wanted* a high-powered language which was upwards
compatible with C, he didn't have any choices at all about what to leave
out: *nothing* could be left out.
But the greatest virtue of C is that it doesn't get in your way much.
It doesn't have a lot of positive virtues.
This is no reflection on the designers of C. C wasn't *supposed* to
be an early ADA, it was supposed to be a fairly minimal sort of tool.
So when I said that "more things should have been left out", the
question I was addressing is "what should a successor to C look like,
which tries to stay as comprehensible as C, but which tries to make it
easier to write correct programs?" I repeat that this is not the
problem that Stroustrup was trying to solve, so my claim that we'd be
better off without some things is not a reflection on Stroustrup.
1. Integer types as 'char', 'short', 'long' &c.
This is KNOWN to be a cause of portability problems.
It's fine when you are programming a particular machine, and want
to be certain that things are the size you think they are.
It's terrible for writing portable programs.
These days, there is no good reason for a programming language to
let the machine dictate the range of integers. (The machine does
dictate what range of integers is *efficient*, but that's another
question.) SUN didn't let the fact that MC68010s have no 32-bit
multiply instruction dictate to them that int==short, so why should
the absence of 48-bit integer instructions mean that I can't have
a variable capable of holding 48-bit integers? Yes, C++ will let
me define such a data-type, and will let me define operations on
it. But then it's my job, when it should be the compiler's.
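A minimal sketch of the checking a portable program ends up doing by hand, assuming an ANSI-style <limits.h> is available. The helper names long_holds_48_bits and fits_in_int are hypothetical, chosen only for illustration.

```c
#include <limits.h>

/* Hypothetical helper: is this implementation's `long` wide enough
   for 48-bit values?  Shifting LONG_MAX right by 16 keeps the test
   from overflowing where long is only 32 bits. */
static int long_holds_48_bits(void)
{
    return (LONG_MAX >> 16) >= 0x7FFFFFFFL;
}

/* Hypothetical helper: would `v` survive being stored in an `int`
   on this particular machine?  The answer varies between machines. */
static int fits_in_int(long v)
{
    return v >= (long)INT_MIN && v <= (long)INT_MAX;
}
```

The only portable guarantee is that fits_in_int(32767L) holds; anything larger depends on the machine, which is the complaint.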
2. Identification of integer and boolean types.
This is a case of over-specification. On some machines, it might
be more efficient to use (< 0)=true,(>=0)=false. It is also a
cause of confusion. Suppose that integer and boolean were
distinct types. Then
    if (i = 0) ...
would be a type error. Similarly, the quiet conversion of
pointers and floating-point numbers to boolean (in each case,
equality to 0 is falsehood and difference from 0 is truth)
is something we'd be better off without.
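To make the confusion concrete, here is a small sketch; the function names are hypothetical. The first function is accepted as legal C even though the author presumably meant a comparison (many present-day compilers will at least warn about it).

```c
/* Hypothetical example: the author meant to test `i == 0`. */
static int is_zero_buggy(int i)
{
    if (i = 0)      /* assigns 0 to i; the resulting 0 counts as false */
        return 1;   /* never reached */
    return 0;       /* always reached, whatever was passed in */
}

/* What was intended. */
static int is_zero(int i)
{
    return i == 0;
}
```

With a distinct boolean type the first function would be rejected at compile time instead of silently returning 0 for every argument.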
3. Implicit 'int'.
"register i;" is a legal C declaration, hence legal in C++.
Things used as functions are implicitly "int foo(...);".
In fact, to keep the number down, let's include old-fashioned
function declaration syntax.
4. Untagged unions.
Not only does C++ have C's unions, it goes out of its way to make
it easier to trip over your own feet. I refer to anonymous unions.
I could write a hymn of praise to tagged unions, discrimination
case statements, and polymorphic types, and an execration text
for untagged unions, but let's keep it down.
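For comparison, a sketch of the tagged union a C programmer has to build by hand. All names here (shape_kind, shape, make_square, make_rect, shape_area) are hypothetical. Nothing in the language checks that the tag and the member actually in use agree.

```c
enum shape_kind { SHAPE_SQUARE, SHAPE_RECT };

struct shape {
    enum shape_kind kind;           /* the tag, maintained by hand */
    union {
        long side;                  /* meaningful when kind == SHAPE_SQUARE */
        struct { long w, h; } rect; /* meaningful when kind == SHAPE_RECT */
    } u;
};

static struct shape make_square(long side)
{
    struct shape s;
    s.kind = SHAPE_SQUARE;
    s.u.side = side;
    return s;
}

static struct shape make_rect(long w, long h)
{
    struct shape s;
    s.kind = SHAPE_RECT;
    s.u.rect.w = w;
    s.u.rect.h = h;
    return s;
}

/* The "discrimination case statement", written as an ordinary switch. */
static long shape_area(struct shape s)
{
    switch (s.kind) {
    case SHAPE_SQUARE: return s.u.side * s.u.side;
    case SHAPE_RECT:   return s.u.rect.w * s.u.rect.h;
    }
    return -1;                      /* unknown tag */
}
```

Reading s.u.side from a shape built by make_rect compiles without complaint, which is the hazard of leaving the tag to the programmer.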
5. Identification of 'endcase' and 'break'.
Possibly the most frequent reason for the presence of a label in
my C code is that C uses the same symbol for "finish case" and
"finish loop". This is particularly bewildering, because BCPL
had different symbols for the two ('endcase' and 'break'), and
it didn't make the BCPL compiler any more complicated.
In fact, let's broaden this to the switch statement as a whole.
A statement like
    switch (i) { int i; { int i; case 1: ...; }
        if (...) case 2: ...;
        else case 3: ...;
    }
is in fact a legal C statement, and from my reading of the C++
book it is a legal C++ statement. (The book explicitly says
that the controlled statement doesn't have to be compound, and
*may* contain declarations.)
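The label problem can be sketched as follows; count_until_semicolon is a hypothetical function made up for the illustration. Inside the switch, 'break' can only mean "finish case", so leaving the enclosing loop takes a goto.

```c
/* Hypothetical scanner: count the non-blank characters that appear
   before the first ';' in s. */
static int count_until_semicolon(const char *s)
{
    int n = 0;

    for (; *s != '\0'; s++) {
        switch (*s) {
        case ' ':
            break;          /* "endcase": leaves the switch only */
        case ';':
            goto done;      /* a plain break here would not leave the loop */
        default:
            n++;
            break;
        }
    }
done:
    return n;
}
```

Replacing the goto with break compiles just as happily and quietly keeps scanning past the ';'.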
6. C-style arrays.
I can go into more detail if anyone's interested.
The problem is that the size of an array is part of its type,
sort of. It's all very confusing, really.
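A short sketch of the "sort of", with hypothetical names (sum, sum_of_six). Where an array is declared, its size is part of its type and sizeof sees it. Once the array is passed to a function the size is gone, and a bound written in the parameter is ignored.

```c
#include <stddef.h>

/* The 4 in the parameter is ignored: `a` is really an `int *`, so
   the element count has to be passed separately. */
static long sum(int a[4], size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

static long sum_of_six(void)
{
    int six[6] = {1, 2, 3, 4, 5, 6};
    /* Here the size is part of the type, so sizeof covers all six. */
    size_t n = sizeof six / sizeof six[0];

    return sum(six, n);     /* accepted although the parameter says [4] */
}
```

Nothing stops a caller from passing the wrong n, which is how fixed-size buffers end up being overrun.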
SUMMARY.
Each of the things I have mentioned is an obstacle to the
writing of reliable and portable programs. Number 6 in particular
is responsible for the large number of UNIX utilities which die
horribly if the input data exceed some undocumented limit (and
the rather weird limits that *are* documented, see xargs(1)).
Each of them is something that Stroustrup had to include in C++
*if* it was to be upwards compatible with C. Which is why I
want to use "D", not "C++".