Conformant Arrays in C

Mon Feb 29 17:09:52 AEST 1988

In article <7715 at alice.UUCP>, ark at alice.UUCP writes:
> In article <694 at cresswell.quintus.UUCP>, ok at quintus.UUCP writes:
> > The things that Stroustrup
> > added to C are in themselves fine things, the trouble was that more things
> > should have been left out.
> 
> A challenge: name five things that ``should have been left out.''

The challenge is easily met.  But we have to nail down an ambiguity
in "should have been left out" first.

Given that Stroustrup *wanted* a high-powered language which was upwards
compatible with C, he didn't have any choices at all about what to leave
out:  *nothing* could be left out.

But the greatest virtue of C is that it doesn't get in your way much.
It doesn't have a lot of positive virtues.
This is no reflection on the designers of C.  C wasn't *supposed* to
be an early Ada; it was supposed to be a fairly minimal sort of tool.

So when I said that "more things should have been left out", the
question I was addressing is "what should a successor to C look like,
which tries to stay as comprehensible as C, but which tries to make it
easier to write correct programs?"  I repeat that this is not the
problem that Stroustrup was trying to solve, so my claim that we'd be
better off without some things is not a reflection on Stroustrup.

1.  Integer types such as 'char', 'short', 'long', &c.
    This is KNOWN to be a cause of portability problems.
    It's fine when you are programming a particular machine, and want
    to be certain that things are the size you think they are.
    It's terrible for writing portable programs.

    These days, there is no good reason for a programming language to
    let the machine dictate the range of integers.  (The machine does
    dictate what range of integers is *efficient*, but that's another
    question.)  SUN didn't let the fact that MC68010s have no 32-bit
    multiply instruction dictate to them that int==short, so why should
    the absence of 48-bit integer instructions mean that I can't have
    a variable capable of holding 48-bit integers?  Yes, C++ will let
    me define such a data-type, and will let me define operations on
    it.  But then it's my job, and it should be the compiler's.
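
    As a rough illustration of the portability point, the sketch below
    asks whether a required range fits in 'short' or 'int' on the
    machine at hand.  The helper names are hypothetical, not from the
    post; the only guarantee relied on is the C minimum of -32767..32767
    for both types.  Anything wider has to be checked per machine, which
    is exactly the work a range declaration would hand to the compiler.

```c
#include <limits.h>

/* Hypothetical helpers: can every value in [lo, hi] be held in the
   named type on this particular machine? */
int range_fits_short(long lo, long hi)
{
    return lo >= SHRT_MIN && hi <= SHRT_MAX;
}

int range_fits_int(long lo, long hi)
{
    return lo >= INT_MIN && hi <= INT_MAX;
}
```

    A call such as range_fits_int(0, 100000) gives different answers on
    16-bit and 32-bit machines, so the choice of type cannot be made
    once and for all in portable source.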

2.  Identification of integer and boolean types.
    This is a case of over-specification.  On some machines, it might
    be more efficient to use (< 0)=true,(>=0)=false.  It is also a
    cause of confusion.  Suppose that integer and boolean were
    distinct types.  Then
	if (i = 0) ...
    would be a type error.  Similarly, the quiet conversion of
    pointers and floating-point numbers to boolean (in each case,
    equality to 0 is falsehood and difference from 0 is truth)
    is something we'd be better off without.
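
    A minimal sketch of what the compiler currently accepts.  The
    function names are made up for illustration; the doubled
    parentheses only quieten compilers that warn about an assignment
    used as a condition.

```c
/* Returns 1 if the branch guarded by "i = 0" is taken, else 0.
   The condition assigns 0 to i and then tests the result, so the
   branch is never taken, whatever i held on entry. */
int assignment_as_condition(int i)
{
    if ((i = 0))
        return 1;
    return 0;
}

/* A pointer is quietly treated as a boolean: null is false. */
int pointer_as_condition(const char *p)
{
    if (p)
        return 1;
    return 0;
}

/* So is a floating-point number: 0.0 is false. */
int double_as_condition(double x)
{
    if (x)
        return 1;
    return 0;
}
```

    With a distinct boolean type, all three conditions would be type
    errors, and the comparison would have to be written out.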

3.  Implicit 'int'.
    "register i;" is a legal C declaration, hence legal in C++.

    Undeclared names used as functions are implicitly declared as
    "int foo(...);".

    In fact, to keep the number down, let's count the old-fashioned
    function declaration syntax under this heading too.

4.  Untagged unions.

    Not only does C++ have C's unions, it goes out of its way to make
    it easier to trip over your own feet.  I refer to anonymous unions.

    I could write a hymn of praise to tagged unions, discrimination
    case statements, and polymorphic types, and an execration text
    for untagged unions, but let's keep it down.
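
    For contrast, here is a sketch of the tagging that C leaves the
    programmer to do by hand.  All the names are hypothetical.  The
    point is only that the tag travels with the value and is checked
    before the value is read; with the bare union nothing stops a
    reader from picking the wrong member.

```c
/* Untagged, as in C: nothing records which member was stored last. */
union number {
    long   i;
    double d;
};

enum number_kind { NUMBER_INT, NUMBER_REAL };

/* Hand-made tagged union: the tag is kept next to the value. */
struct tagged_number {
    enum number_kind kind;
    union number     value;
};

struct tagged_number make_int(long i)
{
    struct tagged_number n;
    n.kind = NUMBER_INT;
    n.value.i = i;
    return n;
}

struct tagged_number make_real(double d)
{
    struct tagged_number n;
    n.kind = NUMBER_REAL;
    n.value.d = d;
    return n;
}

/* Stores the integer through out and returns 1 only when the tag
   says an integer is present; otherwise returns 0 and leaves *out
   untouched. */
int get_int(const struct tagged_number *n, long *out)
{
    if (n->kind != NUMBER_INT)
        return 0;
    *out = n->value.i;
    return 1;
}
```

    A language with tagged unions would generate the check in get_int
    itself, and would refuse a case statement that forgot one of the
    alternatives.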

5.  Identification of 'endcase' and 'break'.

    Possibly the most frequent reason for the presence of a label in
    my C code is that C uses the same symbol for "finish case" and
    "finish loop".  This is particularly bewildering, because BCPL
    had different symbols for the two ('endcase' and 'break'), and
    it didn't make the BCPL compiler any more complicated.

    In fact, let's broaden this to the switch statement as a whole.
    A statement like

	switch (i) { int i; { int i; case 1: ...; }
			if (...) case 2: ...;
			    else case 3: ...;
		   }

    is in fact a legal C statement, and from my reading of the C++
    book it is a legal C++ statement.  (The book explicitly says
    that the controlled statement doesn't have to be compound, and
    *may* contain declarations.)
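
    The clash between "finish case" and "finish loop" described at the
    start of this item can be sketched as follows.  The function is a
    made-up example: inside the switch, 'break' can only mean
    'endcase', so leaving the enclosing loop takes a label and a goto.

```c
/* Count the non-space characters that come before the first '.'
   in s (or before the end of the string if there is no '.'). */
int count_until_dot(const char *s)
{
    int n = 0;

    for (; *s != '\0'; s++) {
        switch (*s) {
        case ' ':
            break;      /* "endcase": leaves the switch only */
        case '.':
            goto done;  /* a plain break here would not leave the loop */
        default:
            n++;
            break;
        }
    }
done:
    return n;
}
```

    With separate 'endcase' and 'break' keywords, as in BCPL, the label
    would not be needed.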

6.  C-style arrays.

    I can go into more detail if anyone's interested.
    The problem is that the size of an array is part of its type,
    sort of.  It's all very confusing, really.
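
    One small sketch of the "sort of", using hypothetical names.  The
    size is part of the type where the array is declared, so sizeof can
    recover the element count there.  In a parameter the bound is
    accepted and then ignored, so the callee sees only a pointer and
    the length has to be passed by hand.

```c
#include <stddef.h>

/* Works only where the array's own declaration is visible. */
#define COUNT_OF(arr) (sizeof (arr) / sizeof (arr)[0])

/* The [10] below is accepted but ignored: the parameter is really
   "int *a", and the function trusts n for the true length. */
long sum(int a[10], size_t n)
{
    long   total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}
```

    Calling sum with a 12-element array and n equal to 12 compiles and
    adds up all 12 elements, despite the declared bound of 10.  Nothing
    ties n to the array actually passed, which is how fixed,
    undocumented limits creep into programs.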

SUMMARY.
    Each of the things I have mentioned is an obstacle to the
    writing of reliable and portable programs.  Number 6 in particular
    is responsible for the large number of UNIX utilities which die
    horribly if the input data exceed some undocumented limit (and
    for the rather weird limits that *are* documented, see xargs(1)).
    Each of them is something that Stroustrup had to include in C++
    *if* it was to be upwards compatible with C.  Which is why I
    want to use "D", not "C++".