C Community's Cavalier Attitude On Software Reliability
Richard O'Keefe
ok at goanna.oz.au
Mon Feb 26 17:44:57 AEST 1990
I'm fond of saying "the more I use C, the better I like Ada",
but I really couldn't let Bill Wolfe's message pass unchallenged.
In article <8147 at hubcap.clemson.edu>, wtwolfe at hubcap.clemson.edu (Bill Wolfe) writes:
> Following are some prime examples of why the C community is thought
> of by many as having an unprofessional and irresponsible attitude
> toward software reliability:
> DAB(!1) There is a mysterious bug causing occasional core dumps...
> ...just send mail to the author.
What is dab? How do you know it is written in C?
> FILE(1) It often makes mistakes.
'file' is a UNIX command which tries to guess what kind of information
a file contains. The UNIX file system does not store 'type'
information with files, so it CAN only guess. (One file system I have
used tagged files as PascalSymbol, AplWorkSpace, AlgolCode, and so on,
for over 200 different types. The Macintosh file system does much the
same. Unix _doesn't_.) Current versions of 'file' are controlled by
an /etc/magic file, but you can't tell them about new programming
languages (if anyone has a version of 'file' which looks for keywords
in /usr/lib/vgrindefs, let me know). But the classification task is
in principle impossible to get precisely right: is
(+ 1 1)
(a) Lisp, (b) Scheme, (c) data for a Lisp or Scheme program,
(d) something else?
This has _nothing_ to do with the language that 'file' is written in.
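
A minimal sketch of guessing by leading bytes, in C. The names
magic_entry, magic_table and guess_kind are made up for illustration,
and the table is a hard-coded stand-in, not the real /etc/magic format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table: a leading byte pattern and the guess it suggests. */
struct magic_entry {
    const char *prefix;
    const char *guess;
};

static const struct magic_entry magic_table[] = {
    { "#!",        "script" },
    { "%!",        "PostScript" },
    { "!<arch>\n", "ar archive" },
};

/* Return a guess for the first `len` bytes of a file, or "unknown". */
const char *guess_kind(const char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < sizeof magic_table / sizeof magic_table[0]; i++) {
        size_t n = strlen(magic_table[i].prefix);

        /* Only compare when the file is at least as long as the pattern. */
        if (len >= n && memcmp(buf, magic_table[i].prefix, n) == 0)
            return magic_table[i].guess;
    }
    return "unknown";
}
```

Given the eight bytes of "(+ 1 1)\n" this returns "unknown": no entry
keyed on leading bytes can tell Lisp from Scheme from plain data, and
rewriting the matcher in another language would not change that.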
> JOBS(3J) There is no excuse for the "getwd" routine to be in the
> jobs library. There is even less reason for this routine
> not to have been documented by the author(s) at Berkeley.
That was a temporary hack in 4.1BSD. There is no jobs library in 4.2BSD
or 4.3BSD and getwd() _is_ documented. Again, where is the evidence that
this has something to do with C? (I've seen one OS written in Pascal and
another in Fortran, and have _no_ reason to think them better.)
> PARSEDATE(3) The grammar is incomplete and always will be.
This is another problem which is unsolvable in principle. What is
provided is a useful approximation. The incompleteness has nothing
to do with C as such. (The fact that parsedate was distributed free
over the net _does_ have something to do with C.)
> PUTC(3) Errors can occur long after the call to 'putc'.
This _is_ a property of the programming language C, now that the
stdio library is a standardised part of the language. There are
two parts to the problem:
 -- most implementations of putc() don't check that the file is
    in output mode at the time of call; _that_ I cannot excuse
    because it is easy to implement stdio so that it _does_
    check then and there (the trick is to use *two* counters:
    "how much input left in buffer" and "how much space left
    for output in buffer").
 -- even if the file is in output mode, _physical_ I/O errors
    are not reported immediately.
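
The two-counter trick in the first point can be sketched roughly as
follows. This is a toy, not any real stdio: sk_stream, sk_open, sk_putc
and sk_flush_put are invented names, and the "flush" simply discards the
buffer instead of writing it anywhere.

```c
#include <assert.h>
#include <stddef.h>

#define SK_BUFSIZ 8
#define SK_EOF (-1)

enum sk_mode { SK_READING, SK_WRITING };

struct sk_stream {
    enum sk_mode mode;
    int error;
    size_t in_left;    /* how much input left in buffer */
    size_t out_left;   /* how much space left for output in buffer */
    unsigned char *ptr;
    unsigned char buf[SK_BUFSIZ];
};

void sk_open(struct sk_stream *s, enum sk_mode mode)
{
    assert(s != NULL);
    s->mode = mode;
    s->error = 0;
    s->in_left = 0;    /* nothing has been read yet */
    /* A stream open for reading never has room for output. */
    s->out_left = (mode == SK_WRITING) ? SK_BUFSIZ : 0;
    s->ptr = s->buf;
}

/* Slow path: reached only when out_left is zero. */
int sk_flush_put(struct sk_stream *s, int c)
{
    if (s->mode != SK_WRITING) {
        s->error = 1;          /* wrong mode is reported at the call */
        return SK_EOF;
    }
    /* A real library would write buf out here; the sketch discards it. */
    s->ptr = s->buf;
    s->out_left = SK_BUFSIZ - 1;
    *s->ptr++ = (unsigned char)c;
    return (unsigned char)c;
}

/* Fast path: one comparison, which also covers the mode check. */
int sk_putc(struct sk_stream *s, int c)
{
    if (s->out_left > 0) {
        s->out_left--;
        *s->ptr++ = (unsigned char)c;
        return (unsigned char)c;
    }
    return sk_flush_put(s, c);
}
```

Because out_left is always zero on a stream opened for reading, every
sk_putc() on it falls through to the slow path and fails immediately,
while the common case on an output stream still costs a single test.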
It is worth noting that the second point ALSO applies to UNIX
implementations of Ada. It must, because it is a UNIX problem, not
a C problem. In fact, because of the way UNIX buffers disc i/o,
a physical error may not be detected until after the file is CLOSED,
which is a right pain. Some recent UNIX systems do provide a system
call which can be used to prevent the problem; I am told that it is
costly. I imagine that this problem does not occur under MS-DOS.
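
Since a write error can surface at any later stage, a careful caller has
to check every stage. A hedged sketch in C follows; save_text is a
made-up helper, and the use of fsync() is an assumption: it is one POSIX
call that forces data out to the device, not necessarily the call
referred to above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* expose fileno() and fsync() */
#endif
#include <assert.h>
#include <stdio.h>
#include <unistd.h>   /* fsync(); POSIX, not ISO C */

/* Write `text` to `path`. Returns 0 on success, -1 if any stage failed. */
int save_text(const char *path, const char *text)
{
    FILE *fp;
    int ok = 1;

    assert(path != NULL && text != NULL);
    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    while (*text != '\0') {
        if (putc(*text++, fp) == EOF) {   /* may succeed yet fail later */
            ok = 0;
            break;
        }
    }
    if (ok && fflush(fp) == EOF)          /* stdio buffer -> kernel */
        ok = 0;
    if (ok && fsync(fileno(fp)) != 0)     /* kernel buffers -> device */
        ok = 0;
    if (fclose(fp) == EOF)                /* always close, and check it */
        ok = 0;
    return ok ? 0 : -1;
}
```

The same four checks would be needed from any language whose run-time
sits on top of the same buffered file system.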
> SCANF(3S) The success of literal matches and suppressed
> assignments is not directly determinable.
Scanf() is one of my pet hates. It is surprisingly slow as well.
> SIN(3M) The value of 'tan' for arguments greater than about 2**31
> is garbage.
I am indebted to Bill Wolfe for pointing this one out; I usually go by
the SVID (which is quite explicit that for partial loss of significance
as much as possible of the result must be returned and that for total
loss of significance a message is printed and 0 returned; in either case
errno is set to ERANGE), or by a draft of the ANSI C standard (which
does not license anything like this), or by SunOS documentation.
The Encore manual page does contain this message, which apparently pertains
to some 4.2BSD implementations, and it had never occurred to me that it
might.
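
To make "loss of significance" concrete, here is a rough sketch in C. It
assumes IEEE-style binary doubles, which is not the arithmetic of the
machines 4.2BSD ran on. spacing_near and tan_arg_is_meaningless are
made-up names, and the "gap of at least one period" threshold is my own
illustration, not the SVID's rule.

```c
#include <assert.h>
#include <float.h>

/* Gap between adjacent doubles around x, for finite x >= 1.0
 * (assumes a binary floating-point format). */
double spacing_near(double x)
{
    double gap = DBL_EPSILON;   /* gap just above 1.0 */

    assert(x >= 1.0 && x <= DBL_MAX);
    while (x >= 2.0) {
        x /= 2.0;               /* halving is exact in binary */
        gap *= 2.0;
    }
    return gap;
}

/* Nonzero when adjacent doubles near x are at least one period of tan
 * apart, so the argument no longer determines where in the period it
 * falls (total loss of significance, in this sketch's sense). */
int tan_arg_is_meaningless(double x)
{
    const double period = 3.14159265358979323846;

    if (x < 0.0)
        x = -x;
    if (!(x <= DBL_MAX))        /* infinity or NaN */
        return 1;
    if (x < 1.0)
        return 0;
    return spacing_near(x) >= period;
}
```

Under these assumptions the gap near 2**31 is about 4.8e-7, so an
argument of that size still pins down the angle quite well; the
criterion above only trips at 2**54 and beyond.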
It's worth pointing out that there is nothing in Ada to prohibit an
Ada implementation doing this too: tan() is _not_ an official part
of Ada. It would be worth checking what Ada systems do on 4.2BSD...
Yes, I know about the Ada Numerics Working Group. I am really
enthusiastic about their definition of the trig functions, and that
definitely really does give you something you can _trust_. I would
be _very_ happy for that to be cited in an appendix to the C
rationale as a worthy guide for C implementors (too late to affect
the standard).
By the way, is the version of tan() in question written in C, or is
it written in assembler? It makes a difference to the argument!
> CTAGS(1) ...if you have two Pascal procedures in different blocks
> with the same name, you lose.
The whole idea behind a 'tags' file is that there is only one occurrence
of any given name as a routine name in the given cluster of files. This
specific problem is not a problem with 'ctags' as such, but a consequence
of the fact that the data structure it is required to construct simply
cannot express this situation.
There _are_ serious problems with ctags. In particular, because it
operates on the raw text instead of preprocessing and parsing it, the
following example
    #if 0
    #define foo(x,y) (((x)+(y))/2)
    #else
    double foo(x, y)
    ...
    #endif
will be misunderstood: the first occurrence of 'foo' will be mistaken for
the definition and the real definition will be reported as an error.
Another problem is that 'ctags' and 'file' use different heuristics for
guessing the language of a file.
The fundamental problem is that 'ctags' is trying to do something which
is not in principle doable, thanks to the existence of the preprocessor.
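
The failure is easy to reproduce with any scanner that works on raw
text. The sketch below is not how ctags is implemented;
first_apparent_definition is an invented helper that, like any
text-only tool, knows nothing about #if.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the index of the first line in which `name` appears immediately
 * followed by '(', or -1 if there is none. Preprocessor conditionals are
 * ignored, and only the first occurrence on each line is examined. */
long first_apparent_definition(const char *const lines[], size_t n,
                               const char *name)
{
    size_t i, len;

    assert(lines != NULL && name != NULL);
    len = strlen(name);
    for (i = 0; i < n; i++) {
        const char *p = strstr(lines[i], name);

        if (p != NULL && p[len] == '(')
            return (long)i;
    }
    return -1;
}
```

Fed the lines "#if 0", "#define foo(x,y) (((x)+(y))/2)", "#else",
"double foo(x, y)", "#endif", it picks line index 1, the macro inside
the dead branch, rather than the real definition at index 3.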
> EMACS(1) Not bloody likely.
This tells us nothing about C. (Particularly when you consider that
Emacs variants have been written in Teco, Lisp, PL/I, Pop, and C, to
name but a few.)
> TC(1) The aspect ratio option is unbelievable.
What _is_ tc? We haven't it here, and it's not in the SVID.
How do you know it is in C?
> UNITS(1) Don't base your financial plans on the currency conversions.
This has nothing to do with C. It's a straightforward consequence of the
fact that the 'money' part of /usr/lib/units was fixed at 16-June-1980.
The currency conversions are just fine for that day. Since the 'units'
program is driven by a readable source file, if you want current figures,
plug 'em in.
The issue here is not a LANGUAGE issue but a DESIGN issue.
Shall we
a) not provide any currency conversion information at all?
b) provide currency conversion correct at a particular date,
but with no _automatic_ means of updating it?
c) provide currency conversion updated daily? From what source?
Exactly the same choices would confront an Ada user, and exactly
the same considerations (how can we get at currency figures _portably_)
might lead to the same choice.
(Since the introduction of the 'news' software, it would make sense to
have a comp.currency newsgroup with daily updates...)
> BBEMACS(1) I tinker too much, so occasionally I mess up and it
> don't work no good. But then I fix it, hooray.
This is pretty shocking, but what has it got to do with C?
The very *worst* publicly available software I have ever seen was
written in Pascal.
> When is the C community going to clean up its act???
What C community? Wolfe cited some utilities & libraries from 4.2BSD;
does he mean Berkeley? He cited some programs I've never heard of
(dab, tc, bbemacs); does he mean their authors? He cited some functions
from the C library, most of which HAVE been cleaned up.
The really important thing is that Wolfe has failed to show (and didn't
even _try_ to show) that there was any causal connection between the
defects of these programs and the use of C. Remember Sturgeon's Law:
90% of _everything_ is crud.
I repeat, the more I use C, the better I like Ada. (And Eiffel.)
(No, _especially_ Eiffel.) But Wolfe needs a better argument.
> It appears that there is a real need to publicize software engineering
> concepts throughout the C community, both directly through software
> engineering education
I agree 100% with this, except that I'm not convinced that there IS a
single "C community" whose act needs to be cleaned up.
What we really want is a large number of low-cost (< US$300, say)
Ada compilers scattered around enough so that people start contributing
Ada sources to the net. Then we'll get a chance to see how much of a
difference the language makes. In the meantime, we have to make do
with what we've got.