Selecting a Prog-Lang: Support for C
Tom Roberts
tjr at ihnet.UUCP
Sat May 12 04:51:44 AEST 1984
[WARNING: this article is 246 lines long]
Thanks to those of you who have sent me suggestions why YOU choose
to program in C. Here are my current thoughts on the subject.
This is intended to provide some "intellectual ammunition" to those
unfortunate disciples of C who must justify its use to their management
(or other Higher Authority). Its length is due to the inherent
complexity of the problem of selecting a Programming Language.
I have NOT attempted to include ANY economic justifications, because
they depend strongly upon your organization's history (I am thinking
of such things as: "It would cost N staff-months to re-train our staff
to use Programming-Language X"). Ultimately, such arguments are the
strongest ones to use to convince management; they are also the most
difficult to quantify.
Please send all flames and "ad hominem" attacks to /dev/null; quibbles
about details are best left to mail; reasonable (and reasoned) discussions
about issues raised herein are welcome.
INTRODUCTION:
Here are a few random notes on the choice of a programming-language
for a medium-to-large software project. I assume that the project
will include computers from several manufacturers, ranging from micros
to mainframes, so that only reasonably portable high-level languages
need be considered (FORTRAN, COBOL, PL/I, PASCAL, ADA, and C).
I assume the project is primarily data-management, does not contain
large amounts of floating-point computations, and will involve several
(many) programmers. I consider such languages as LISP and SNOBOL to be
too limited in scope; I consider MODULA-2 to be too new and un-developed
(but interesting....).
CAVEATS:
My personal bias shows: I have written ~50000 lines of FORTRAN
(Elementary Particle Physics), ~250 lines PASCAL (curiosity),
~12000 lines COBOL (data-base system), ~15000 lines C (OS, and
misc applications), ~15000 lines assembler (Z80, CDC-6500, HP2100,
PDP-10, PDP-15, IBM-404 (3-plugboards!)). I have read several
texts on ADA, but have not programmed in it. At present,
I program in C whenever permissible, because it has the
best blend of features and efficiency I desire. I know next to
nothing about PL/I.
ALL OPINIONS EXPRESSED ARE MY OWN.
PORTABILITY:
FORTRAN, PASCAL, and COBOL have a central, portable base;
this often is NOT sufficient to write complete systems
(e.g. the lack of portability in their data-files).
PASCAL is un-usable without some (non-portable) extensions.
ADA is intended to be portable, and complete; this has yet
to be demonstrated. C has a de-facto standard (K&R), and
is portable as a language; the library functions are not always
portable; operation under UNIX is completely portable, if the
original code is written with portability in mind (the same can
almost be said for the other languages - this is more a statement
about UNIX than the language).
PRODUCTIVITY:
FORTRAN and COBOL rate "low" on productivity, because of their lack
of structured-programming features, and their lack of compile-time
constructs (#define/#include/#if); pre-processors (e.g. RATFOR)
can mitigate this. PASCAL rates "high", except for
hardware-dependent programs, strings, and file I/O, where it
has problems. C rates "high", except possibly for large projects, where
poor readability can be a drawback; this is highly dependent upon
the skills and experience of the programming staff. ADA has more
features and complexity than C, and so can be less readable;
it can overload operators and functions, which could make it
either less or more readable; the readability of ADA programs
will probably depend even more upon the skills and experience
of the staff.
PERFORMANCE:
COBOL tends to be slow in execution, mainly due to the type of
applications using it; implementations on small machines (e.g.
Z80) are usually interpreters (VERY slow). FORTRAN is OK, but
the lack of pointers means that data-structures are often implemented
as two-dimensional arrays, with a multiplication for each
reference - this can be intolerable on hardware without
a multiply instruction. PASCAL is OK, if it is compiled;
many implementations are tokenized interpreters (P-code),
with poorer performance; its array-bounds checking (part of the
language) can reduce performance significantly. C has none of these
drawbacks, and can be partially "hand-optimized" by declaring some
variables as "registers"; most C compilers do not optimize code as
well as might be wished. C models the instruction-sets of many computers
very well (especially the modern microprocessor chips like the MC68000,
WE32000, etc.); on such CPUs, C can approach assembly-language
in speed and efficiency. There is little experience with ADA so far;
initial guesses are that "simple" things will be OK, "complicated"
things (e.g. tasks) might not.
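
As a rough illustration of the pointer and "register" points above,
here is a minimal sketch. The record type and function names are
invented for this example, and "register" is only a hint that a
compiler is free to ignore; whether the pointer form is actually
faster depends on the compiler and the hardware.

```c
#include <stddef.h>

/* Hypothetical record type, invented for this illustration. */
struct part {
    int id;
    int weight;
};

/* Index form: each parts[i] implies computing base + i * sizeof(struct part),
 * much like a FORTRAN reference into a two-dimensional array. */
long total_weight_indexed(const struct part *parts, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += parts[i].weight;
    return total;
}

/* Pointer form: the pointer advances by a fixed step on each pass, so the
 * loop body needs no multiplication. parts must point to an array of at
 * least n elements. */
long total_weight_pointer(const struct part *parts, size_t n)
{
    register const struct part *p = parts;
    register const struct part *end = parts + n;
    long total = 0;

    while (p < end) {
        total += p->weight;
        p++;
    }
    return total;
}
```

Both functions return the same sum; a good optimizer may well turn
the first into the second, which is the optimization the "hand-tuned"
version does not rely on.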
PROGRAM COMPLEXITY:
PASCAL and ADA, with their strong-typing, can cause complexity
to increase (e.g. dynamic-memory allocation).
FORTRAN causes complexity because it lacks structures
(this is the MAJOR reason to avoid FORTRAN); call-by-reference can
interact with COMMON in un-expected (and non-portable) ways;
dynamic allocation is very difficult, and usually involves
non-portable operations (e.g. referencing arrays out-of-bounds).
The lack of recursive functions (FORTRAN, COBOL) can seriously
complicate inherently recursive algorithms.
COBOL relies heavily upon global data, making scoping
virtually impossible; its restricted set of statements makes
even simple programs LOOK complex; dynamic allocation is
virtually impossible. C (as defined by K&R) makes it impossible to do
single-precision floating-point arithmetic (float operands are
converted to double-precision); complex
arithmetic is not defined (must be implemented as structures and
functions). C (and to some extent, ADA) can perform low-level
operations (e.g. I/O drivers in UNIX are routinely written in C);
this can greatly improve productivity, complexity, and readability
when such operations must be performed.
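
To show what "implemented as structures and functions" might look
like for complex arithmetic, here is a minimal sketch. The type and
function names are hypothetical, and it assumes a compiler that
allows structures to be passed and returned by value (the original
K&R text did not; with such a compiler you would pass pointers
instead).

```c
/* Hypothetical complex-number type; the names are invented here. */
struct complex_num {
    double re;  /* real part */
    double im;  /* imaginary part */
};

/* (a.re + a.im*i) + (b.re + b.im*i) */
struct complex_num complex_add(struct complex_num a, struct complex_num b)
{
    struct complex_num r;

    r.re = a.re + b.re;
    r.im = a.im + b.im;
    return r;
}

/* (a.re + a.im*i) * (b.re + b.im*i), using i*i = -1 */
struct complex_num complex_mul(struct complex_num a, struct complex_num b)
{
    struct complex_num r;

    r.re = a.re * b.re - a.im * b.im;
    r.im = a.re * b.im + a.im * b.re;
    return r;
}
```

For example, multiplying 1+2i by 3+4i with complex_mul gives -5+10i.
The cost compared with FORTRAN's built-in COMPLEX is that every
operation is a function call rather than an operator.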
SOFTWARE ENGINEERING:
FORTRAN, PASCAL, and COBOL offer little or no help; pre-processing
is essential. C contains its own pre-processor with the most-used
features (#include/#define/#if). PASCAL does not specify separate
compilation - a VERY big drawback. C is specified by a grammar,
which can GREATLY ease the construction of sophisticated language-
processing tools; it also improves the performance of the compiler
(during development, most systems spend more resources compiling
than executing). C can provide basic control of symbol location
(e.g. RAM or ROM), which can simplify symbol-management, and
permits writing ROM-able code (which is inherently non-portable).
ADA is so large that I suspect it will be VERY slow during
program builds; its support environment is also complicated (and,
I suspect, inefficient) - much of this is dictated by the
large-scale systems it is intended to support.
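
For readers who have not used the C pre-processor, here is a small
sketch of the three features named above (#include/#define/#if).
MAX_RECORDS, DEBUG, TRACE and record_slots_free are all names made up
for this example; in practice such switches are usually set on the
compiler command line.

```c
#include <stdio.h>

/* Hypothetical build-time settings; defaults apply when none are supplied. */
#ifndef MAX_RECORDS
#define MAX_RECORDS 100
#endif

#ifndef DEBUG
#define DEBUG 0
#endif

/* TRACE compiles to nothing unless the program is built with DEBUG non-zero. */
#if DEBUG
#define TRACE(msg) fprintf(stderr, "trace: %s\n", (msg))
#else
#define TRACE(msg) ((void)0)
#endif

/* Return the number of free record slots, or -1 if "used" is out of range. */
int record_slots_free(int used)
{
    TRACE("record_slots_free called");
    if (used < 0 || used > MAX_RECORDS)
        return -1;
    return MAX_RECORDS - used;
}
```

The same source then serves both a debugging build and a production
build, with no run-time cost for the tracing in the latter.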
TOOLS:
FORTRAN and COBOL have many existing tools, of varying quality and
portability (many have strong OS dependencies); UNIX tools can be
of reasonable utility. PASCAL has several integrated programming
environments, most of which are reasonably portable. ADA has a
complete, portable environment defined (but un-implemented at
present). C uses UNIX tools quite well, and has some special
language-processing tools (e.g. lex and yacc).
Several major source-handling systems are specifically C-language.
RELIABILITY/MAINTAINABILITY:
This seems to depend more upon the skills and experience of the
programming and design staff, than upon the choice of language.
The more intricate languages (C and ADA) can contain more subtle
"hidden" errors, simply by virtue of their richer syntax; however,
they can also result in shorter programs (i.e. fewer non-commentary
source lines, NCSL).
The lack of structures in FORTRAN is a serious drawback, because
it can make programs un-readable; ditto for use of EQUIVALENCE.
COBOL tends to be so readable that important items are obscured
by the incredible amount of extraneous text (i.e. the "Purloined
Letter" syndrome).
ADDITIONAL COMMENTS:
ADA:
ADA is a new language, with no existing programmer base; it will
take some time to become experienced in ADA programming,
software engineering, and management. It LOOKS very promising,
but it has looked so for so long that I worry that it is really
too complicated, and too difficult to implement. I shy away from
its byzantine complexity - top-notch programmers will have no
serious trouble, but I suspect that "average" programmers will
NEVER come to grips with all of its features/idiosyncrasies (and
they're the ones who will maintain the code).
ADA has tried to do EVERYTHING (numerical analysis, real-time control,
scientific computation, data-base, concurrent programming,
Operating Systems, etc.); and each application-programmer has
to learn the special features designed for everyone else. Much will
probably be written in ADA, but I doubt that many programmers
will voluntarily choose it (their managers will choose it for them).
Training programmers to use ADA will surely require more time and
effort than any of these other languages; I am not yet convinced that
the savings (mainly software engineering issues) will offset this.
If ADA truly becomes a universal, portable language, with a portable
environment to support it, it will probably (and justifiably)
displace the other languages (you CAN program in a tractable
subset...); don't hold your breath.
Strong Typing:
Strong typing is the attribute of a language that assigns a
specific "type" to each entity (variable) in a program, and then
prohibits the mixing of different types. Of the languages discussed,
PASCAL and ADA are strongly-typed, the others are not (FORTRAN,
COBOL and C are "weakly" typed in that some mixtures are legal,
others are not; some type-conversions are automatically supplied).
The advantage is that some programming errors can easily be detected,
because mixing types is often illogical or nonsensical. The
disadvantage is that when you really need to mix types, the
compiler gets in the way, forcing you to do something special
(sometimes un-obvious and non-portable). ADA and C have
(portable) mechanisms to subvert the typing restrictions.
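
One example of such a mechanism in C is the cast. The sketch below
treats an object of any type as a plain sequence of bytes; byte_sum
is a name invented for this illustration. It is written with the
generic pointer type "void *", which is an assumption about the
compiler: older K&R compilers would use "char *" for the same
purpose.

```c
#include <stddef.h>

/* Sum the bytes that make up any object, whatever its declared type.
 * The cast tells the compiler to view the object as unsigned chars. */
unsigned long byte_sum(const void *obj, size_t size)
{
    const unsigned char *p = (const unsigned char *)obj;
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < size; i++)
        sum += p[i];
    return sum;
}
```

A call such as byte_sum(&record, sizeof record) works for any
variable "record". The mechanism is portable; the value returned is
not, since it depends on how each machine lays out the object.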
Portability:
Portability is the attribute of a language that allows a program
written in it to be run, without change, on several (many)
computers. ADA is inherently portable, and C is nearly so;
FORTRAN, COBOL, and PASCAL are NOT portable (they were all intended
to be so, but the implementations fall far below the intentions);
in practice they can sometimes be portable enough.
I feel that portability is VERY important, because the time-scale
of a software system is typically long compared to the time-scale of
current hardware advances, and because many applications inherently
require several different types of computers to cooperate
and act as one system.
SUMMARY:
COBOL is un-suitable because of its poor portability,
its restricted set of operations, and its lack of efficiency in most
implementations. Besides, it is just plain UGLY.
FORTRAN is un-suitable because of its lack of data-structures,
lack of pointers (and dynamic allocation), and poor portability.
PASCAL is un-suitable because separate compilation is not specified,
because its (necessary) extensions are not portable, and because
of the complexity added by its strong typing.
ADA is not suitable because it doesn't exist as a useful language
on a sufficiently-large number of machines.
C has only minor drawbacks compared to the other languages considered.
CAVEATS (revisited):
All opinions expressed are my own. Remember that I have virtually
no experience in PASCAL, and none at all in ADA or PL/I.
Tom Roberts
ihnp4!ihnet!tjr