integer types, sys calls, and stdio
John Bruner
jdb at mordor.UUCP
Thu Jan 31 06:48:54 AEST 1985
Things have quieted down quite a bit since I asked my initial question,
and I should be smart enough to leave things alone, but I guess I'm
not. We gave up and implemented sizeof(int) == sizeof(long) with
integers 36 bits wide, basically because we didn't want to have to
convert the overwhelming mass of existing programs. Snoopy raises
a point which I'd like to expand upon -- the idea of defining
derived types "int8", "int9", "int16" which can be redefined when
a program is moved from one machine to another. I had been doing
some thinking about writing programs for maximum portability
and how the language might be changed to encourage more portable
programs. Here are some of my thoughts on this issue.
By way of introduction, I am not a C novice. I learned C back
in 1977 on a PDP-11/70 V6 UNIX system. I have used it to
program on PDP-11's, VAXes, various Motorola 68K systems, and
now our local machine (the S-1 Mark IIA). The programs have
included user- and kernel-mode UNIX code, among other things.
Most C users are blessed with a machine architecture that
resembles a PDP-11 in several important ways: (1) it is a two's
complement machine, (2) it is byte-addressable (where larger
data types are some power-of-two number of bytes long), (3) it
has an 8-bit byte, (4) it operates most conveniently on primitive
data types which are 16 and 32 bits long, (5) it is not a tagged
architecture, (6) memory is not segmented, but is allocated in
one contiguous block (or perhaps two or three if you count
text/data-bss/stack). Another characteristic which rears its
head from time to time (although less often than the others,
thanks to the popularity of the MC68XXX) is (7) bytes are ordered
in a "little endian" fashion.
Writing truly portable code in C does not come naturally. As we
have discovered here in our efforts to port C and UNIX to the S-1,
a lot of programs break when the machine that they run on does
not satisfy one of the assumptions I noted above. For a mild
example, consider the byte-ordering problem and how it shows
up in programs such as "talk" (to name one example at random).
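
To make the byte-ordering problem concrete, here is a minimal sketch of
one way a program can exchange a 32-bit quantity without depending on
the host's byte order. The names put32 and get32 are hypothetical
(they are not taken from "talk" or any other program), and the choice
of most-significant-octet-first is only an assumption for illustration.
The code uses shifts and masks rather than overlaying a struct or a
pointer cast on the buffer, so it also does not assume an 8-bit char.

```c
#include <assert.h>
#include <stddef.h>

/* Store the low 32 bits of v in buf as four octets, most significant
 * first.  Only shifts and masks are used, so the layout of v in the
 * host's memory never matters. */
void put32(unsigned char *buf, unsigned long v)
{
    assert(buf != NULL);
    buf[0] = (unsigned char)((v >> 24) & 0xFFUL);
    buf[1] = (unsigned char)((v >> 16) & 0xFFUL);
    buf[2] = (unsigned char)((v >> 8) & 0xFFUL);
    buf[3] = (unsigned char)(v & 0xFFUL);
}

/* Rebuild the value from four octets written by put32.  Each element
 * is masked to 8 bits in case char is wider than an octet. */
unsigned long get32(const unsigned char *buf)
{
    assert(buf != NULL);
    return ((unsigned long)(buf[0] & 0xFFU) << 24)
         | ((unsigned long)(buf[1] & 0xFFU) << 16)
         | ((unsigned long)(buf[2] & 0xFFU) << 8)
         |  (unsigned long)(buf[3] & 0xFFU);
}
```

A sender would call put32 before writing the buffer and the receiver
would call get32 after reading it; neither side needs to know how the
other machine orders bytes in memory.
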
Here at the S-1 Project we have two operating systems projects
underway. The other operating system, Amber, is written in
a language called Pastel (a "colorful" Pascal). Pastel has
been significantly extended relative to standard Pascal, so that
it supports separate compilation (by "modules", each of which
may contain public and private parts), pointer manipulation,
flexible argument passing to procedures and functions (i.e.
varying number and type of arguments), good access to low-level
machine instructions (MUCH better than the kludgey "asm" in C),
and it produces excellent code.
From time to time I have occasion to program in Pastel. While
I prefer C, and I often find the Pascal-based syntax a little
clumsy, I definitely miss a few of Pastel's features when I
program in C. (I'll come back to a specific example below.)
C is used to achieve two different ends. It is used to code
machine-dependent routines (e.g. device drivers), and it is used
to write machine-independent programs. Unfortunately, I fear
that too much of its machine-dependent flavor carries over
into programs that are supposed to be machine independent. The
assumptions that I listed above are continually invoked, so that
the resulting program won't go (at least, not easily) to another
machine. Anyone who has tried to port programs written for
the VAX (with implicit "int" == "long" assumptions) to machines
like the PDP-11 knows what I mean.
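
As an illustration of the "int" == "long" assumption, the following
sketch shows two places where it tends to hide: arithmetic done in
"int" whose result only fits in a "long", and a "long" value stored
into an "int". The helper names scaled and narrow_to_int are
hypothetical, and the sketch assumes a compiler that provides
<limits.h>.

```c
#include <assert.h>
#include <limits.h>

/* Multiply in long.  Writing "count * unit" would do the arithmetic
 * in int, which can overflow where int is 16 bits wide even though
 * the same expression works where int and long are the same size. */
long scaled(int count, int unit)
{
    return (long)count * (long)unit;
}

/* Narrow a long to an int, checking the range first.  Where int and
 * long have the same width the check can never fail; where int is
 * narrower it catches values that would otherwise be truncated. */
int narrow_to_int(long v)
{
    assert(v >= (long)INT_MIN && v <= (long)INT_MAX);
    return (int)v;
}
```

For example, scaled(300, 200) yields 60000 on any conforming
implementation, whereas the int-only product would exceed the 32767
limit of a 16-bit int.
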
Having laid forth all of this philosophy, let me give one
specific case and expand upon Snoopy's suggestion. I believe
that C should provide some means of defining integer data types
in terms of the range of values that the type represents, rather
than the machine-dependent size of the storage cell that the type
will occupy. The compiler can pick the correct storage size.
Then "short" and "long" would be reserved for machine-dependent
cases, and machines with larger word sizes can be easily accommodated.
Why should the programmer have to worry about whether his value can
fit in a "short" or whether a "long" will be necessary? I'm not
familiar with Concurrent Euclid (perhaps I should look it up),
but subrange types are an important central concept in Pascal,
Modula, and Ada.
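
Without any language change, something in this direction can be
approximated with the preprocessor. The sketch below is only an
approximation of a subrange type: the name range_1e6 and the checked
conversion to_range_1e6 are hypothetical, and it assumes a compiler
that provides <limits.h>. The program states the range it needs
(here -1000000 through 1000000) and the header, not the programmer,
picks the storage type.

```c
#include <assert.h>
#include <limits.h>

/* Choose the type from the required range, not from a guess about
 * how many bits "int" has on this machine. */
#if INT_MAX >= 1000000 && INT_MIN <= -1000000
typedef int range_1e6;      /* int is wide enough here */
#else
typedef long range_1e6;     /* fall back to long, which always is */
#endif

/* A compiler with real subranges could check assignments itself;
 * here the check has to be written out by hand. */
range_1e6 to_range_1e6(long v)
{
    assert(v >= -1000000L && v <= 1000000L);
    return (range_1e6)v;
}
```

The same header compiles unchanged on machines with 16-bit, 32-bit, or
36-bit "int"; only the typedef that gets selected differs.
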
Please note that I am not proposing any new features for the
ANSI standardization effort. I'm expressing thoughts about
future directions for C. (I don't recall seeing subranges in
the C++ paper in the BSTJ [oops, BLTJ].) I'm not proposing
to turn C into Pascal. Contrary to some of the sentiments
expressed in this group, however, I do feel that C can benefit
from an examination of languages like Pascal.
Finally, let me hedge my way back toward the conservative camp
and pose a question that should be asked in parallel with
"what features does C need?" How can we raise the standards
of C programmers (possibly without *any* language changes) so
that the programs they write will be more portable? If we
don't have explicit subranges, how do we encourage programmers
to define and use things like "int8"? Other portability
considerations should include standardized derived types,
libraries, an understanding of pointers and integers (and why
(int)0 is not the same thing as (int *)0), and other implications
of the variety of machine architectures that C runs on.
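
The difference between (int)0 and a null pointer shows up most
clearly in functions that take a variable number of arguments, where
the compiler cannot convert a plain 0 to a pointer for you. The
following is a minimal sketch; count_args is a hypothetical helper,
written with <stdarg.h>, that counts string arguments up to a
terminating null pointer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Count the string arguments before the terminating null pointer.
 * The terminator must be passed as a pointer, e.g. (char *)0.  A bare
 * 0 is passed as an int, which need not have the same size or
 * representation as a pointer on every machine. */
int count_args(const char *first, ...)
{
    va_list ap;
    int n = 0;
    const char *p = first;

    va_start(ap, first);
    while (p != NULL) {
        n++;
        p = va_arg(ap, char *);   /* each argument is read as a char * */
    }
    va_end(ap);
    return n;
}
```

A call such as count_args("a", "b", (char *)0) is portable; writing
count_args("a", "b", 0) happens to work only where an int zero and a
null char pointer are passed identically.
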
[BTW, a VAX Pastel compiler is available through the ARPA/MILNET by
the anonymous account "ftp", file "pastel.bintape". This file is
in "tar" format. If you don't have ARPANET access, you can contact
Christine Ghinazzi,
S-1 Project, Lawrence Livermore National Laboratory
PO Box 5503
Livermore, CA 94550
for information on obtaining a tape copy. There is no charge.]
--
John Bruner (S-1 Project, Lawrence Livermore National Laboratory)
MILNET: jdb at mordor.ARPA [jdb at s1-c] (415) 422-0758
UUCP: ...!ucbvax!dual!mordor!jdb ...!decvax!decwrl!mordor!jdb