Refined C
Hank Dietz
hankd at pur-ee.UUCP
Fri Feb 26 06:28:00 AEST 1988
Several folk have requested that I post a brief description
of refined C... note that conformant arrays are a subset of
paramtypes.
Refined C (RC)
H. Dietz
Refined C is one of the fruits of my thesis work. To
begin with, I studied conventional C to determine which
language constructs unnecessarily blur flow analysis for
automatic parallelization. Among the worst of these are
interprocedural analysis problems in supporting separate
compilation, the inability to precisely name a portion of a
structure (pointers do this, but not in a way the compiler
can understand), and the need to override the type system in
solving common problems. Refined C - RC - is the result of
refining C so that pretty much the same things can be done
with pretty much the same syntax (less than 5% change), but
without confusing the compiler.
1. Language Overview
The following is a very brief description of how the
current definition of RC differs from ANSI C:
1.1. Access Right Specifications
Function prototypes are required in RC, and have a syn-
tax very similar to that used in ANSI C, however, one addi-
tionally specifies access rights being transmitted through
pointer arguments and access rights for global data. The
symbol "<" is used to indicate read-only, ">" means write-
then-read, "^" means modify, and "-" means no access. For
example:
{ < x; } int f( ^ char *p);
where x is a global variable, means that f is a function
which returns an int value, has modify access to characters
through its argument p, and has read-only access to the
value of the global variable x. Violation of specified
access rights is a fatal compile-time error. Further, no
function can call another function which has access rights
superior to those of the caller.
1.2. Partitioning
Partitioning allows one to refer to (and specify access
rights on) arbitrary non-overlapping pieces of data struc-
ture. Given a declaration like:
struct s { int a, b; } c, d[100];
it is obvious that c, for example, can be partitioned into
c.a and c.b; however, one can also partition d into d.a and
d.b, each of which acts much like an array of 100 int.
Further, it is possble to create new names for arbitrary
collections of elements from an array:
part(d[i], e, f(i), g);
would make e a name for the elements d[i] such that f(i) is
true, and would make g a name for remaining elements of d.
Partitioning need not imply making a copy or even evaluating
the separating conditions - it simply serves to inform the
compiler that accesses made through the name e are disjoint
from those made using the name g, and it provides a formula
for confirming this fact. Knowing that d[j] and d[k] are
provably non-overlapping data is key to good automatic
parallelization - by turning this into e[j] and g[k] parti-
tions efficiently provide such a proof. Of course, the par-
titions of a struct or array can be logically subdivided
further.
1.3. Paramtypes
A paramtype is a variable-size parameterized type, each
instance of which is tagged with the values of its parame-
ters (and these values can be examined as though they were
elements of a struct). For example, a variable-length
string type might be:
struct string(length) { char letters[length]; };
to allocate a string named s which could hold i chars, one
would simply write:
struct string(i) s;
One could then reference s.length and s.letters. Provided
that there is at most one variable-length field, the com-
piler would typically sort that to the last position in the
struct and addressing would work just like the current C
"hack" of declaring a struct and malloc-ing extra space
after it for the last field to grow into (e.g., declaring
letters[1]). The big difference is that paramtypes are
full-fledged types, hence you can have a function return a
paramtype, take sizeof a paramtype, etc., just as though it
were an ordinary struct. Paramtypes also permit multiple
parameters and multiple variable-sized fields, although
accesses to members of such paramtypes will typically be
slower because variables are involved in the address calcu-
lations.
Although, in retrospect, the type of a parameter should
also be specified in the declaration, the current definition
of RC assumes that they're all ints.
2. Status
A complete optimizing/parallelizing RC compiler has
been built, by K. Stein, but it is owned by the company
which funded it. In addition, we have a prototype tool
called CP which, using rather complex and slow analysis,
converts ordinary C code into reasonable (but not great) RC
code; CP is in the public domain. At Purdue, we are working
on another tool called CR, to interactively guide the pro-
grammer in improving RC code. There are also
definitions/research efforts in RF (refined FORTRAN), RP
(refined Pascal), RL (refined Lisp), and RA (refined Ada).
3. Annotated References
[1] H. Dietz and D. Klappholz, "Refined C: A Sequential
Language For Parallel Programming," 1985 International
Conference on Parallel Processing, August 1985, pp.
442-449.
Gives original (pre-ANSI) definition of RC.
[2] H. Dietz and D. Klappholz, "Refined FORTRAN: Another
Sequential Language for Parallel Programming," 1986
International Conference on Parallel Processing, August
1986, pp. 184-191.
Gives definition of refined FORTRAN (RF).
[3] H. Dietz, The Refined-Language Approach to Compiling
for Parallel Supercomputers, Ph.D. Thesis, Polytechnic
University, June 1987.
Most complete review of post-ANSI RC and concepts, as
well as compilation techniques. Also available as a
Purdue University internal report and soon (hopefully)
as a published monograph incorporating compiler imple-
mentation notes by K. Stein.
More information about the Comp.lang.c
mailing list