NOT Educating FORTRAN programmers to use C

Wed Jan 10 14:52:56 AEST 1990

From article <646 at chem.ucsd.EDU>, by tps at chem.ucsd.edu (Tom Stockfisch):
> [...]                       Apart from Cray work, I have found I can write
> things to run faster or at least the same speed in C.

Then, whoever wrote the Fortran compiler on those machines was either
incompetent or (more likely) he foolishly used the same 'backend' as
the C compiler.  C is inherently slower because extensive pointer use
inhibits optimizations.  The optimizations most affected are pipelining
and vectorization (so now you see why Fortran still beats C on the Cray).
Most C compilers don't even bother trying certain optimizations because
the compiler writer felt the time would be wasted since the presence
of pointers would usually inhibit the optimization anyway.
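
A minimal sketch of the aliasing problem, for anyone who wants to see it
concretely.  The function name add_bias and its parameters are made up
for illustration; the point is only that the compiler may not load *bias
once before the loop unless it can prove that out[] never overlaps it.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Adds *bias to every element of in[] and stores the result in out[].
 * Because out, in and bias are plain pointers, the compiler must assume
 * that a store to out[i] may change *bias, so it has to reload *bias on
 * every iteration unless it can prove there is no overlap.
 */
void add_bias(double *out, const double *in, const double *bias, size_t n)
{
    size_t i;

    assert(out != NULL && in != NULL && bias != NULL);
    for (i = 0; i < n; i++)
        out[i] = in[i] + *bias;
}
```

Called in place on {1, 2, 3, 4} with bias pointing at element 0, this
yields {2, 4, 5, 6}, because the bias changes after the first store.
With a separate bias of 1.0 it yields {2, 3, 4, 5}.  Both calls are legal
C, so the generated code has to honor both.  Fortran forbids this kind of
overlap between dummy arguments when one of them is modified, which is
what lets its compilers keep the value in a register.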

Note: one other small point - at least in principle, Fortran should
be faster at character manipulation than C.  This is because Fortran
character type can keep explicit lengths of all strings and, therefore,
omit prescanning.  A C user can go out of the way to keep string
lengths too - but usually doesn't.  The only built-in string feature
in the C language is the string constant - which uses the less efficient
null termination method for string lengths.
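
Here is a rough sketch of what "going out of the way to keep string
lengths" might look like in C.  The type counted_str, the capacity
CS_CAPACITY and both function names are hypothetical, chosen only for
this example; nothing like them is built into the language.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CS_CAPACITY 64          /* hypothetical fixed capacity */

struct counted_str {
    size_t len;                 /* number of characters in use */
    char   text[CS_CAPACITY];   /* not null-terminated */
};

/* Build a counted string from a C string; this is the only scan needed. */
int cs_from_cstr(struct counted_str *s, const char *src)
{
    size_t n = strlen(src);

    if (n > CS_CAPACITY)
        return -1;              /* does not fit in the fixed buffer */
    memcpy(s->text, src, n);
    s->len = n;
    return 0;
}

/* Append src to dst using the stored lengths: no search for a '\0'. */
int cs_append(struct counted_str *dst, const struct counted_str *src)
{
    assert(dst->len <= CS_CAPACITY);
    if (src->len > CS_CAPACITY - dst->len)
        return -1;              /* would overflow the fixed buffer */
    memcpy(dst->text + dst->len, src->text, src->len);
    dst->len += src->len;
    return 0;
}
```

Compare cs_append with strcat(), which must first walk the destination
to find its terminator before it can copy anything.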

Aside from the above two issues Fortran and C are identical with
respect to optimization.  A C program which uses no pointers should
(at least in principle) not suffer any inhibitions to optimizations.
How common such C programs are (and how common such a style might
become) I leave to your judgement.

> [...]
>>Most of your definition of 'crap' in Fortran is the style of the programmer.
> 
> But there is no way to write a loop in Fortran without using statement labels.
> And misspelled variable references result in definitions of new variables
> initialized to zero.

Yes, and the same sort of errors (a slip of the finger while typing) can
cause serious problems in C as well.  Besides, most Fortran compilers
now have an IMPLICIT NONE statement which will cause the compiler to
require declaration of all variables and procedures used in the body
of the program (we are still lobbying X3J3 to require IMPLICIT NONE
in Fortran 8x).

As for loops without labels, Fortran 8x _will_ have a complete complement
of label-less flow control constructs.  I doubt that any future C will
have pointerless array manipulation though (that's one of the differences
between the two languages - one is striving to improve, the other thinks
it's already perfect).

> [...]
>>I can generate 4x that type of 'crap' in C.
> 
> I agree.  Getting an old f77 type to switch to C must be accompanied by
> teaching them how to write reasonable code.

If reasonable coding is a prerequisite for using C, then most C users
haven't met the grade yet.  Actually, neither language has a monopoly
on bad programmers - nor on good ones!  Poor Fortran programmers tend
to use GOTO too liberally and in inappropriate ways.  Poor C programmers
tend to do the same thing with pointers (the GOTO of data structuring).
Fortran programmers tend to write monolithic programs, C programmers
tend to fragment theirs.  -- Etc.

> [...]
>>It might enlighten you to read a book by Kernighan (THE K in K&R) and Plauger
>>called "The Elements of Programming Style".
> 
> Isn't this the book that preaches to make it right before you make it fast?

And quite right too!  What's the use of getting the wrong answer at
lightning speed?  Debug the code _first_, then make it fast - since
optimizations often obscure the structure of the program.

> [...]
> Also, this ignores the importance of memory complexity, where malloc()
> helps out enormously.  [...]

I don't see how malloc helps in complexity.  Dynamic memory is _always_
more complex than static memory.  The only way dynamic memory is useful
is for objects whose size _can't_ be known at compile time or for large
items which aren't in use simultaneously (and you want to keep your
program size small).  Either way, your code becomes _more_ complex when
you use dynamic memory.
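
To make the comparison concrete, here is a small sketch of the same loop
written against static and dynamic storage.  The names sum_static,
sum_dynamic and NSTATIC are invented for the example.

```c
#include <assert.h>
#include <stdlib.h>

#define NSTATIC 100             /* hypothetical compile-time size */

static double fixed_buf[NSTATIC];

/* Static storage: nothing to allocate, check, or release. */
double sum_static(void)
{
    double total = 0.0;
    int i;

    for (i = 0; i < NSTATIC; i++) {
        fixed_buf[i] = (double)i;
        total += fixed_buf[i];
    }
    return total;
}

/* Dynamic storage: the same loop, plus allocation, a failure check and
 * a matching free().  Returns 0 on success, -1 if allocation fails. */
int sum_dynamic(size_t n, double *total)
{
    double *buf;
    size_t i;

    assert(total != NULL);
    buf = malloc(n * sizeof *buf);
    if (buf == NULL)
        return -1;
    *total = 0.0;
    for (i = 0; i < n; i++) {
        buf[i] = (double)i;
        *total += buf[i];
    }
    free(buf);
    return 0;
}
```

The dynamic version needs an error return, an output parameter and a
cleanup path that the static version does not have.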

Besides, malloc isn't part of C - it's part of the support library.
If your Fortran has access to a LOC function (I have written LOC
functions for systems that _didn't_ have it), and if you can turn
off array bounds checking, then Fortran can even use malloc!

C still has the problem that malloc() returns a pointer to the new
space and that any code using the dynamically allocated space has its
optimizations inhibited by that.  The Fortran 8x proposal has allocatable
arrays which _aren't_ associated with pointers.  This will permit the
code to be optimized just as if static arrays _only_ were used.

> [...] 
>>Fortran does some things real nice (try passing a variable dimensioned 3-d
>>array into a C function - UGH!)
> double	arr1[10][20][30], arr2[40][50][60];
> main() {
> 	int	m1, m2, m3;
> 	double	*arr3;	/* really m1Xm2Xm3 */
> 	doSomething( &arr1[0][0][0], 10, 20, 30 );
> 	doSomething( &arr2[0][0][0], 40, 50, 60 );
> 	getSizes( &m1, &m2, &m3 );
> 	arr3 =	(double *)malloc( m1*m2*m3*sizeof(double) );
> 	doSomething( arr3, m1, m2, m3 );
>   ... }
> void
> doSomething( a, nlayer, nrow, ncol )
> 	double	*a;
> #		define A(i,j,k) a[ ((i)*nrow + (j))*ncol + (k) ]
> {
> 	...
> 	A(1,2,3) =	...
> 	...
> }

You're taking it on faith that your optimizer can do strength reduction
and get those multiplies out of loops.  Most Fortran compilers special
case the array indices - even if they don't otherwise do any strength
reduction.  I've seen a lot of C compilers that don't do strength
reduction at all.  This is simply an implementation issue though
and not a valid point of comparison for the two languages.
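
For readers unfamiliar with the term, the following sketch shows what
strength reduction buys in this case.  The function names are
hypothetical; the index expression is the one from the A(i,j,k) macro
quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Sum one (i, j) row of an nlayer x nrow x ncol array, evaluating the
 * full index expression for every element: two multiplies per access
 * unless the optimizer removes them. */
double row_sum_indexed(const double *a, int nrow, int ncol, int i, int j)
{
    double total = 0.0;
    int k;

    assert(a != NULL);
    for (k = 0; k < ncol; k++)
        total += a[(i * nrow + j) * ncol + k];
    return total;
}

/* The same sum after strength reduction by hand: the multiplies are done
 * once, outside the loop, and the loop only steps a pointer. */
double row_sum_reduced(const double *a, int nrow, int ncol, int i, int j)
{
    const double *p;
    const double *end;
    double total = 0.0;

    assert(a != NULL);
    p = a + (i * nrow + j) * ncol;
    end = p + ncol;
    while (p < end)
        total += *p++;
    return total;
}
```

A compiler that performs strength reduction should turn the first form
into something like the second on its own; one that doesn't leaves the
multiplies inside the loop.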

However, there _is_ a valid point to be raised here: you must define
macros such as above for each array argument that gets passed and
again in each other procedure that uses them.  Since this is extra
work that isn't required in Fortran, there is a potential for error
here that Fortran doesn't present.  For example: on my keyboard,
plus '+' and equal '=' are on the same key - suppose I mistype
the first plus in the macro as an equal?  The compiler won't catch
it!  (I use this example because I've actually seen it done - it took
several manhours to track down the problem.)

The "structured programming" rule that C violates in this case is:
"If it's something simple that the machine can do automatically,
then let the machine do it - don't take any chances on possible
user error."

> [...]
> As illustrated, fortran's variable-dimensioned array arguments can be
> simulated in C with little effort.  Simulating dynamically allocated
> arrays in Fortran is extremely difficult.

Not at all!  The syntax compares one for one:

    C                         Fortran

   *p                         MEM(P)
   *(p+i)                     MEM(P+I)
   p=malloc(amount)           P=MALLOC(AMOUNT)
   ...                        ...

At least, for a properly written Fortran version of MALLOC.  Note that
the syntax is nearly identical except '*' is spelled 'MEM' and the
parentheses are required in dereferencing even for just the pointer.
Of course, the Fortran programmer is required to do some things that
are automatic with C (like scaling the pointer arithmetic), but if
the previous example was considered an adequate replacement for
Fortran arrays, then this should do nicely as a replacement for
C's pointers.

> [...] 
>>The optimizers are generally
>>VERY good at turning array reference operations into the equivalent C pointer
>>reference.

I disagree with this.  The compiler should be able to do _better_ than the
C pointering scheme does - no pointer aliasing!

> [...]
> They can not reasonably perform this optimization for arrays of indices,
> such as might be used in a pivoting scheme for Gaussian elimination.

A survey of 100+ production programs indicates that over 90% of all array
indexes are simple loop iteration variables or linear combinations of
loop iteration variables.  The gather/scatter problem you refer to _is_
important.  Nevertheless, a compiler which 'special cases' array index
calculations will find a lot of reward in execution speed.
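
The two kinds of index can be sketched as follows.  The function names
and the particular stride are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Linear index: 2*i + 1 is a linear function of the loop variable, so
 * the address advances by a fixed stride and the multiply can be
 * replaced by an addition per iteration. */
void copy_strided(double *y, const double *x, int n)
{
    int i;

    assert(y != NULL && x != NULL);
    for (i = 0; i < n; i++)
        y[i] = x[2 * i + 1];
}

/* Gather: the index comes from another array, so the address of each
 * element is unknown until idx[i] has been loaded. */
void copy_gather(double *y, const double *x, const int *idx, int n)
{
    int i;

    assert(y != NULL && x != NULL && idx != NULL);
    for (i = 0; i < n; i++)
        y[i] = x[idx[i]];
}
```

The first loop is the common case that special-cased index arithmetic
handles well; the second is the gather case, where that trick does not
apply.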

>>This argument will weaken as years go by, but it could very well be a
>>different language than C that we will be discussing.  I guarantee that
>>Fortran will still be the 'other' language discussed.
> 
> Fortran what? 4?, 77?, eighty-twelve?
> [...]

Actually, it makes a LOT more sense for a Fortran shop to wait for
Fortran 8x than it does for them to switch to C.  The conversion
would be easier, for starters.  I currently oppose the 8x proposal
because it contains features which will severely cripple the language.
Assuming they get rid of these though, the proposal will address all
the deficiencies you remarked upon - sometimes in ways considerably
better than what's available in C (like allocatable arrays).

Frankly, I don't see any reason for someone who is satisfied with
C to switch to Fortran either (any version).  Neither language is
_better_ than the other.  They are simply different.  Each has
strengths and weaknesses that the other doesn't.  I certainly would
not recommend that working code in _either_ language be targeted
for conversion into something new until such time as a language comes
along which really _is_ better than these are.

> [...]                          The "other language", in a few years
> will probably be C++ (unless I'm being overly optimistic), which has
> much more in common with C than F-whatever-comes-after-77 has in common
> with F77.

You don't sound optimistic to me.  C++ is not very good.  Objective C
is marginally better.  I was hoping for a language without C's awful
syntax and with real data structuring ability (which neither C nor
Fortran has).  For one thing, I'd like to be forced into using pointers
only when pointers are what I really want.  Object Oriented programming
features are welcome when they become mature - preferably with a _much_
cleaner implementation than C++ provides.  Polymorphism and encapsulation
can be better provided by generic functions with late-binding permitted.
If you could add inheritance without going to objects, it might make a
better language in the long run.