no noalias not negligible - a difference between C and Fortran - long
Niels Jørgen Kruse
njk at diku.dk
Thu May 26 06:12:08 AEST 1988
In article <54080 at sun.uucp>, dgh%dgh at Sun.COM (David Hough) writes:
>(...)
> it appeared to still be written in Fortran. But faithful preservation of
> Fortran semantics, including memory access patterns, was one of the goals
> of the translation.
Since nobody else commented on this ...
If you didn't want to preserve memory access patterns so badly,
you could have done some hand scheduling on the unrolled loop
(similarly simplified):
daxpy(n, da, dx, dy)
double dx[], dy[], da;
int n;
{
    int i;
    double a, b, c, d;

    /* Simplified: n is assumed to be a multiple of 4. */
    for (i = 0; i < n; i += 4) {
        /*
         * Compute 4 independent expressions into
         * registers a,b,c,d.
         */
        a = dy[i] + da * dx[i];
        b = dy[i+1] + da * dx[i+1];
        c = dy[i+2] + da * dx[i+2];
        d = dy[i+3] + da * dx[i+3];
        /*
         * Store results back, only after all loads are done.
         */
        dy[i] = a, dy[i+1] = b, dy[i+2] = c, dy[i+3] = d;
    }
}
This alleviates the constraints imposed on scheduling by
potential aliasing: because all four elements are loaded before
anything is stored, no load has to wait behind a store that might
alias it. The first store can be scheduled as soon as
the last load has completed. Given that a sufficient number of
registers are available for scheduling loads into, enough time
should be left before the computations finish to catch up on the
stores. If the Fortran compiler doesn't unroll more than 4 times,
I see no reason why this should be slower than the rolled Fortran
version.
On the other hand, I see no reason why the unrolled Fortran
version should be slower than the rolled version either,
so what do I know.
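
For anyone who wants to try this, here is a minimal self-contained sketch of the same idea next to a plain rolled loop. The names daxpy_rolled and daxpy_unrolled4 are made up for illustration, the prototypes are ANSI-style rather than the K&R style used above, and the cleanup loop for n not a multiple of 4 is my addition, not part of the simplified version.

```c
/* Reference version: one element per iteration, load then store. */
void daxpy_rolled(int n, double da, const double *dx, double *dy)
{
    int i;

    for (i = 0; i < n; i++)
        dy[i] = dy[i] + da * dx[i];
}

/*
 * Hand-scheduled version (hypothetical name): each group of 4
 * elements is fully loaded and computed before any of the 4
 * results is stored back.
 */
void daxpy_unrolled4(int n, double da, const double *dx, double *dy)
{
    int i;
    int n4 = n - n % 4;   /* largest multiple of 4 not exceeding n */
    double a, b, c, d;

    for (i = 0; i < n4; i += 4) {
        /* All loads and arithmetic first ... */
        a = dy[i]     + da * dx[i];
        b = dy[i + 1] + da * dx[i + 1];
        c = dy[i + 2] + da * dx[i + 2];
        d = dy[i + 3] + da * dx[i + 3];
        /* ... then all four stores. */
        dy[i]     = a;
        dy[i + 1] = b;
        dy[i + 2] = c;
        dy[i + 3] = d;
    }

    /* Cleanup for the 0 to 3 leftover elements (assumed addition). */
    for (; i < n; i++)
        dy[i] = dy[i] + da * dx[i];
}
```

The two versions agree whenever dx and dy do not overlap. If they do overlap within a group of 4 (say dx points one element before dy), the rolled loop sees each store before the next load while the unrolled one does not, so the results can differ. That is exactly the memory access pattern that is not being preserved.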
Niels Jørgen Kruse