low level optimization
rew
wilber at alice.att.com
Fri Apr 26 14:05:50 AEST 1991
J. Giles writes:
>> [...] but why is it so horrible to you to document
>> to the users of your optimized libraries that certain
>> arguments to certain functions must be carefully
>> non-aliased? [...]
>
>It's not horrible. You miss my point if you believe that I would
>oppose that. You give me a portable way to declare to the compiler
>that two (or more) arguments are assumed not to be aliased and I will
>rejoice.
>
>The problem is that what people are recommending is _not_ portable
>and is _not_ an explicit declaration (so the end user has to guess
>which variables the compiler optimized in this way and which are
>safe to alias).
I like C. I don't like Fortran. I find the likelihood that people will still
be coding in Fortran in the year 2020 to be depressing. Having said that, I
must say that in my experience the criticism that J. Giles makes of C, that
problems with potential aliasing make it *very* difficult for optimizing
compilers to do their job, is a legitimate one. I am working on a large
program that does schedule optimization. Speed is critical. Parts of it are
written in C, and most is written in C++, which is translated into C by cfront.
(BTW, cfront exacerbates the problems with aliasing by introducing large
numbers of spurious pointers.) What I find is that Fortran code regularly runs
circles around carefully coded C and C++. The problems intensify on RISC
machines, which are becoming all the rage, because of their dependence on
keeping data in registers for speed. On an IBM RS6000 a little test program I
wrote in C runs about 6 times slower when I use pointers to scalars than when I
use local scalars directly. One might expect that a factor of 2 could be due
to the need to dereference the pointers, the rest is because most of the
optimizations have to be turned off as soon as pointers are used.
Unfortunately, once you get past the realm of toy test programs storing all
your data in local variables is a non-option -- most of the data is lurking in
giant arrays and malloced structures, all accessed through pointers.
I think that the pro-C camp has been too willing to sweep under the rug the
difficulties of making a C compiler deal with aliasing. Sure, global
optimization across all 200 .c modules would solve many of the problems, but
how many C compilers actually do that? Can you name 3 machines I can buy right
now that have such a beast? And when you try to do that sort of thing, you
quickly bump up against the limits of your machine. (Can you say, "Virtual
memory exhausted"? Can you say, "Two day compilation"?)
Cross procedural optimization within a module doesn't buy you as much as you
might hope; simply knowing that none of the calls to foo() within foo.c alias
the parameters doesn't help if foo() is declared external (the default) because
some call to foo() in another module might alias the parameters. And I'd say
that less than 10% of all functions in typical C code are declared static.
(Even when functions aren't called outside of their modules, programmers rarely
bother to declare them static.)
So what *can* be done? As has already been pointed out, it is not unreasonable
for a compiler to assume that non-char, non-void pointers of different types
are not aliased. An option like -this_programmer_should_be_fired should of
course be available to turn this assumption off, for things like compiling code
taken from the net. I would go further and allow an option called, say,
-strict_typedefs that would tell the compiler that in a case like this:
typedef int Apple;
typedef int Orange;
that an int can be an Apple or an Orange (or neither) but an Apple can't be
an Orange and an Orange can't be an Apple. Thus in
typedef long Cents;
typedef long Quantity;
Cents calculate_costs(Cents c[], Quantity q[]);
the compiler knows that c and q aren't aliased. This would of course have
to be an option because there is nothing in the definition of the language
that requires that typedefs should behave this way.
But what can be done about cases like this?
foo(char* p, char* q, char* s);
One can of course come up with various schemes involving macros or comments
that inform your favorite compiler that the parameters aren't aliased, but the
problem is that there are no standards for how this is to be done, so any
such information is likely to be understood *only* by your favorite compiler,
and no other.
Right now, in the Real World (TM), optimized Fortran really does run faster
than optimized C. A situation that truly sucks.
Bob Wilber wilber at homxb.att.com
More information about the Comp.lang.c
mailing list