C "optimization" (1 of 8)
Dan Klein
dvk at mi-cec.UUCP
Wed Feb 15 05:19:39 AEST 1984
This is a continuation of my diatribe on "C doesn't optimize, it neatens".
In this and other articles, I compare a true optimizing compiler (Bliss-32
running under VMS) to a code neatener (C running under BSD 4.1c). Any and
all counterexamples are welcome. However, this is NOT a comparison of the
languages. Both C and Bliss have their good and bad points. This is simply
a comparison of the code they generate. As in all examples, the source
code and uncensored assembly code is presented. In all examples, the C source
and Bliss source are as nearly identical as language differences permit. I
have not taken advantage of any "tricks" to get either language to perform
better or worse than the other. The optimizer was enabled for both languages.
-Dan Klein, Mellon Institute, Pittsburgh (412)578-3382
=============================================================================
In this comparison, I present a test which selects between two routines,
each of which has the identical arguments passed to it. Now, since either
one or the other routine is called (i.e. flow control analysis can readily
determine that there is no condition under which neither or both routines
are called), a great optimizer would first push all the arguments onto the
stack, and then test the condition "t", and call the appropriate routine.
Regrettably, even Bliss is not smart enough to do this. However, it is
smarter than C in that:
1) It uses two MOVQ instructions to load the procedure arguments
instead of 4 PUSHL instructions (this is a space optimization, but gives
a small speed optimization since there are two less instructions to execute).
If Bliss were assured of the target processor, it could use a MOVO (move
octaword). However, the 11/730 doesn't support that instruction, so Bliss
avoids it.
2) C uses a common exit point from the routine. Since no cleanup
code is used, the "jbr L19" is wasteful. Bliss instead simply generates
two RET instructions where required. This is a drawback of the peephole
optimizer that C uses. It can eliminate jumps to jumps, but it does not
eliminate jumps to returns. It should, since that optimization is both a
speed and space improvement.
----------------------------------------+-------------------------------------
external routine alpha; | extern int alpha();
external routine beta; | extern int beta();
|
routine test(p,q,r,s,t) : novalue = | test(p,q,r,s,t)
begin | {
if .t neq 0 | if (t!=0)
then alpha(.p,.q,.r,.s) | alpha(p,q,r,s);
else beta(.p,.q,.r,.s); | else
end; | beta(p,q,r,s);
| }
|
.TITLE FOO | .data
| .text
.PSECT $CODE$,NOWRT,2 | LL0: .align 1
| .globl _test
TEST: .WORD ^M<> | .set L14,0x0
TSTL 20(AP) | .data
BEQL 1$ | .text
MOVQ 12(AP), -(SP) | _test: .word L14
MOVQ 4(AP), -(SP) | tstl 20(ap)
CALLS #4, W^ALPHA | jeql L18
RET | pushl 16(ap)
1$: MOVQ 12(AP), -(SP) | pushl 12(ap)
MOVQ 4(AP), -(SP) | pushl 8(ap)
CALLS #4, W^BETA | pushl 4(ap)
RET | calls $4,_alpha
| jbr L19
| L18: pushl 16(ap)
| pushl 12(ap)
| pushl 8(ap)
| pushl 4(ap)
| calls $4,_beta
| L19: ret
More information about the Comp.lang.c
mailing list