Optimal for loop on the 68020.
Chris Torek
chris at mimsy.UUCP
Tue Jun 6 06:11:24 AEST 1989
In article <11993 at well.UUCP> pokey at well.UUCP (Jef Poskanzer) writes:
>... COUNT was a small (< 127) compile-time constant.
> for ( i = COUNT; --i >= 0; )
[all but gcc -O -fstrength-reduce deleted]
>	moveq	#COUNT,d0
>	jra	tag2
>tag1:
>	<loop body>
>tag2:
>	dbra	d0,tag1
>	clrw	d0
>	subql	#1,d0
>	jcc	tag1
>... But wait! What's that chud after the loop? Let's see, clear d1
>to zero, subtract one from it giving -1 and setting carry, and jump
>if carry is clear. Hmm, looks like a three-instruction no-op to me!
No: the problem is that `dbra' decrements a *word*, compares the
result against -1, and (if not -1) branches. The semantics of the
loop demand a 32-bit comparison. The only reason the 32-bit fix-up is
not necessary in this particular case is the first quoted line above:
COUNT is a small constant, so the counter always fits in 16 bits.
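To make the word-versus-long point concrete, here is a rough C model of
a loop that uses `dbra' alone, with no fix-up after it. This is only a
sketch: the names dbra() and trips_dbra_only() are made up for
illustration, and the model assumes a non-negative starting count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of `dbra dN,label': decrement only the low word
   of the register and report "branch taken" unless that word is now
   0xFFFF (i.e. -1 as a 16-bit value).  The high word is untouched. */
static int dbra(uint32_t *reg)
{
    uint16_t low = (uint16_t)((*reg & 0xFFFFu) - 1u);

    *reg = (*reg & 0xFFFF0000u) | low;
    return low != 0xFFFFu;
}

/* Number of times the loop body runs for moveq/jra/dbra with nothing
   after the dbra.  Assumption: count is non-negative. */
static unsigned long trips_dbra_only(int32_t count)
{
    uint32_t d0;
    unsigned long trips = 0;

    assert(count >= 0);
    d0 = (uint32_t)count;
    while (dbra(&d0))
        trips++;
    return trips;
}
```

With this model, trips_dbra_only(100) gives 100, as the C loop would,
but trips_dbra_only(0x10005) gives only 5: the high word of the counter
is never examined.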
Still, it would be nice if gcc always used the dbra/clrw/subql/jcc
sequence for `--x >= 0' loops, since it does always work. The `clrw'
fixes up the case where the 16-bit result has gone to -1:
	before decrement:	wxyz   0000
	after decrement:	wxyz   FFFF
	after clrw:		wxyz   0000
	after subql:		wxyz-1 FFFF
The dbra loop is so much faster that the extra time and space for one
`unnecessary' dbra+clrw (when the loop really does go from 0 to -1,
and at every 65536 trips when the loop counter is large and positive)
hardly matters, so I would make this optimisation unconditional.
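For anyone who wants to check the sequence without a 68020 at hand, the
following C sketch models dbra/clrw/subql/jcc on a 32-bit register and
counts loop trips next to the plain C loop. It is an illustration only:
the function names are invented here, the carry flag is modelled by
hand, and the starting count is assumed to be non-negative.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: trips made by the C loop from the quoted article. */
static unsigned long trips_c_loop(int32_t count)
{
    unsigned long trips = 0;
    int32_t i;

    for (i = count; --i >= 0; )
        trips++;
    return trips;
}

/* Hypothetical model of the dbra/clrw/subql/jcc sequence.
   Assumption: count is non-negative. */
static unsigned long trips_fixed_sequence(int32_t count)
{
    uint32_t d0;
    unsigned long trips = 0;

    assert(count >= 0);
    d0 = (uint32_t)count;
    for (;;) {
        uint16_t low;
        int carry;

        /* dbra d0,tag1: decrement the low word, branch unless -1 */
        low = (uint16_t)((d0 & 0xFFFFu) - 1u);
        d0 = (d0 & 0xFFFF0000u) | low;
        if (low != 0xFFFFu) {
            trips++;
            continue;
        }
        /* clrw d0: wxyz FFFF becomes wxyz 0000 */
        d0 &= 0xFFFF0000u;
        /* subql #1,d0: borrow (carry) is set only if d0 was 0 */
        carry = (d0 == 0);
        d0 -= 1u;
        /* jcc tag1: branch back if carry is clear */
        if (carry)
            break;
        trips++;
    }
    return trips;
}
```

For counts such as 0, 100, 0x10000 and 0x10005 the two functions return
the same number of trips, and the fix-up path runs once per 65536 trips
plus once at loop exit.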
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris at mimsy.umd.edu Path: uunet!mimsy!chris