micro-optimizing loops (was Help with casts)

Tue Feb 26 06:46:16 AEST 1991

(with any luck this will die its own death after this...)

In article <344 at smds.UUCP> rh at smds.UUCP (Richard Harter) writes:
>For reasons that are not clear to me many optimizing compilers will not
>collapse the two machine instructions
>		dec r1
>		bge 1$
>into the available single instruction to do the same thing.  Perhaps
>some of our compiler writers can explain this to us.

Certain machines (grr :-) ) that have subtract-and-branch-on-condition
instructions can branch only a very short distance.  Compilers for these
must either work out how far each branch goes, or emit assembler
pseudo-ops like `jsobgtr' that expand into a longer sequence when
necessary.  Unfortunately, the assemblers for the VAX (for one) tend not
to have a `jsobgtr' pseudo-op.  `Fixed in the next release....'
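
To make the range problem concrete, here is a rough sketch in C of the
decision a compiler or assembler has to make for each loop branch.  The
names are hypothetical and not taken from any real tool, and it assumes
the short form carries a signed 8-bit displacement measured from the end
of the branch instruction (which is how the VAX sob instructions work).
A real implementation also has to cope with the fact that picking the
long form changes the distances of other branches.

```c
/* Hypothetical names throughout; not from any real compiler or assembler. */
enum loop_branch { SHORT_SOB, LONG_DEC_AND_JUMP };

/*
 * True if a branch whose instruction ends at address `branch_end` can
 * reach `target` with a signed 8-bit displacement (assumed range).
 */
int fits_byte_displacement(long branch_end, long target)
{
    long disp = target - branch_end;

    return disp >= -128 && disp <= 127;
}

/* Pick the compact form when the target is in range, else the long one. */
enum loop_branch choose_loop_branch(long branch_end, long target)
{
    return fits_byte_displacement(branch_end, target)
        ? SHORT_SOB
        : LONG_DEC_AND_JUMP;
}
```

For example, choose_loop_branch(100, 0) picks the short form, while a
loop body 200 bytes long forces the decrement-and-jump pair.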

(Incidentally, I played with timing decl+jgeq vs sobgeq on the VAX and
found that it rarely made any difference.  The single sobgeq is more
compact, which does not hurt, but it is not really any faster.  Some
other `fancy' VAX instructions turn out to be slower than equivalent
sequences of simpler instructions.  Which ones, and by how much, depends
on the particular model: 780s and 8250s have fairly different
characteristics.)
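
For anyone who wants to repeat the experiment, the kind of loop under
discussion looks something like the sketch below: a count-down loop
whose control is `decrement, branch if still >= 0'.  The function names
are my own invention, and the timing helper is only a crude one built on
the standard clock() call.  Whether the loop ends in one instruction or
two depends entirely on the compiler, so check the generated assembly
before reading anything into the numbers.

```c
#include <time.h>

/* Sum a[n-1] down to a[0]; loop control is "decrement, branch if >= 0". */
long sum_countdown(const int *a, int n)
{
    long sum = 0;
    int i;

    for (i = n - 1; i >= 0; i--)
        sum += a[i];
    return sum;
}

/*
 * Hypothetical helper: run sum_countdown `reps` times, store the
 * accumulated total in *result (so the work is not trivially dead),
 * and return the CPU time used in seconds.
 */
double time_countdown(const int *a, int n, int reps, long *result)
{
    clock_t start = clock();
    long total = 0;
    int r;

    for (r = 0; r < reps; r++)
        total += sum_countdown(a, n);
    *result = total;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Use a large array and many repetitions; the resolution of clock() is
coarse, and an optimizer may rearrange a loop this simple.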
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab EE div (+1 415 486 5427)
Berkeley, CA		Domain:	torek at ee.lbl.gov