Unrolling string copy loop
Radford Neal
radford at calgary.UUCP
Tue Apr 2 07:33:46 AEST 1985
> sym.1:
> movb (r2)+,(r1)+
> bneq sym.1
> By the way, Colonel, this loop is not improved by unrolling.
WRONG! I timed the following two routines:
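# String copy with ordinary loop.
_sc1:   .word   0               # entry mask: save no registers
        movl    4(ap),r1        # r1 = first argument (source)
        movl    8(ap),r2        # r2 = second argument (destination)
1:      movb    (r1)+,(r2)+     # copy one byte; Z is set if it was zero
        bneq    1b              # branch back unless the NUL was just copied
        ret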
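# String copy with unrolled loop.
_sc2:   .word   0               # entry mask: save no registers
        movl    4(ap),r1        # r1 = first argument (source)
        movl    8(ap),r2        # r2 = second argument (destination)
1:      movb    (r1)+,(r2)+     # copy one byte; Z is set if it was zero
        beql    2f              # exit once the NUL is copied, else fall through
        movb    (r1)+,(r2)+
        beql    2f
        movb    (r1)+,(r2)+
        beql    2f
        movb    (r1)+,(r2)+
        beql    2f
        movb    (r1)+,(r2)+
        beql    2f
        movb    (r1)+,(r2)+
        beql    2f
        movb    (r1)+,(r2)+
        beql    2f
        movb    (r1)+,(r2)+
        beql    2f
        movb    (r1)+,(r2)+
        beql    2f
        movb    (r1)+,(r2)+     # tenth copy of the pass
        bneq    1b              # only this branch is taken while looping
2:      ret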
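The first takes 120 microseconds to copy a thirty-character string. The
second takes only 100 microseconds.
It seems that a branch that is not taken is faster than one that is taken:
the unrolled loop falls through nine of every ten branches, where the
ordinary loop takes its branch on every byte but the last.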
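To put rough numbers on that, here is a C sketch that mirrors the branch
structure of the two routines above and tallies how many conditional
branches are taken and how many fall through. It is only a model: the names
sc1_count, sc2_count and struct branch_count are made up for illustration,
and the counts describe the assembly's branches, not whatever a C compiler
would emit for this code.

```c
#include <stddef.h>

/* Hypothetical tally of conditional branches, by outcome. */
struct branch_count {
    unsigned taken;
    unsigned not_taken;
};

/* Models the ordinary loop: one "bneq 1b" per byte, taken for every
   byte except the terminating NUL. Arguments are (source, destination),
   in the same order as the assembly routines. */
static struct branch_count sc1_count(const char *src, char *dst)
{
    struct branch_count n = {0, 0};
    for (;;) {
        if ((*dst++ = *src++) != '\0') {    /* bneq 1b taken */
            n.taken++;
        } else {                            /* bneq 1b falls through */
            n.not_taken++;
            return n;
        }
    }
}

/* Models the loop unrolled by ten: nine "beql 2f" exits that fall
   through on a nonzero byte, then one "bneq 1b" that loops back. */
static struct branch_count sc2_count(const char *src, char *dst)
{
    struct branch_count n = {0, 0};
    for (;;) {
        int i;
        for (i = 0; i < 9; i++) {
            if ((*dst++ = *src++) == '\0') {    /* beql 2f taken */
                n.taken++;
                return n;
            }
            n.not_taken++;                      /* beql 2f falls through */
        }
        if ((*dst++ = *src++) == '\0') {        /* bneq 1b falls through */
            n.not_taken++;
            return n;
        }
        n.taken++;                              /* bneq 1b taken */
    }
}
```

For a thirty-character string (31 bytes with the NUL) the ordinary loop
comes out at 30 taken and 1 not taken, the unrolled loop at 4 taken and 27
not taken. If the whole 20 microsecond difference is put down to those 26
branches, which is an assumption, that is a little under 0.8 microseconds
saved per branch not taken.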
Radford Neal
The University of Calgary