faster bcopy using duffs device (source)
stergios marinopoulos
stergios at Jessica.stanford.edu
Fri Sep 8 10:52:37 AEST 1989
I wanted a faster bcopy, so I used Duff's device as a basis for it. It
copies an int at a time instead of a char, and the loop is unrolled
four ways. It's been working well for me today, so it has to be
perfect, right?
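For readers who have not met it, here is a minimal sketch of the classic forward form of Duff's device, applied to ints. The name duff_copy_ints and its signature are made up for this illustration and are not part of the posted code. The switch jumps into the middle of the unrolled loop so that the count % 4 leftover ints are handled on the first pass.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration of Duff's device: copy `count` ints from
   `from` to `to`, four per loop pass. */
void duff_copy_ints(int *to, const int *from, size_t count)
{
    size_t passes;

    if (count == 0)              /* the device itself assumes count > 0 */
        return;
    assert(to != NULL && from != NULL);

    passes = (count + 3) / 4;    /* loop passes, counting the partial one */
    switch (count % 4) {         /* enter the loop body part-way through */
    case 0: do { *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--passes > 0);
    }
}
```

With count = 5 the switch enters at case 1, copies one int, and the loop then makes one full pass of four.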
I have been seeing 4X speedups, so I thought I would pass it along.
A potential problem is the char*'s not being aligned on an int
boundary, but I have not run into it. Also, I doubt it copies strings
smaller than 32 bytes correctly (no problem for me, I wanted to copy
megs-o-stuff).
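One way to guard against the alignment problem is to test both pointers before taking the int-at-a-time path. The sketch below is only an illustration: int_aligned and can_word_copy are hypothetical helpers, not part of the posted code, and the test uses sizeof(int) as a conservative stand-in for the real alignment requirement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p is a multiple of sizeof(int).
   That is at least as strict as the alignment an int needs. */
int int_aligned(const void *p)
{
    return ((uintptr_t)p % sizeof(int)) == 0;
}

/* Hypothetical helper: nonzero if both buffers may be walked as ints. */
int can_word_copy(const char *dest, const char *src)
{
    assert(dest != NULL && src != NULL);
    return int_aligned(dest) && int_aligned(src);
}
```

A caller could use the int copy when can_word_copy() returns nonzero and fall back to a plain char loop (or memcpy) otherwise. This covers alignment only; it says nothing about overlapping buffers.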
Let me know what you think, of the code or anything else for that
matter.
sm
**********************************************************************
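#define IFACTOR 4	/* ints per pass; the unrolled loop below is written for 4 */

/*
 * Copy size bytes from charsrc to chardest, an int at a time.
 * Both pointers must be int-aligned and the regions must not overlap.
 */
void dcopy(char *chardest, char *charsrc, int size)
{
	register int *src, *dest, intcount ;
	int startcharcpy, intoffset, numints2cpy, i ;

	if (size <= 0)
		return ;
	numints2cpy = size / (int) sizeof(int) ;	/* whole ints to copy */
	startcharcpy = numints2cpy * (int) sizeof(int) ;	/* first byte past them */
	intcount = numints2cpy & ~(IFACTOR-1) ;	/* ints in full groups of IFACTOR */
	intoffset = numints2cpy - intcount ;	/* 0..3 ints in the partial last group */
	/* start at the partial group and work back toward the front */
	src = (int *) charsrc + intcount ;
	dest = (int *) chardest + intcount ;
	/* copy the ints: the switch jumps into the loop to do the partial
	   group first, then each pass steps back over one full group */
	switch(intoffset)
		do {
			src -= IFACTOR ;
			dest -= IFACTOR ;
			intcount -= IFACTOR ;
			dest[3] = src[3] ;
	case 3:		dest[2] = src[2] ;
	case 2:		dest[1] = src[1] ;
	case 1:		dest[0] = src[0] ;
	case 0:		;	/* no partial group: go straight to the loop test */
		} while (intcount > 0) ;
	/* copy the chars left over by the int copy at the end */
	for(i=startcharcpy ; i<size ; i++)
		chardest[i] = charsrc[i] ;
}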
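
To see how the index arithmetic at the top of dcopy divides the work, the sketch below repeats just that arithmetic on its own. The struct split and the function split_size are hypothetical names used only for this illustration; the field names follow the variables in dcopy, and tailchars is an added convenience. With 4-byte ints, a size of 43 gives 10 ints: 8 of them in two full groups, 2 in the partial group, and 3 trailing chars starting at byte 40.

```c
#include <assert.h>

#define IFACTOR 4   /* ints copied per pass of the unrolled loop */

/* Hypothetical record of how a byte count is divided up. */
struct split {
    int numints2cpy;    /* whole ints in the buffer */
    int intcount;       /* ints that fall in full groups of IFACTOR */
    int intoffset;      /* ints in the partial last group, 0..IFACTOR-1 */
    int startcharcpy;   /* byte offset where the char copy takes over */
    int tailchars;      /* bytes left for the char copy */
};

/* Hypothetical helper: the same arithmetic as dcopy, with no copying. */
struct split split_size(int size)
{
    struct split s;

    assert(size >= 0);
    s.numints2cpy  = size / (int)sizeof(int);
    s.intcount     = s.numints2cpy & ~(IFACTOR - 1);
    s.intoffset    = s.numints2cpy - s.intcount;
    s.startcharcpy = s.numints2cpy * (int)sizeof(int);
    s.tailchars    = size - s.startcharcpy;
    return s;
}
```

The case worth checking by hand is intoffset == 0, where there is no partial group and the int copy must not touch anything past the last full group.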