A study in code optimization in C

Sat Jul 28 11:47:07 AEST 1990

In article <1990Jul26.144134.16053 at ux1.cso.uiuc.edu>, mcdonald at aries.scs.uiuc.edu (Doug McDonald) writes:
> In article <133 at smds.UUCP> rh at smds.UUCP (Richard Harter) writes:

> >The macro shown below is an optimized memory to memory copy macro.
> >It is probably faster than memcopy on your machine -- I have checked
> >it on several machines and have always found it to be faster.
>                                  !!!!!!

> Oh My!.

	[Superior timings for a 20KB move on a 386 by memmove given]

Ouch.  I should have phrased that more carefully.  Yes, the gentleman
is quite right.  As is to be expected, a hardware bulk move is always
going to beat code that uses item-by-item move instructions.  Furthermore,
there is nothing to stop a system implementor from using the tightest
code possible.  (My observation is that the quality of implementations
varies a great deal.)

In my defense I have to point out that the quoted remark is accurate;
timings were made on 680x0 boxes, vaxes (!), and some risc boxes,
but not, obviously, on any 386 boxes.  Small comfort for the chap
who took my posting on trust and put it on his 386.  Apologies.

It should also be pointed out that the cited code is a template
for tight loops and does win when there isn't a hardware instruction
available.

Does it matter?  Yes, if you are coding for both speed and portability.
In this particular case, however, the gencopy macro should include a
machine-dependent #ifdef which switches to memcpy (or memmove) on the
appropriate machines.
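
A rough sketch of what such a switch might look like follows.  The
names COPY_BYTES and HAVE_FAST_MEMMOVE are invented here for
illustration; they are not the gencopy macro or any real configuration
symbol, and the fallback is a plain byte loop standing in for the tuned
code, not a reproduction of it.

```c
#include <stddef.h>
#include <string.h>

/*
 * HAVE_FAST_MEMMOVE is a hypothetical configuration symbol.  Define it
 * (for example with -DHAVE_FAST_MEMMOVE) on machines where the library
 * routine is known to beat a hand-written loop, e.g. because it uses a
 * hardware block move.
 */
#ifdef HAVE_FAST_MEMMOVE

/* Let the library do the work. */
#define COPY_BYTES(dst, src, n) ((void) memmove((dst), (src), (n)))

#else

/*
 * Portable fallback: a simple item-by-item loop.  Unlike memmove, it
 * does not handle overlapping regions where dst lies inside src.
 */
#define COPY_BYTES(dst, src, n)                                      \
    do {                                                             \
        unsigned char *d_ = (unsigned char *) (dst);                 \
        const unsigned char *s_ = (const unsigned char *) (src);     \
        size_t n_ = (n);                                             \
        while (n_-- > 0)                                             \
            *d_++ = *s_++;                                           \
    } while (0)

#endif

/* Demonstration helper: copy a short string and check the result. */
int copy_matches(void)
{
    char src[] = "tight loops";
    char dst[sizeof src];

    COPY_BYTES(dst, src, sizeof src);
    return memcmp(dst, src, sizeof src) == 0;
}
```

Callers write COPY_BYTES(dst, src, n) either way, so the choice between
the library routine and the loop is made once per machine, at compile
time, rather than at every call site.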

Thanks to Doug for catching this and posting.
-- 
Richard Harter, Software Maintenance and Development Systems, Inc.
Net address: jjmhome!smds!rh Phone: 508-369-7398 
US Mail: SMDS Inc., PO Box 555, Concord MA 01742
This sentence no verb.  This sentence short.  This signature done.