Some hacks I'll share!
Gregory R. Travis
greg at isrnix.UUCP
Fri Mar 9 19:10:51 AEST 1984
I was doing some playing this evening and guess what I found out:
1) On the PDP-11/44 a double-precision floating point
clear (8 bytes) is almost exactly twice as fast as
four clr instructions (integer clears, 2 bytes each).
I replaced the code in clrbuf (in bio.c) with
floating point clears for a speedup.
2) A floating point load (double precision again)
followed by a floating point store is just a wee
bit faster than the equivalent four 'mov'
instructions (assuming the cache is disabled).
I'll bet on the 11/70 you could use floating point
load/stores for twice the speed over conventional
mov's.
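
To make point 1 concrete, here is a minimal sketch in portable C
of the two ways of clearing a buffer. It is an analogy only, not the
code that went into clrbuf: BSIZE, union block and the function names
are all made up for illustration, and a 64-bit integer store merely
stands in for the 8-byte floating point clear. Whether the wide
version wins on a given machine has to be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BSIZE 512   /* assumed buffer size for this sketch */

/* One buffer, viewed as 2-byte words, 8-byte units, or raw bytes. */
union block {
    uint16_t      w[BSIZE / sizeof(uint16_t)];
    uint64_t      d[BSIZE / sizeof(uint64_t)];
    unsigned char b[BSIZE];
};

/* Clear with 2-byte stores: the analogue of a run of integer clr's. */
void clear_by_words(union block *bp)
{
    size_t i;

    assert(bp != NULL);
    for (i = 0; i < BSIZE / sizeof(uint16_t); i++)
        bp->w[i] = 0;
}

/* Clear with 8-byte stores: the analogue of double-precision clears.
 * One store here covers the same ground as four stores above. */
void clear_by_doubles(union block *bp)
{
    size_t i;

    assert(bp != NULL);
    for (i = 0; i < BSIZE / sizeof(uint64_t); i++)
        bp->d[i] = 0;
}

/* Return 1 if every byte of the block is zero, otherwise 0. */
int block_is_zero(const union block *bp)
{
    size_t i;

    assert(bp != NULL);
    for (i = 0; i < BSIZE; i++)
        if (bp->b[i] != 0)
            return 0;
    return 1;
}
```

Both routines leave the buffer in the same state; the only difference
is how many store operations it takes to get there.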
What the h*ll does this mean? That for some applications involving
manipulation of blocks of data, it may be keen-o to use the floating
point processor for the manipulations. Super-cool PDP-11 floating point
processors (like the FP11-C in the 11/70 and FP11-E in the 11/60)
that operate in parallel with the CPU may give you quite a performance
boost if you play your cards right.
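
The same idea applied to moving a block of data can be sketched in
portable C as below. Again this is only an analogy under stated
assumptions: NBYTES, union chunk and the function names are
hypothetical, and an 8-byte integer load/store pair stands in for the
double-precision floating point load and store. It says nothing about
actual timings on any PDP-11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBYTES 512   /* assumed block size for this sketch */

/* One block, viewed as 2-byte words, 8-byte units, or raw bytes. */
union chunk {
    uint16_t      w[NBYTES / sizeof(uint16_t)];
    uint64_t      d[NBYTES / sizeof(uint64_t)];
    unsigned char b[NBYTES];
};

/* Copy with 2-byte moves: the analogue of a run of 'mov' instructions. */
void copy_by_words(union chunk *dst, const union chunk *src)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < NBYTES / sizeof(uint16_t); i++)
        dst->w[i] = src->w[i];
}

/* Copy with 8-byte load/store pairs: the analogue of a floating point
 * load followed by a floating point store, a quarter as many trips. */
void copy_by_doubles(union chunk *dst, const union chunk *src)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < NBYTES / sizeof(uint64_t); i++)
        dst->d[i] = src->d[i];
}

/* Return 1 if the two blocks hold identical bytes, otherwise 0. */
int chunks_equal(const union chunk *a, const union chunk *b)
{
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < NBYTES; i++)
        if (a->b[i] != b->b[i])
            return 0;
    return 1;
}
```

The sketch moves the data through integer types, so every bit pattern
survives the copy unchanged.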
Can anyone see problems with this scheme? Has anyone thought of it
before?
Does anyone run a 44 or 24 with the Commercial Instruction Set (CIS)
option? If you do, do you use the block character move instructions?
Here at isrnix I wrote some code that copies kernel buffers to/from the
user's address space with 'mov' instructions (the scheme plays with the
segmentation registers) instead of the slow m[t,f]p[d,i] instructions.
It would be a thrill to pop a CIS board in our CPU, use the block move
instruction, and see what kind of performance increase I get. Even
as things stand, copying buffers runs better than twice as fast as it
did with the previous copyin/copyout scheme.
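
The shape of the two copy schemes can be modelled in ordinary C, as
in the rough sketch below. This is a user-space toy, not kernel code
and not the isrnix implementation: user_space, fetch_user_word and
map_user_window are hypothetical stand-ins. The first plays the role
of a per-word special fetch, and the last plays the role of pointing
a segmentation register at the user's memory, which here is nothing
more than pointer arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define USER_WORDS 1024   /* size of the simulated user space, in words */

/* Stands in for the user's address space (hypothetical). */
uint16_t user_space[USER_WORDS];

/* Stand-in for a special per-word fetch from user space. */
uint16_t fetch_user_word(size_t uword)
{
    assert(uword < USER_WORDS);
    return user_space[uword];
}

/* Old shape: one special fetch for every word copied. */
void copyin_per_word(uint16_t *kbuf, size_t uword, size_t nwords)
{
    size_t i;

    for (i = 0; i < nwords; i++)
        kbuf[i] = fetch_user_word(uword + i);
}

/* Stand-in for remapping: make the user region directly addressable.
 * In this toy it only range-checks and returns a pointer. */
const uint16_t *map_user_window(size_t uword, size_t nwords)
{
    assert(uword <= USER_WORDS && nwords <= USER_WORDS - uword);
    return &user_space[uword];
}

/* New shape: map once, then copy with plain moves. */
void copyin_mapped(uint16_t *kbuf, size_t uword, size_t nwords)
{
    const uint16_t *win = map_user_window(uword, nwords);
    size_t i;

    for (i = 0; i < nwords; i++)
        kbuf[i] = win[i];
}
```

The point of the second shape is that the setup cost is paid once per
buffer rather than once per word, after which the inner loop is an
ordinary copy.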
Any comments?
--
Gregory R. Travis
Institute for Social Research - Indiana University - Bloomington, In
ihnp4!inuxc!isrnix!greg
{pur-ee,allegra,qusavx}!isrnix!greg