Faster C
Thomas Truscott
trt at rti.UUCP
Sun Feb 7 20:22:12 AEST 1988
In article <473 at aati.UUCP>, fish at aati.UUCP (William G. Fish) writes:
> I wish to make the following C code run as fast as possible under 4.1 BSD
> on a VAX 11/750. ...
The code performs two different functions:
It copies an array of short integers, converting to float,
and it computes the largest absolute value in the array.
Alas, the VAX 750 lacks the vector instructions that could do it,
and the code presented was quite good to begin with,
so all that is left are small tweaks. I think.
Building on what Chris Torek suggested:
1. Change "out[s]" to "*out++", eliminate the "s" variable,
and count "S" down instead.
2. Change 'in[c]; c += C' to '*in; in += <horrible thing>'.
(It would be nice if C provided a cleaner way to do this hack.)
3. Tweak the register declarations a bit.
Here is the revised routine. It might still work!
scan(in, out, c, C, S)
register short *in;
register float *out;
int c;
register int C;
register int S;
{
    register int sample;
    register int peak;

    C = (sizeof(short)/sizeof(char))*C; /* eternal damnation awaits ... */
    peak = 0;   /* nothing seen yet */
    in += c;
    while (--S >= 0)
    {
        sample = *in;
        *out++ = sample;    /* short to float conversion */
        in = (short *)(((char *)in) + C);   /* ... in the abyss */
        /* vax seems to lack an absolute value instruction, sigh */
        if (sample < 0)
            sample = -sample;   /* absolute value */
        if (peak < sample)
            peak = sample;  /* peak detection */
    }
    return peak;
}
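For checking a tuned routine like the one above, it helps to keep a plain version around to compare against. What follows is only a sketch: it is reconstructed from the 'out[s]' and 'in[c]; c += C' fragments quoted in steps 1 and 2, not copied from the original article. The name scan_reference is made up, and the reading of the arguments (c = index of the first sample, C = stride in shorts, S = number of samples) is an assumption inferred from the code.

```c
/* Hypothetical reference version: plain indexing, no pointer tricks.
 * Assumed argument meaning (inferred, not stated in the post):
 *   c = index of the first sample to read
 *   C = distance between successive samples, in shorts
 *   S = number of samples to process
 */
int scan_reference(const short *in, float *out, int c, int C, int S)
{
    int peak = 0;               /* nothing seen yet */
    int s;

    for (s = 0; s < S; s++) {
        int sample = in[c + s * C];

        out[s] = (float)sample; /* short to float conversion */
        if (sample < 0)
            sample = -sample;   /* absolute value */
        if (peak < sample)
            peak = sample;      /* peak detection */
    }
    return peak;
}
```

Feeding both routines the same input and comparing the returned peak and the output array should show quickly whether the tweaks changed the behavior.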
4. Touch up the assembler output (Gag me):
18,19c18
< cvtld r7,r0
< cvtdf r0,(r10)+
---
> cvtlf r7,(r10)+
27,28c26
< L16: decl r8
< jgeq L2000001
---
> L16: sobgeq r8,L2000001
5. Learn more about the original problem.
a) What fraction of the time is sample < 0?
Perhaps we should keep "minpeak" and "maxpeak" variables
so we need never negate "sample".
b) What values can "sample" take on?
If it is in the range -5000..5000 we can
use table lookup to compute absolute value.
If it is in the range -31..31 we could use table lookup
to obtain bit masks that we just OR together,
then locate the highest bit to determine "peak".
(Okay, I'm weird.)
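The two ideas in 5a and 5b can be sketched roughly as follows. Everything here is illustrative: the names peak_minmax, peak_ormask, init_mask_table, mask_table and RANGE are invented, the input is assumed to be a contiguous array (no stride), and the float copy is left out to keep the peak logic visible. The second routine is only valid if every sample really lies in -31..31.

```c
#define RANGE 31    /* assumed bound on |sample| for the bit-mask trick */

/* 5a: track the signed extremes and negate only once, after the loop. */
int peak_minmax(const short *in, int n)
{
    int minpeak = 0;    /* most negative sample seen so far */
    int maxpeak = 0;    /* most positive sample seen so far */
    int i;

    for (i = 0; i < n; i++) {
        int sample = in[i];

        if (sample < minpeak)
            minpeak = sample;
        else if (sample > maxpeak)
            maxpeak = sample;
    }
    return (-minpeak > maxpeak) ? -minpeak : maxpeak;
}

/* 5b: one bit per possible absolute value, indexed by sample + RANGE. */
static unsigned long mask_table[2 * RANGE + 1];

void init_mask_table(void)
{
    int v;

    for (v = -RANGE; v <= RANGE; v++)
        mask_table[v + RANGE] = 1UL << (v < 0 ? -v : v);
}

/* Caller must call init_mask_table() first and guarantee the range. */
int peak_ormask(const short *in, int n)
{
    unsigned long bits = 0;
    int peak = 0;
    int i;

    for (i = 0; i < n; i++)
        bits |= mask_table[in[i] + RANGE];  /* no compare, no branch */

    while (bits >>= 1)  /* position of the highest set bit */
        peak++;
    return peak;
}
```

Whether either variant beats the compare-and-negate loop depends on the data and the machine, so it would need to be timed rather than assumed.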
6. A VAX 750 is slow. Buy a faster machine. Brute force is beautiful.
Tom Truscott