Faster C
Chris Torek
chris at mimsy.UUCP
Sat Feb 6 16:13:51 AEST 1988
>In article <473 at aati.UUCP> fish at aati.UUCP (William G. Fish) writes:
>>I wish to make the following C code run as fast as possible under 4.1 BSD
>>on a VAX 11/750.
>>declarations:
>> register short *in;
>> register float *out;
>> register c;
>> int C, S;
>> register sample, s;
>>
>>loop:
>> for (s = 0; s < S; s++, c += C) {
>> out[s] = sample = in[c]; /* short to float conversion */
In article <4177 at june.cs.washington.edu> pardo at june.cs.washington.edu
(David Keppel) writes:
>If you change the out[s] and in[c] to use a pointer that is incremented
>each iteration, you may be able to save yourself an ashl each time.
The `out[s]' and `in[c]' should generate VAX `subscript' mode instructions,
something like
        cvtwl   (in)[c],sample
        cvtlf   sample,(out)[s]
and indeed, feeding the equivalent through /lib/ccom produces
        cvtwl   (r11)[r9],r7
        cvtlf   r7,(r10)[r8]
A pointer version might be a wee bit faster for `out':
        for (s = 0; s < S; s++, c += C) {
                *out++ = sample = in[c];
or
        cvtwl   (r11)[r9],r7
        cvtlf   r7,(r10)+
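
For comparison, the two forms of the loop can be written as standalone
functions. This is only a sketch: the names `convert_indexed' and
`convert_pointer', and the parameters `stride' and `count' (standing in
for `C' and `S'), are invented here, the intermediate `sample' is
omitted, and the `register' hints are dropped. Any speed difference is
specific to the compiler and machine; the sketch shows only that both
forms store the same values.

```c
#include <assert.h>

/* Indexed form, as in the original posting: out[s] = in[c]. */
void convert_indexed(const short *in, float *out, int c, int stride, int count)
{
    int s;

    assert(count >= 0);
    for (s = 0; s < count; s++, c += stride) {
        out[s] = in[c];         /* short to float conversion */
    }
}

/* Pointer form: `out' is walked with a post-incremented pointer. */
void convert_pointer(const short *in, float *out, int c, int stride, int count)
{
    int s;

    assert(count >= 0);
    for (s = 0; s < count; s++, c += stride) {
        *out++ = in[c];         /* same store, autoincrement addressing */
    }
}
```

Note that `s' is still needed in the pointer form as the loop counter,
even though it no longer indexes `out'.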
One more tiny gain is to loop down to zero instead of up to S:
loop:
        for (s = S, out += s, c += C * s; c -= C, --s >= 0;) {
                *--out = sample = in[c];
                ...
If the loop is short enough (8 instructions or less), the
optimiser (/lib/c2) will turn the decrement/test/branch into
a `sobgeq' instruction. It looked as though the loop was
not that short. Still,
        decl    rN
        bneq    loop
will be ever so slightly faster than
        incl    rN
        cmpl    rN,-S(fp)
        blss    loop
on a 750. Since in the original code fragment `c' did not count
up from zero, we still need a counter like `s'.
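
The count-down loop can be checked in isolation with a sketch like the
following. The function name `convert_down' and the parameters `stride'
and `count' (standing in for `C' and `S') are invented for illustration,
and the intermediate `sample' is omitted. It assumes `count' is not
negative.

```c
#include <assert.h>

/*
 * Count-down form: start one past the last element and work backwards,
 * so the loop test is a decrement and a comparison against zero.
 */
void convert_down(const short *in, float *out, int c, int stride, int count)
{
    int s;

    assert(count >= 0);
    /* Advance `out' and `c' to one step past the end, then step back. */
    for (s = count, out += s, c += stride * s; c -= stride, --s >= 0;) {
        *--out = in[c];         /* short to float conversion */
    }
}
```

The elements are stored last to first, which matters only if `in' and
`out' overlap, and on exit `c' is one stride below its starting value
rather than `count' strides above it.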
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris at mimsy.umd.edu Path: uunet!mimsy!chris