Execution time bottleneck: How to speed up execution?
Dik T. Winter
dik at cwi.nl
Fri Feb 15 09:00:14 AEST 1991
In article <26862:Feb1416:46:4391 at kramden.acf.nyu.edu> brnstnd at kramden.acf.nyu.edu (Dan Bernstein) writes:
> Dik, the optimizations I posted in a previous article are responsible
> for between a 4% and a 50% speedup, depending on your machine, your
> compiler, etc.
The optimizations in your first article gave a speedup of between 4% and 10%,
not up to 50%. Here is a breakdown of the individual speedups on a Sun SLC,
comparing O'Keefe's variant with your five versions (n=1000, run 5 times;
times in seconds):
version   time    speedup   variant
OK        55.93             O'Keefe's variant
1         55.15   1.4%      using tmp for xi-xj, including c
2         53.37   4.6%      looping down, not up
3         53.30   4.7%      register variable for a[j]
4         53.87   3.7%      pointer for a+i and y+j
5         53.82   3.8%      vectorizable code
And yes, I consider this micro optimization.
> As I said the first time, the 10% is the speedup you get on a Convex
> with the standard math library exp() when you apply the ``ludicrous''
> optimizations I pointed out. It is not due to vectorization.
Might be. Does the Convex vectorize all five variants?
>
> Yes, and I could equally well have said ``buy a Cray.'' If the original
> poster didn't have a Cray this would result in ``large improvements.''
Sure. Times on a Cray Y/MP:
version   time     remarks
OK        0.3458   original
1         0.3282   5.1% speedup
2         0.3292   4.8% speedup
3         0.3312   4.2% speedup
4         0.3717   7.5% slowdown
5         0.3469   0.3% slowdown
The compiler did need a bit of persuasion to vectorize the original and
versions 1 and 2, though. So here too: micro optimization. (Calling versions
4 and 5 optimized even stretches the meaning of the term a bit!)
> Similarly, the code becomes quite a lot faster if the poster uses a fast
> exp()---but do you really think that ``use a fast exp()'' is any more
> helpful than ``buy a Cray''? No. Neither one answers the question.
But 'use a fast exp()' is much cheaper than 'buy a Cray'! And it can
be handled by an adequate programmer.
> Furthermore, if the poster *does* have a fast exp() running on his Cray,
> the optimizations I posted will give an even better speedup.
I doubt it very much.
--
dik t. winter, cwi, amsterdam, nederland
dik at cwi.nl