IOCALL results and problems

Mon Dec 16 22:28:05 AEST 1985

> In article <354 at ncr-sd.UUCP> stubbs at ncr-sd.UUCP (0000-Jan Stubbs) writes:
> >
> >	IOCALL, A UNIX SYSTEM PERFORMANCE BENCHMARK
> >... The benchmark is a "C" program which measures Unix kernel performance. 
> 
Dan Tsao writes:
> 	Well I don't want to flame too much. Just a few comments.
> 
> 	Basically, I find it difficult to take this benchmark and the presented
> results too seriously.
> 
> 	- I have trouble understanding the point of the benchmark program.
> ...  It's not even something a normal user can relate to,
> such as "copying files on X is twice as fast as on Y".
> 
> 	- It is obviously a single point measurement. It can tell you very
> little about how particular applications or the system in general will run.
> 
> 	- The numbers are way too small to interpret with any substantial
> significance (i.e. you should run the benchmark with say 10000, rather than 1000
> in the loop). The difference between the various VAX 11/750 times is,
> for example, 7.2 to 9.4. I could be convinced there is significance there, but...
> 
> 	- That a Radio Shack 16A performs 25% better than a VAX 11/750 is cute
> but of little practical interest (read: ridiculous; a benchmark that tells me that
> is probably not going to be very useful, are we really to think that an
> Amdahl 470/V8 is only 12% faster than a VAX8600, that a Pyramid is slower than
> a VAX 11/780).

a) I agree it doesn't measure everything, but it does check three important
aspects that affect overall system performance: context switch costs,
copying costs, and the cost of finding the buffer in the buffer cache.
b) You want to avoid using the disks; after all, an IBM PC with a
fast hard disk would probably outperform an 8600 with an RK05.
Thus the statement "system A copies files twice as fast as system B"
is only useful if you know the I/O configuration (Massbus or Unibus disks
on a VAX? what type of disks? ...).
c) I agree: run the benchmark with more passes through the loop on fast
machines.  1000 is probably enough on small machines.
d) The point about the benchmark results is not that they are ridiculous,
but that they might show up areas which need work.  For example, if
you simply port UNIX to a large machine and increase the number of buffers
without thinking about the way the buffer cache works, you are likely to
find that you have, say, 1024 buffers chained into 60 queues, whereas on a
PDP-11 you had 60 buffers in 60 queues.  Which one will take less time to
find a buffer in?  Raw machine speed alone won't tell you the answer.
Further, let's suppose you built a machine with lots of registers and a
load/store architecture (e.g. RISC, Pyramid).  It turns out the cost of
doing a context switch is higher (save all registers) and the load/store
architecture is at its worst doing memory-to-memory copies.  Thus
a Pyramid might very well do worse than a VAX 11/780.  I timed a
long-to-long copy on a Pyramid in user mode; it was only 1.15 times the 11/780.
Given that the Pyramid has a slow context switch....
e) The variation among machines of the same model is real.  We have two
780s, and one is consistently about 5% faster on benchmarks.  We have
two Pyramids, and again one is consistently faster on the same benchmarks.
One should always allow +/- 10% when comparing machines on benchmarks.
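
To put rough numbers on the buffer cache example in (d), here is a minimal
sketch, not taken from any kernel source.  It assumes block numbers hash as
blkno % nqueues, each queue is a singly linked chain searched from the head,
and every cached block is looked up exactly once.  The names (struct buf,
avg_probes) are made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical, stripped-down buffer header: just a block number and a link. */
struct buf {
    int blkno;
    struct buf *next;
};

/*
 * Average number of chain comparisons needed to find a cached block when
 * nbuf buffers are hashed into nqueues queues by blkno % nqueues.
 * Returns -1.0 on bad arguments or allocation failure.
 */
double avg_probes(int nbuf, int nqueues)
{
    struct buf *pool, **heads, *bp;
    long probes = 0;
    int i, q;

    if (nbuf <= 0 || nqueues <= 0)
        return -1.0;

    pool = malloc((size_t)nbuf * sizeof *pool);
    heads = malloc((size_t)nqueues * sizeof *heads);
    if (pool == NULL || heads == NULL) {
        free(pool);
        free(heads);
        return -1.0;
    }
    for (q = 0; q < nqueues; q++)
        heads[q] = NULL;

    /* Fill the cache: block i goes on the front of queue i % nqueues. */
    for (i = 0; i < nbuf; i++) {
        q = i % nqueues;
        pool[i].blkno = i;
        pool[i].next = heads[q];
        heads[q] = &pool[i];
    }

    /* Look up every cached block once, counting comparisons along the chain. */
    for (i = 0; i < nbuf; i++) {
        for (bp = heads[i % nqueues]; bp != NULL; bp = bp->next) {
            probes++;
            if (bp->blkno == i)
                break;
        }
    }

    free(pool);
    free(heads);
    return (double)probes / nbuf;
}
```

Under these assumptions avg_probes(60, 60) comes out at 1.0 and
avg_probes(1024, 60) at a little over 9, so the big configuration does
roughly nine times as many comparisons per lookup.  Whether a faster CPU
makes up for that is the sort of thing the benchmark can show.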

Rich Hammond, Bell Communications Research