Relative GL costs
Brian McClendon
bam at sgi.com
Fri Apr 5 11:40:22 AEST 1991
In article <9104041941.AA29344 at ge-dab.GE.COM> "dwilliam at larry.ATL.GE.COM"@andrew.dnet.ge.com writes:
>"Howard C. Smith" <smith at nextone.niehs.nih.gov> writes:
>> Does anyone have numbers as to the relative cost of
>> particular GL calls? (for each machine in the 4D series). Maybe all
>> normalized as a percentage of gconfig (presumably the most
>> expensive).
>>
>> Howard Smith
>> smith at nextone.niehs.nih.gov
>>
>
>/*
> * this might be what you are looking for.
> * let me know if you make any interesting enhancements.
> * compile with:
> * cc -prototypes -acpp -O -s glbench.c -lm -lgl_s -lc_s -o glbench
> *
> * dan (dwilliams at atl.ge.com)
> *
> * GL benchmarking results sorted numerically for a 210GTX:
> *
> * swapbuffers : 61 calls per second
It's hard to derive a true cost for a GL routine when it involves
the hardware gfx pipeline. Because the bottleneck can be deep in the
pipe, with lots of FIFO-ing in between, pixie/prof results _can_ be
very misleading.
If you write a benchmark program (like glbench.c) and run the same
primitive over and over, then you _should_ get a reasonable idea
of the cost of a particular primitive (as long as you either call
finish() to flush the pipe before stopping the clock, or do enough
iterations that the depth of the pipe is insignificant).
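As a rough illustration of that loop, here is a minimal sketch of a
timing harness. It is not taken from glbench.c. The names bench_fn,
bench_avg_sec, stub_prim and stub_drain are made up for this example,
and clock_gettime() is assumed to be available (POSIX). In a real GL
benchmark the primitive callback would issue one GL call and the drain
callback would call finish(); the stubs here only count calls so the
sketch builds without GL.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime() */
#endif
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: one GL primitive, or a pipe flush. */
typedef void (*bench_fn)(void);

/* Stand-ins for real GL calls; they only count how often they run. */
long stub_prim_calls = 0;
long stub_drain_calls = 0;
void stub_prim(void)  { stub_prim_calls++; }
void stub_drain(void) { stub_drain_calls++; }

/* Monotonic wall-clock time in seconds. */
double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Issue prim n times, drain the pipe once, and return the average
 * wall-clock seconds per call. Returns -1.0 on bad arguments. */
double bench_avg_sec(bench_fn prim, bench_fn drain, long n)
{
    double t0, t1;
    long i;

    if (prim == NULL || n <= 0)
        return -1.0;

    t0 = now_sec();
    for (i = 0; i < n; i++)
        prim();
    if (drain != NULL)
        drain();          /* stop the clock only after the pipe is empty */
    t1 = now_sec();

    return (t1 - t0) / (double)n;
}
```

Wall-clock time is used rather than CPU time because the host can sit
idle while the pipe is still busy. Calling something like
bench_avg_sec(stub_prim, stub_drain, 100000) gives seconds per call;
with real GL callbacks, n should be large compared with the depth of
the pipe.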
Unfortunately there are exceptions to the above. swapbuffers and gsync
wait for the next vertical retrace, so benchmarking them is difficult.
I do know they each make a system call, but the whole routine shouldn't
take more than 100 usecs itself (leaving you 16.56... msec to draw at a
60 Hz frame rate).
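The arithmetic behind that figure, as a small sketch. The function
name draw_budget_msec is made up for this example, and the 100 usecs
is only the upper bound quoted above, not a measured value.

```c
/* Time left to draw in one frame, in milliseconds, after subtracting
 * a fixed per-frame overhead given in microseconds. */
double draw_budget_msec(double refresh_hz, double overhead_usec)
{
    double frame_msec = 1000.0 / refresh_hz;   /* 16.67 msec at 60 Hz */
    return frame_msec - overhead_usec / 1000.0;
}
```

draw_budget_msec(60.0, 100.0) comes out at roughly 16.57 msec.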
Also, benchmarking mapcolor on some machines is difficult due to the
way mapcolor was microcoded. Here are some real numbers for mapcolor
performance.
    VGX: 31750 slots/sec
    GTX:  7400 slots/sec
    G:    2200 slots/sec
    PI:   4000 slots/sec
The problem with these is that their inverse is _not_ the cost of
the routine on most machines because when inserted in a stream of
unrelated cmds (that happen not to tickle the same bit of hardware)
the cost may drop down to a usec or less.
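To make that concrete, here is a sketch that inverts the rates quoted
above. The struct and function names are made up for this example.
What it computes is only the back-to-back time per slot (about 31.5
usec on the VGX, 250 usec on the PI), which should be read as a
worst-case figure for mapcolor in isolation, not as its cost inside a
mixed command stream.

```c
/* mapcolor rates quoted above, in slots per second. */
struct mapcolor_rate {
    const char *machine;
    double slots_per_sec;
};

const struct mapcolor_rate mapcolor_rates[] = {
    { "VGX", 31750.0 },
    { "GTX",  7400.0 },
    { "G",    2200.0 },
    { "PI",   4000.0 },
};

/* Microseconds per slot when mapcolor is issued back to back.
 * This is the inverse of the measured rate and nothing more. */
double isolated_usec_per_slot(double slots_per_sec)
{
    return 1.0e6 / slots_per_sec;
}
```

Compare those figures with the usec or less that the same call may
cost when it is interleaved with unrelated commands.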
On a dumb frame buffer most of this would be very easy because there
is only one processor, but on the VGX there can be 11, some in parallel,
some in series.
--
----------------------------------------------------------------------------
Brian McClendon bam at rudedog.SGI.COM ...!uunet!sgi!rudedog!bam 415-335-1110
----------------------------------------------------------------------------