profil questions
aglew at ccvaxa.UUCP
Sat Feb 13 11:00:00 AEST 1988
I've been trying to go beyond the obvious uses of profil(2), and I have
some questions and musings:
(1) profil(buff,...)
char *buff;
On the systems I've looked at, buff is treated as an array of shorts.
Shouldn't UNIX be honest and say short *buff?
Most systems I know of have 16-bit shorts. Now, it occurs to me that I
might like to profile some really long-running programs - ones that run
for more than, say, 3 days - easily long enough to overflow 16-bit
profiling bins. As a start, shouldn't we say
typedef short *profilbinT?
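To make the complaint concrete, here is what a call looks like today.
This is just a sketch - the traditional profil(buff, bufsiz, offset,
scale) interface is assumed, and profilbinT is my suggestion, not
anything that exists in the headers:

    extern int profil();            /* char *buff; int bufsiz, offset, scale */

    typedef short *profilbinT;      /* what buff really is, today */

    int main()
    {
        static short bins[16384];   /* 16-bit counters */

        /* offset 0: text assumed to start at address 0 */
        profil((char *)bins, sizeof bins, 0, 0x10000);
        /* ... the code being profiled ... */
        profil((char *)0, 0, 0, 0); /* scale 0 turns profiling off */
        return 0;
    }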
(2) Exactly what is the correspondence between profil bins, code locations,
and actual instructions, particularly when scaling?
The man page says something about scale=0x10000 implying a one-to-one
correspondence between words of code and words (I assume counting bins)
in the buffer.
Now, on some systems an instruction can begin at an arbitrary byte
location. Does this mean that I should use a scale of 0x20000 to make
sure that I get a counting bin for the beginning of every possible
instruction on such a machine (e.g. a VAX)?
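My reading of the fixed-point format - an assumption on my part, not
something the man page actually states - is that scale/0x10000 is
"bins per 16-bit word of text", so:

    scale 0x10000 -> 1 bin per word:  bufsize == textsize
    scale 0x20000 -> 2 bins per word, i.e. 1 bin per byte:
                                      bufsize == 2 * textsize

If that is right, byte resolution on a VAX costs you a buffer twice the
size of the text.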
The man page says that scale = 0x8000 maps each pair of words of code
to a word in the buffer. Again, I assume that these are 16 bit words,
and words in the buffer refer to short counting bins.
I have observed the following correspondence of byte offsets from
the base code location to bin numbers, using scale 0x8000 ("2 to 1"):
W 0 - Byte 0 - Bin 0
      Byte 1 - Bin 0
W 1 - Byte 2 - Bin 1
      Byte 3 - Bin 1
W 2 - Byte 4 - Bin 1
      Byte 5 - Bin 1
W 3 - Byte 6 - Bin 2
What is the rationale here? It makes sense if instructions are 16 or 32
bits, the sampled PC points to the *next* instruction, and the machine
requires 32-bit instructions to begin on a 32-bit boundary: in a sequence
I16.I16.I32, the two adjacent 16-bit instructions then get counted in the
same bin. But it doesn't seem to make sense on a machine like the VAX.
Now, the same code is present on the 3B2 - does it go all the way back to
the PDP-11?
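For what it's worth, the following guess at the arithmetic - scale the
*word* offset of the PC as a 16.16 fixed-point multiply, and round to
the nearest bin rather than truncating - reproduces the table exactly.
This is reverse-engineered from the observations, not read out of any
kernel source:

    unsigned
    pc_to_bin(unsigned pc, unsigned base, unsigned scale)
    {
        unsigned words = (pc - base) >> 1;     /* PC as 16-bit words */
        return (words * scale + 0x8000) >> 16; /* scaled, rounded */
    }

With scale = 0x8000 this yields bins 0,0,1,1,1,1,2 for byte offsets 0
through 6; the rounding is what makes word 0 stand alone while words 1
and 2 share a bin.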
(3) How do the System V scale factors relate to the BSD scale factors?
i.e. SV 0177777 <-> 0x10000. Just add 1?
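My guess - and it is only a guess - is that both values mean "one word
of text per word of buffer", System V writing it as the largest
representable 16-bit fraction and BSD as an exact 1.0 in 16.16 fixed
point, so the difference really is just the 1:

    #define SV_FULL_SCALE   0177777     /* 0xffff, i.e. ~0.99998 */
    #define BSD_FULL_SCALE  0x10000     /* 65536,  i.e. exactly 1.0 */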
(4) What's all this scale garbage anyway? I'm sure it was a lot cheaper
on a small machine, but 16 bits just isn't capable of expressing some of
the fractions that might be appropriate on a large machine.
Say I have a program with 256M of text that I want to divide into 4
counting bins - can I do that with profil? Maybe I can't afford to give
it 64K of counters.
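Working out the arithmetic, assuming the fraction is scale/65536 ==
bufbytes/textbytes:

    4 bins * 2 bytes over 256M of text:
        scale = 2^16 * 2^3 / 2^28 = 2^-9
    which truncates to 0 - not expressible in 16 bits at all;

    64K bins (128K of counters) over the same text:
        scale = 2^16 * 2^17 / 2^28 = 32
    which fits, but only if I can afford the 128K of counters.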
Maybe the scale argument should be made into a floating point number.
But single precision floating point may only give you 6-8 decimal digits
of accuracy, not enough to scale properly on *really* large programs.
Maybe the scale argument should be a shift factor, specifying
the power of two to divide by?
Or maybe there should be no scale argument at all, just a
(CodeBottom,CodeTop) pair, letting the system decide what an appropriate
representation is. After all, you are guaranteed that the addresses are
representable, in as portable a form as a C pointer provides.
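Something like this sketch - profil2 is a made-up name, not a real
system call - where the caller hands over a buffer and a range and the
system picks a power-of-two granularity:

    /* hypothetical replacement for profil(2) */
    int profil2(short *bins, unsigned nbins,
                char *code_bottom, char *code_top);

    /*
     * Inside, the system might just pick the smallest shift s with
     *     ((code_top - code_bottom - 1) >> s) < nbins
     * and then count with
     *     bin = (pc - code_bottom) >> s;
     */

No 16-bit fraction anywhere, and it degrades gracefully: at worst some
of the bins you asked for go unused.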
Andy "Krazy" Glew. Gould CSD-Urbana. 1101 E. University, Urbana, IL 61801
aglew at gould.com - preferred, if you have nameserver
aglew at gswd-vms.gould.com - if you don't
aglew at gswd-vms.arpa - if you use DoD hosttable
aglew%mycroft at gswd-vms.arpa - domains are supposed to make things easier?
My opinions are my own, and are not the opinions of my employer, or any
other organisation. I indicate my company only so that the reader may
account for any possible bias I may have towards our products.