sscanf absurdity
Michael Condict
mnc at m10ux.UUCP
Wed Jul 13 03:41:31 AEST 1988
I'm not sure whether this belongs here or in comp.unix.wizards, but it is
probably a problem with most implementations of the C stdio library, so here
it is:
Many of you are probably aware of the bad reputation that sscanf has for
execution time, which seems odd since it is doing no I/O, right? Wrong! The
AT&T Sys V Rel 2 implementation of sscanf (and presumably earlier versions)
DOES do I/O, or at least it tries to. Look at sscanf in scanf.c and at
_filbuf in filbuf.c. Note that sscanf fakes up a FILE structure for the
purpose of allowing getc to be called on the string. It sets the _IOREAD
flag in the FILE structure to indicate that the string is read-only and
it sets the fd number to _NFILE, to indicate an illegal fd, i.e., that no
I/O should be done. Well, eventually, if getc runs off the end of the buffer
while trying to satisfy a scanf format item, as happens during:
    sscanf("1234", "%d", &i);
then _filbuf will be called to refill the buffer. It will not notice that
the _file field of the FILE struct is set to _NFILE and will actually call
read on the illegal fd, causing an error return, not to mention
hundreds or thousands of wasted instructions. This can easily add 20%
to the CPU time of your process if you are using sscanf repeatedly.
The fix is simple -- insert the following before the test of the _IOREAD
flag in _filbuf:
    if (iop->_file >= _NFILE) return(EOF);
I've just checked the BSD implementation and it doesn't have this problem,
so BSD Vaxen and Suns are probably okay. Amdahl UTS (System V Rel 1)
definitely does have the problem.
--
Michael Condict {ihnp4|vax135|cuae2}!m10ux!mnc
AT&T Bell Labs (201)582-5911 MH 3B-416
Murray Hill, NJ