Changes to Answers to Frequently Asked Questions (FAQ) on comp.lang.c
Steve Summit
scs at adam.mit.edu
Wed May 1 14:01:14 AEST 1991
It's the comp.lang.c FAQ list's first birthday, and practically
all it got this month was deletions, sad to say. (Worry not; the
seven elided questions were of secondary importance, and the
other deleted text was mostly redundant or tangential. Through
the magic of RCS and *roff "conditional compilation," all of the
deleted text is preserved, and will eventually reappear in a
"long form" list, to be available on request.)
The deletions pull the list back under 100K in size, to avoid
difficulties with some mailers.
Here is the usual set of diffs (edited for readability, and
unsuitable for the patch program) between the previous posting
(April 2) and the new one. (Do _not_ worry if you have not seen
the new one yet; it's coming up next.)
< [Last modified April 2, 1991 by scs.]
---
> [Last modified April 29, 1991 by scs.]
==========
< This article does not, and cannot, provide an exhaustive discussion of
< every subtle point and counterargument which could be mentioned with
< respect to these topics. Cross-references to standard C publications
< have been provided, for further study by the interested and dedicated
< reader.
==========
< some of the myths which this article attempts to refute. Several
< noteworthy books on C are listed in this article's bibliography.
---
> noteworthy books on C are listed in this article's bibliography. Many
> of the questions and answers are cross-referenced to these books, for
> further study by the interested and dedicated reader.
==========
When function prototypes are in scope, argument passing becomes an
< "assignment context," and casts may safely be omitted, since the
< prototype tells the compiler that a pointer is required, and of
---
> "assignment context," and most casts may safely be omitted, since
> the prototype tells the compiler that a pointer is required, and of
==========
< 3. But aren't pointers the same as ints?
<
< A: Not since the early days. Attempting to turn pointers into
< integers, or to build pointers out of integers, has always been
< machine-dependent and unportable, and doing so is strongly
< discouraged. (Any object pointer may be cast to the "universal"
< pointer type void *, or char * under a pre-ANSI compiler, when
< heterogeneous pointers must be passed around.) It is no longer
< guaranteed that a pointer can be cast to a "suitably capacious"
< integer and back, unchanged.
==========
Nevertheless, ANSI C allows the alternate
#define NULL (void *)0
definition for NULL. Besides helping incorrect programs to work
< (but only on machines with all pointers the same, thus questionably
---
> (but only on machines with homogeneous pointers, thus questionably
valid assistance) this definition may catch programs which use NULL
==========
< A: No. Although preprocessor macros are often used in place of
---
> A: No. Although symbolic constants are often used in place of numbers
because the numbers might change, this is _not_ the reason that
NULL is used in place of 0. Once again, the language guarantees
==========
< 11. I once used a compiler that wouldn't work unless NULL was used.
<
< A: Unless the code being compiled was nonportable (see question 6),
< that compiler was probably broken. In general, making decisions
< about a language based on the behavior of one particular compiler
< is likely to be counterproductive.
==========
most machines, as zero invites unwarranted assumptions. The use of
a preprocessor macro (NULL) suggests that the value might change
< later, or on some weird machine. The construct "if(p == 0)" is
< easily misread as calling for conversion of p to an integral type,
< rather than 0 to a pointer type, before the comparison. Finally,
< the distinction between the several uses of the term "null" (listed
< above) is often overlooked.
---
> later, or on some weird machine. Finally, the distinction between
> the several uses of the term "null" (listed above) is often
> overlooked.
==========
A: "Certain Prime computers use a value different from all-
bits-0 to encode the null pointer. Also, some large
Honeywell-Bull machines use the bit pattern 06000 to encode
< the null pointer. On such machines, the assignment of 0 to
< a pointer yields the special bit pattern that designates the
< null pointer."
---
> the null pointer."
-- Portable C, by H. Rabinowitz and Chaim Schaap,
Prentice-Hall, 1990, page 147.
==========
19. Then why are array and pointer declarations interchangeable as
function formal parameters?
A: Since arrays decay immediately into pointers, an array is never
< actually passed to a function. Allowing pointer parameters to be
< declared as arrays is simply a way of making it look as though
< the array was being passed. Some programmers prefer, as a matter
< of style, to use this syntax to indicate that the pointer parameter
< is expected to point to the start of an array rather than to some
< single value.
<
< Since functions can never receive arrays as parameters, any
< parameter declarations which "look like" arrays, e.g.
---
A: Since arrays decay immediately into pointers, an array is never
> actually passed to a function. Therefore, any parameter
> declarations which "look like" arrays, e.g.
==========
< To repeat, however, this conversion holds only within function
< formal parameter declarations, nowhere else. If this conversion
< bothers you, don't use it; many people have concluded that the
< confusion it causes outweighs the small advantage of having the
< declaration "look like" the call and/or the uses within the
< function.
---
> This conversion holds only within function formal parameter
> declarations, nowhere else. If this conversion bothers you, avoid
> it; many people have concluded that the confusion it causes
> outweighs the small advantage of having the declaration "look like"
> the call and/or the uses within the function.
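
A minimal sketch of the equivalence described in this hunk. The
names first_a, first_b and fp are hypothetical, chosen only for
illustration; they do not appear in the FAQ list.

```c
/* Hypothetical names, for illustration only. */
char first_a(char a[]) { return a[0]; }   /* "looks like" an array */
char first_b(char *a)  { return a[0]; }   /* what the compiler sees */

/* Both functions have the same type, so either one may be
   assigned to the same function pointer without a cast. */
char (*fp)(char *) = first_a;
```

Outside of a formal parameter list no such rewriting takes place:
"char a[10];" and "char *a;" declare different things.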
==========
multidimensional arrays, if at all. (See question 22 above.) When
people speak casually of a pointer to an array, they usually mean a
< pointer to its first element; the type of this latter pointer is
< generally more useful.
==========
(See also question 63.) If the size of the array is unknown, N can
be omitted, but the resulting type, "pointer to array of unknown
< size," is almost completely useless.
---
> size," is useless.
==========
< (In "real" code, of course, each return value from malloc should be
< checked.)
---
> (In "real" code, of course, malloc should be declared correctly,
> and each return value checked.)
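
As a hedged illustration of what "declared correctly" and "checked"
might look like in practice: the helper name copy_string below is
hypothetical, and the sketch assumes an ANSI compiler with
<stdlib.h> available.

```c
#include <stdlib.h>   /* declares malloc and free correctly */
#include <string.h>

/* Hypothetical helper: returns a malloc'ed copy of s, or NULL
   if the allocation fails.  The caller must free the result. */
char *copy_string(const char *s)
{
    char *p = malloc(strlen(s) + 1);   /* +1 for the trailing '\0' */

    if (p == NULL)                     /* check the return value */
        return NULL;
    strcpy(p, s);
    return p;
}
```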
==========
< The order of other embedded side effects is similarly undefined.
< For example, the expression i + (i = 2) does not necessarily yield
< 4.
==========
The behavior of code which contains ambiguous or undefined side
< effects has always been undefined. (Note, too, that a compiler's
---
> effects (including ambiguous embedded assignments) has always been
> undefined. (Note, too, that a compiler's choice, especially under
==========
< A: Two programs, protoize and unprotoize, are being written to convert
< back and forth between prototyped and "old style" function
< definitions and declarations. (These programs are _not_ expected
< to handle full-blown conversion between "Classic" C and ANSI C.)
< When available, these programs will exist as patches to the FSF GNU
< C compiler, gcc.
---
> A: Two programs, protoize and unprotoize, convert back and forth
> between prototyped and "old style" function definitions and
> declarations. (These programs do _not_ handle full-blown
> translation between "Classic" C and ANSI C.) These programs exist
> as patches to the FSF GNU C compiler, gcc. Look for the file
> protoize-1.39.0 in pub/gnu at prep.ai.mit.edu (18.71.0.38), or at
> several other FSF archive sites.
==========
The noalias keyword was not backed up by any "prior art," and it
was introduced late in the review and approval process. It was
phenomenally difficult to define precisely and explain coherently,
< and sparked widespread, acrimonious debate, including a scathing
< pan by Dennis Ritchie. It had far-ranging implications,
---
> and sparked widespread, acrimonious debate. It had far-ranging
==========
< implications, must be understood.) The need for a mechanism to
< support parallel implementation of non-overlapping operations
< remains unfilled (although the C Numerical Extensions Working Group
< is examining the problem).
---
> implications, must be understood.) The need for an explicit
> mechanism to support parallel implementation of non-overlapping
> operations remains unfilled (although the C Numerical Extensions
> Working Group is examining the problem).
==========
36. How can I write a generic macro to swap two values?
A: There is no good answer to this question. If the values are
integers, a well-known trick using exclusive-OR could perhaps be
used, but it will not work for floating-point values or pointers,
< (and the "obvious" supercompressed implementation for integral
< types a^=b^=a^=b is, strictly speaking, illegal due to multiple
< side-effects; and it will not work if the two values are the same
< variable, and...). If the macro is intended to be used on values
< of arbitrary type (the usual goal), it cannot use a temporary,
< since it does not know what type of temporary it needs, and
< standard C does not provide a typeof operator. (GNU C does.)
---
> (and it will not work if the two values are the same variable, and
> the "obvious" supercompressed implementation for integral types
> a^=b^=a^=b is, strictly speaking, illegal due to multiple side-
> effects, and...). If the macro is intended to be used on values of
> arbitrary type (the usual goal), it cannot use a temporary, since
> it does not know what type of temporary it needs, and standard C
> does not provide a typeof operator.
==========
The best all-around solution is probably to forget about using a
< macro. If you're worried about the use of an ugly temporary, and
< know that your machine provides an exchange instruction, convince
< your compiler vendor to recognize the standard three-assignment
< swap idiom in the optimization phase.
---
> macro, unless you don't mind passing in the type as a third
> argument.
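
One possible shape for a swap macro that takes the type as a third
argument is sketched below. The names SWAP, swap_tmp and
swap_doubles are hypothetical; the macro assumes that both operands
are modifiable lvalues of the named type, and it evaluates each of
them more than once.

```c
/* Hypothetical macro: the caller supplies the type, so the macro
   can declare a temporary of the right kind. */
#define SWAP(a, b, type) \
    do { type swap_tmp = (a); (a) = (b); (b) = swap_tmp; } while (0)

/* Example use on doubles, which the exclusive-OR trick cannot
   handle. */
void swap_doubles(double *x, double *y)
{
    SWAP(*x, *y, double);
}
```

Because it uses an ordinary temporary, this form also behaves
sensibly when both arguments name the same variable.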
==========
If all of the statements in the intended macro are simple
< expressions, with no declarations, conditionals, or loops, another
< technique is to write a single, parenthesized expression using one
< or more comma operators. (This technique also allows a value to be
< "returned.")
---
> expressions, with no declarations or loops, another technique is to
> write a single, parenthesized expression using one or more comma
> operators. (This technique also allows a value to be "returned.")
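
A small sketch of the comma-operator technique. The macro name
NEXT_ID is hypothetical. The comma operator evaluates its operands
left to right, and the value of the whole expression is the value
of the last operand, which is how the macro "returns" a result.

```c
/* Hypothetical macro: advance a counter, record the new value in
   last, and yield that value, all in one expression. */
#define NEXT_ID(counter, last) ((counter)++, (last) = (counter))
```

Since the expansion is a single expression, it can be used
anywhere an expression can, for instance as the condition of an if
statement or as the right-hand side of an assignment.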
==========
< (If you know enough about your machine's architecture, it is
< possible to pick arguments off of the stack "by hand," but there is
< little reason to do so, since portable mechanisms exist. If you
< know how to access arguments "by hand," but have access to neither
< <stdarg.h> nor <varargs.h>, you could as easily implement
< <stdarg.h> yourself, leaving your code portable.)
==========
< 44. How can I write a function analogous to scanf?
<
< A: Unfortunately, vscanf and the like are not standard. You're on
< your own.
==========
A: This information is not available to a portable program. Some
< systems have a nonstandard nargs() function available, but its use
< is questionable, since it typically returns the number of words
---
> systems provide a nonstandard nargs() function, but its use is
> questionable, since it typically returns the number of words
==========
< 46. How can I write a function which takes a variable number of
< arguments and passes them to some other function (which takes a
< variable number of arguments)?
<
< A: In general, you cannot. You must provide a version of that other
< function which accepts a va_list pointer, as does vfprintf in the
< example above. If the arguments must be passed directly as actual
< arguments (not indirectly through a va_list pointer) to another
< function which is itself variadic (for which you do not have the
< option of creating an alternate, va_list-accepting version) no
< portable solution is possible. (The problem can be solved by
< resorting to machine-specific assembly language.)
==========
< That the "other half," better error detection, was deferred to
< lint, was a fairly deliberate decision on the part of the earliest
< Unix C compiler authors, but is inexcusable (in the absence of a
< supplied, consistent lint, or equivalent error checking) in a
< modern compiler.
==========
< 50. Don't ANSI function prototypes render lint obsolete?
<
< A: Not really. First of all, prototypes work well only if the
< programmer works assiduously to maintain them, and the effort to do
< so (plus the extra recompilations required by numerous, more-
< frequently-modified header files) can rival the toil of keeping
< function calls correct manually. Secondly, an independent program
< like lint will probably always be more scrupulous at enforcing
< compatible, portable coding practices than will any particular,
< implementation-specific, feature- and extension-laden compiler.
< (Some vendors seem to introduce incompatible extensions
< deliberately, perhaps to lock in market share.)
==========
A: The pointer variable "answer," which is handed to the gets function
as the location into which the response should be stored, has not
< been set to point to any valid storage. It is an uninitialized
< variable, just as is the variable i in
<
< int i;
< printf("i = %d\n", i);
<
< That is, we cannot say where the pointer "answer" points. (Since
---
> been set to point to any valid storage. That is, we cannot say
> where the pointer "answer" points. (Since local variables are not
==========
< overly-long line. (Unfortunately, fgets does not automatically
< delete the trailing \n, as gets would.) It would also be possible
---
> overly-long line. (Unfortunately for this example, fgets does not
> automatically delete the trailing \n, as gets would.) It would
==========
< Since strcat returns the value of its first argument, the s3
< variable is superfluous.
---
> Since strcat returns the value of its first argument (s1, in this
> case), the s3 variable is superfluous.
==========
< invocation used by the caller. In particular, many routines accept
< pointers (e.g. to structs or strings), and the caller usually
< passes the address of some object (a struct, or an array -- see
---
> invocation used by the caller. In particular, many routines which
> accept pointers (e.g. to structs or strings) are usually called
> with the address of some object (a struct, or an array -- see
==========
A: No. Some early man pages for malloc stated that the contents of
< freed memory were "left undisturbed;" this ill-advised guarantee is
< not universal and is not required by ANSI.
---
> freed memory were "left undisturbed;" this ill-advised guarantee was
> never universal and is not required by ANSI.
==========
A: alloca allocates memory which is automatically freed when the
< function from which alloca was called returns. That is, memory
---
> function which called alloca returns. That is, memory allocated
==========
alignment of later fields correct). A field-by-field comparison
would require unacceptable amounts of repetitive, in-line code for
< large structures. Either method would not necessarily "do the
< right thing" with pointer fields: oftentimes, equality should be
< judged by equality of the things pointed to rather than strict
< equality of the pointers themselves.
---
> large structures.
==========
< function to do so. C++ (among other languages) would let you
< arrange for the == operator to map to your function.
---
> function to do so. C++ would let you arrange for the == operator
> to map to your function.
==========
structures), use short. Otherwise, use int. If well-defined
overflow characteristics are important and/or negative values are
< not, use unsigned. (But beware mixtures of signed and unsigned.)
---
> not, use the corresponding unsigned types. (But beware mixtures of
> signed and unsigned.)
==========
< Similar arguments operate when deciding between float and double.
---
> Similar arguments apply when deciding between float and double.
==========
< 67. I can't seem to define a linked list node which contains a pointer
< to itself. I tried
---
> 62. I can't seem to define a linked list successfully. I tried
==========
A: Structs in C can certainly contain pointers to themselves; the
discussion and example in section 6.5 of K&R make this clear. The
< problem is that the example above attempts to hide the struct
< pointer behind a typedef, which is not complete at the time it is
< used. First, rewrite it without a typedef:
<
< struct node
< {
< char *item;
< struct node *next;
< };
<
< Then, if you wish to use typedefs, define them after the fact:
<
< typedef struct node NODE, *NODEPTR;
<
< Alternatively, define the typedefs first (using the line just
< above) and follow it with the full definition of struct node, which
< can then use the NODEPTR typedef for the "next" field.
---
> problem with this example is that the NODEPTR typedef is not
> complete when the "next" field is declared. You will have to give
> the structure a tag ("struct node"), and declare the "next" field
> as "struct node *next;".
>
> A similar problem, with a similar solution, can arise when
> attempting to declare a pair of typedef'ed mutually recursive
> structures.
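
A minimal sketch of the tagged form, with the "next" field declared
as a pointer to struct node. The NODE and NODEPTR typedef names
follow the deleted text above; the helper list_length is
hypothetical.

```c
#include <stddef.h>

/* The structure has a tag, so "struct node" can be named inside
   its own definition, before the typedef names are complete. */
typedef struct node
{
    char *item;
    struct node *next;
} NODE, *NODEPTR;

/* Hypothetical helper: count the nodes in a list. */
int list_length(NODEPTR list)
{
    int n = 0;

    for (; list != NULL; list = list->next)
        n++;
    return n;
}
```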
==========
< 68. How can I define a pair of mutually referential structures? I
< tried
<
< typedef struct
< {
< int structafield;
< STRUCTB *bpointer;
< } STRUCTA;
<
< typedef struct
< {
< int structbfield;
< STRUCTA *apointer;
< } STRUCTB;
<
< but the compiler doesn't know about STRUCTB when it is used in
< struct a.
<
< A: Again, the problem lies not in the pointers but the typedefs.
< First, define the two structures without using typedefs:
<
< struct a
< {
< int structafield;
< struct b *bpointer;
< };
<
< struct b
< {
< int structbfield;
< struct a *apointer;
< };
<
< The compiler can accept the field declaration struct b *bpointer
< within struct a, even though it has not yet heard of struct b.
< Occasionally it is necessary to precede this couplet with the empty
< declaration
<
< struct b;
<
< to mask the declarations (if in an inner scope) from a different
< struct b in an outer scope.
<
< Again, the typedefs could also be defined before, and then used
< within, the definitions for struct a and struct b. Problems arise
< only when an attempt is made to define and use a typedef within the
< same declaration.
==========
< 2. Build it up in stages, using typedefs:
---
> 2. Build the declaration up in stages, using typedefs:
==========
A: Several public-domain versions are available. One is in volume 14
< of comp.sources.unix . (Commercial versions may also be available,
< at least one of which was shamelessly lifted from the public domain
< copy submitted by Graham Ross, one of cdecl's originators.) See
< question 96.
---
> of comp.sources.unix . (See question 89.)
==========
< cannot be standardized by the C language. If you are using curses,
< use its cbreak() function. Under UNIX, use ioctl to play with the
---
> cannot be standardized by the C language. Some versions of curses
> have a cbreak() function which does what you want. Under UNIX, use
==========
Operating system specific questions are not appropriate for
comp.lang.c . Many common questions are answered in frequently-
asked questions postings in such groups as comp.unix.questions and
< comp.os.msdos.programmer . Note that the answers are often not
< unique even across different variants of Unix. Bear in mind when
---
> comp.sys.ibm.pc.misc . Note that the answers are often not unique
> even across different variants of a system. Bear in mind when
==========
A: In general, it cannot. Different operating systems implement
< name/value functionality similar to the Unix environment in many
---
> name/value functionality similar to the Unix environment in
==========
different ways. Whether the "environment" can be usefully altered
< by a running program, and if so, how, is entirely system-dependent.
---
> by a running program, and if so, how, is system-dependent.
==========
provide setenv() and/or putenv() functions to do this), and the
modified environment is usually passed on to any child processes,
< but it is _not_ propagated back to the parent process. (The
< environment of the parent process can only be altered if the parent
< is explicitly set up to listen for some kind of change requests.
< The conventional execution of the BSD "tset" program in .profile
< and .login files effects such a scheme.)
---
> but it is _not_ propagated back to the parent process.
==========
stdout is not a terminal. Although the output operation goes on to
< complete successfully, errno still contains ENOTTY. This behavior
< can be mildly confusing, but it is not strictly incorrect, because
< it is only meaningful for a program to inspect the contents of
< errno after an error has occurred (that is, after a library
< function that sets errno on error has returned an error code).
---
> complete successfully, errno still contains ENOTTY.
==========
A: scanf() was designed for free-format input, which is seldom what
you want when reading from the keyboard. In particular, "\n" in a
format string does not mean "expect a newline", it means "discard
< all whitespace". But the only way to discard all whitespace is to
< continue reading the stream until a non-whitespace character is
< seen (which is then left in the buffer for the next input), so the
< effect is that it keeps going until it sees a nonblank line.
---
> all whitespace".
>
> It is usually better to use fgets() to read a whole line, and then use
> sscanf() or other string functions to parse the line buffer.
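
The read-a-line-then-parse approach can be sketched as follows.
The helper names parse_int and read_int_line, and the 128-byte
buffer size, are assumptions made for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: parse one int from a line already held in
   a buffer.  Returns 1 on success, 0 otherwise. */
int parse_int(const char *line, int *result)
{
    return sscanf(line, "%d", result) == 1;
}

/* Hypothetical helper: read a whole line with fgets, then parse
   it.  Returns 1 on success, 0 at end-of-file or on bad input.
   The buffer size is an arbitrary choice for this sketch. */
int read_int_line(FILE *fp, int *result)
{
    char line[128];

    if (fgets(line, sizeof line, fp) == NULL)
        return 0;
    return parse_int(line, result);
}
```

A bad line is consumed whole by fgets, so the next call starts
cleanly on the following line instead of tripping over leftover
input.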
==========
< 84. So what should I use instead?
<
< A: You could use a "%c" format, which will read one character that you
< can then manually compare against a newline; or "%*c" and no
< variable if you're willing to trust the user to hit a newline; or
< "%*[^\n]%*c" to discard everything up to and including the newline.
< Usually the best solution is to use fgets() to read a whole line,
< and then use sscanf() or other string functions to parse the line
< buffer.
==========
A: This problem is, in general, insoluble. Under Unix, for instance,
a scan of the entire disk (perhaps requiring special permissions)
< would be required, and would fail if the file descriptor were a
< pipe (and could give a misleading answer for a file with multiple
---
> would theoretically be required, and would fail if the file
> descriptor was a pipe or referred to a deleted file (and could give
> a misleading answer for a file with multiple links). It is best to
==========
< A: Variables (and arrays) with "static" duration (that is, those
< declared outside of functions, and those declared with the storage
< class static), are guaranteed initialized to zero, as if the
---
> A: Variables with "static" duration (that is, those declared outside
> of functions, and those declared with the storage class static),
> are guaranteed initialized to zero, as if the programmer had typed
==========
A: The best solution is to use text files (usually ASCII), written
with fprintf and read with fscanf or the like. (Similar advice
< also applies to network protocols.) Be very skeptical of arguments
---
> also applies to network protocols.) Be skeptical of arguments
==========
< If the binary format is being imposed on you by an existing
< program, first see if you can get that program changed to use a
< more portable format.
==========
> A PL/M to C converter was posted to alt.sources in April, 1991.
==========
< Lexeme Corporation
< Richard Cox
< 4 Station Square, #250
< Commerce Court
< Pittsburgh, PA 15219-1119 USA
< (+1) 412 281 5454
==========
< The comp.sources.unix archives also contain converters between
< "K&R" C and ANSI C.
---
> See also question 29.
==========
< 97. Where can I get the winners of old Obfuscated C Contests? When
< will the next contest be held?
<
< A: Send mail to {pacbell,uunet,utzoo}!hoptoad!obfuscate . The contest
< is usually announced in March, with entries due in May. Contest
< announcements are posted in several obvious places. The winning
< entries are archived on uunet (see question 96).
---
> 90. When will the next International Obfuscated C Contest (IOCCC) be
> held? How can I get a copy of the current and previous winning
> entries?
>
> A: The contest typically runs from early March through mid-May. To
> obtain a current copy of the rules, send email to:
>
> {pacbell,uunet,utzoo}!hoptoad!judges or judges at toad.com
>
> Contest winners are first announced at the Summer Usenix Conference
> in mid-June, and posted to the net in July. Previous winners are
> available on uunet (see question 89) under the directory
> ~/pub/ioccc.
==========
The character sequences /* and */ are not special within double-
quoted strings, and do not therefore introduce comments, because a
program (particularly one which is generating C code as output)
< might want to print them. It is hard to imagine why anyone would
< want or need to place a comment inside a quoted string. It is easy
< to imagine a program needing to print "/*".
---
> might want to print them.
==========
For the small fraction of code that is time-critical, it is vital
< to pick a good algorithm; it is much less important to
< "microoptimize" the coding details. Many of the "efficient coding
---
> to pick a good algorithm; it is less important to "microoptimize"
> the coding details. Many of the "efficient coding tricks" which
==========
< degraded. If the performance of your code is so important that you
< are willing to invest programming time in source-level
< optimizations, you would be better served by buying the best
< optimizing compiler you can afford (compilers can perform
< optimizations that are impossible at the source level).
==========
< It is not the intent here to suggest that efficiency can be
< completely ignored. Most of the time, however, by simply paying
< attention to good algorithm choices, implementing them clearly and
< obviously, and avoiding obviously inefficient blunders (i.e. shun
< O(n**3) implementations of O(n**2) algorithms), perfectly
< acceptable results can be achieved.
---
> For more discussion of efficiency tradeoffs, as well as good advice
> on how to increase efficiency when it is important, see chapter 7
> of Kernighan and Plauger's The Elements of Programming Style, and
> Jon Bentley's Writing Efficient Programs.
==========
Function calls, though obviously incrementally slower than in-line
code, contribute so much to modularity and code clarity that there
< is rarely good reason to avoid them. (Actually, by reducing bulk,
< functions can improve performance.)
---
> is rarely good reason to avoid them.
==========
Among other things, the associative and distributive laws do not
hold completely (i.e. order of operation may be important, repeated
< addition is not necessarily equivalent to multiplication, and
< underflow or cumulative precision loss is often a problem).
---
> addition is not necessarily equivalent to multiplication).
> Underflow or cumulative precision loss is often a problem.
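
The failure of the associative law can be seen with a short sketch.
The function names are hypothetical, and the results quoted after
the code assume IEEE 754 double-precision arithmetic.

```c
/* Two groupings of the same three-term sum. */
double sum_left(double a, double b, double c)  { return (a + b) + c; }
double sum_right(double a, double b, double c) { return a + (b + c); }
```

Called with (1e20, -1e20, 1.0), sum_left yields 1.0, while
sum_right yields 0.0, because the 1.0 is lost when it is added to
-1e20 first.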
==========
< programming text should cover the basics. (Beware, though, that
< subtle problems can occupy numerical analysts for years.) Do make
< sure that you have #included <math.h>, and correctly declared other
---
> programming text should cover the basics. Do make sure that you
> have #included <math.h>, and correctly declared other functions
==========
including code to handle %e, %f, and %g. It happens that Turbo C's
heuristics for determining whether the program uses floating point
are occasionally insufficient, and the programmer must sometimes
< insert one dummy explicit floating-point operation to force loading
< of floating-point support.
---
> insert a dummy explicit floating-point call to force loading of
> floating-point support.
==========
arrays. Many systems have fixed-size stacks, and those which
< perform dynamic stack allocation automatically (e.g. Unix) are
< often confused when the stack tries to grow by a huge chunk all at
---
> perform dynamic stack allocation automatically (e.g. Unix) can be
> confused when the stack tries to grow by a huge chunk all at once.
>
> (See also question 56.)
==========
frequently come up in. You can find lots of information in the
net.announce.newusers frequently-asked questions postings, the
"jargon file" (also published as _The Hacker's Dictionary_), and
< the official Usenet ASCII pronunciation list, maintained by Maarten
< Litmaath. (The pronunciation list also appears in the jargon file
< under ASCII, as well as in the comp.unix frequently-asked questions
< list.)
---
> the Usenet ASCII pronunciation list.
==========
< available for anonymous ftp, or via a mailserver. (Note that the
< size of the list is monotonically increasing; older copies are
< obsolete and don't contain much, except the occasional typo, that
< the current list doesn't.)
---
> available for anonymous ftp, or via a mailserver.
>
> This list is an evolving document, not just a collection of this
> month's interesting questions. Older copies are obsolete and don't
> contain much, except the occasional typo, that the current list
> doesn't.
==========
> Jon Louis Bentley, Writing Efficient Programs, Prentice-Hall,
> 1982, ISBN 0-13-970244-X.
Steve Summit
scs at adam.mit.edu