ANSI C -- miscellaneous suggestions
Doug Gwyn
gwyn at brl-smoke.ARPA
Tue Dec 16 05:40:02 AEST 1986
In article <112 at decvax.UUCP> minow at decvax.UUCP (Martin Minow) writes:
>Page 1, line 14. The standard should specify the total list of words
>reserved to the compiler and its libraries.
While this would be "nice", one can pretty much find this out
from the index, and the standard isn't intended to be either a
tutorial or a user reference manual. I would hope that vendors
and textbook authors will consider providing such a list.
>Page 6, line 40ff. It is unclear whether the main() function may be
>declared or invoked with more than 2 parameters.
I thought this was clear: main() can be defined with either 0 or 2
parameters. Other schemes are not defined, which allows extensions
such as UNIX's envp but does not mandate them for all implementations.
(Note that envp is not normally necessary, given getenv().)
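A minimal sketch of the getenv() point, assuming a hosted implementation. The helper name env_or_default is hypothetical and does not come from the draft. It shows how a program whose main() takes 0 or 2 parameters can still consult the environment without envp:

```c
#include <stdlib.h>

/* Hypothetical helper: look up an environment variable through getenv()
 * rather than through a third envp parameter to main().  Returns
 * fallback when the variable is not set. */
const char *env_or_default(const char *name, const char *fallback)
{
    const char *value = getenv(name);

    return value != NULL ? value : fallback;
}
```

This can be called from a main() defined in either of the two forms mentioned above; which variables exist is up to the environment.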
>Page 7, line 12. Must the string in argv[0] be modifiable?
That's what the draft says. Is this a problem?
>Page 10, 27. The horizontal tab, vertical tab, and form feed characters
>are not needed by the language. The standard should declare that
>horizontal tab is identical to space except in character and string
>literals, and vertical tab and form feed are everywhere identical to newline.
There are several flavors of whitespace in C (including the
preprocessor). Some generalization was done where possible;
did we miss any?
>Page 11, lines 29ff. The standard should specify the internal representations
>for the predefined escape sequences for implementations that use the
>USASCII or Latin 1 alphabets,
So long as we don't mandate ASCII/ISO character sets, this is
infeasible.
>Page 12, line 29. The minimum significance for external identifiers
>should be changed to ``6 significant monocase initial characters in
>an external identifier.''
Section 3.1.2 permits implementations to ignore case distinctions.
2.2.4.1 is merely to establish that at least 6 significant characters
can be used in external identifiers simultaneously with meeting other
implementation limit requirements, and nothing is gained by mentioning
case-mapping in this context.
>Page 14, line 20ff. FLT should be FLOAT. DBL should be DOUBLE. etc. As
>the first 31 characters of macro definitions are significant, there is no
>need to sacrifice legibility (and maintainability) for conciseness.
That would be nice, but we also have SHRT_MAX, for example, which
is defined in a header that is shared between two standards bodies
and is therefore difficult to redefine. (It's also possible that these
names were chosen to agree with the new Fortran standard; I forget.)
>Page 26, line 13. The exceptions (the characters that may not appear
>in string literals) should include the vertical tab character
>and the form feed character, as these are equivalent to newlines.
Where are these characters declared to be "equivalent to newlines"?
>Page 74, line 28. Horizontal tab does not have an independent existence
>during preprocessing. The example should note that comments may precede
>or follow the # that introduces a preprocessing directive.
Section 2.1.1.2 (Translation phases) states that an implementation
MAY retain distinct white-space characters at the point of
preprocessing. However, comments must have been turned into
single space characters at that point.
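To illustrate the point about comments, here is a small sketch. The macro name ANSWER and the function answer() are made up for the example. Because each comment has already become a single space by the time directives are recognized, a directive can have comments before and after the #:

```c
/* a comment before the # */ # /* and one after it */ define ANSWER 42

/* Returns the value of the macro defined by the directive above. */
int answer(void)
{
    return ANSWER;
}
```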
>Page 75, line 36. An arithmetic error in an #if expression (such as divide
>by zero) shall result in a diagnostic error message. However, a sequence
>such as:
>
> #if (foo == 0) ? 0 : (10 / foo)
>
>should not result in a diagnostic error message for any value of foo.
[The page/line reference seems wrong.] I think the error handling
is already implied by the syntax, but perhaps explicit wording
would help. (Note that the example is correct code and should
not cause a diagnostic in any case.)
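A sketch of the example in compilable form, assuming foo is a macro that expands to 0 (written FOO here; the other names are hypothetical). The division sits in the operand of ?: that is not evaluated, so the controlling expression is simply 0. Whether a given preprocessor really stays silent is a property of that implementation, not something this sketch can guarantee:

```c
#define FOO 0

/* With FOO equal to 0 the condition selects the second operand (0), and
 * the third operand, 10 / FOO, is never evaluated. */
#if (FOO == 0) ? 0 : (10 / FOO)
#define FOO_RATIO_IS_NONZERO 1
#else
#define FOO_RATIO_IS_NONZERO 0
#endif

/* Reports which branch of the #if was taken. */
int foo_ratio_is_nonzero(void)
{
    return FOO_RATIO_IS_NONZERO;
}
```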
>Page 82, line 24ff. I would recommend the following clarifications to
>the definition of the predefined macro names:
>
> __LINE__ The line number shall be as defined in section 3.8.4,
> page 81, line 30.
That's already my understanding of the draft.
> __FILE__ There is no presumption that this string can be used to
> open a file during execution of the program.
That's the way it is now. The sources clearly need not even exist
in the run-time environment!
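One way to see both points is with #line, sketched below. The file name "renamed.c" and the function names are invented for the example. After the directive, __LINE__ counts from the number given, and __FILE__ is just the string supplied, which need not name anything that can be opened:

```c
#line 100 "renamed.c"
int line_here(void) { return __LINE__; }          /* presumed line 100 */
const char *file_here(void) { return __FILE__; }  /* the presumed file name */
```

Nothing here tries to open the string returned by file_here(); it is only a label for diagnostics.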
> __DATE__ Neither this value nor the value of __TIME__ change during
> compilation.
That might be nice, but how important is such a constraint on
implementations? I bet there even are people who would prefer
the __TIME__ clock to continue to tick during compilation.
>A predefined name should be redefinable (by #undef). (The identifier
>"defined" may not be redefined.)
No, since these names begin with underscore, the user cannot safely
redefine them anyway; they're not in his "allowable name space".
>Page 83, line 15ff. Function prototypes with separate parameter identifier
>and declaration lists offer a better environment for documentation than
>the more concise function prototype format. I would strongly recommend
>that they not be marked obsolescent.
The intent is to eliminate any requirement that old-style function
parameter declarations be supported in a future revision of the
standard. The only way (it appears) that we can do that is by
calling them "obsolescent" in a previous draft.
>Page 85, line 35. The ability to redefine any function declared in
>a header as a macro may break existing programs that write, e.g.,
>
> #include <stdlib.h>
> extern long rand();
>
>If rand() is declared as a macro,
First of all, I doubt that existing programs #include <stdlib.h>.
When adding such an #include to existing source, you should also
remove any explicit redundant declarations (except when they are
really necessary, in which case use #undef or one of the other
usual tricks to force use of a genuine function).
I'll be among the first to admit that this approach has its
problems, but I don't know of anything better. If you can
suggest a better way to handle this, please write it up and
mail it in to ANSI.
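For reference, a sketch of two of the "usual tricks", assuming rand() might also be provided as a function-like macro by <stdlib.h>. The wrapper names are hypothetical. Parenthesizing the name keeps a function-like macro from expanding, and #undef removes any macro definition outright:

```c
#include <stdlib.h>

/* (rand)() cannot be expanded as a function-like macro, because the
 * name is not immediately followed by a left parenthesis. */
int call_real_rand(void)
{
    return (rand)();
}

/* After #undef, any macro version is gone and only the function remains. */
#undef rand

int call_rand_after_undef(void)
{
    return rand();
}
```

Neither trick rescues a conflicting declaration such as extern long rand(); that still has to be removed from the source.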
>Page 89, line 13 (footnote 64): The Standard should note that, in an
>implementation that uses the Latin 1 character set, the printing
>characters are those whose values lie from 0x20 through 0x7E or from
>0xA0 through 0xFF. Control characters are those whose values lie from
>0x00 through 0x1F, 0x7F, or from 0x80 through 0x9F. The ranges for the other
><ctype.h> macros should be similarly extended.
>
>Page 91, line 46ff. Note that, in a Latin 1 environment, the ispunct() and
>isspace() functions should test for the non-breaking space at 0xA0.
No particular character set is required, so we can't make such
remarks in the standard itself. Perhaps the Rationale should
give such examples.
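Since no character set is required, portable code is better off asking <ctype.h> than hard-coding ranges like those quoted above. A small sketch follows; count_printable is a hypothetical name, and the result depends on the current locale:

```c
#include <ctype.h>
#include <stddef.h>

/* Counts the printing characters in s using isprint() instead of numeric
 * ranges.  The cast to unsigned char keeps values above 0x7F from being
 * passed as negative ints on implementations where char is signed. */
size_t count_printable(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        if (isprint((unsigned char)*s))
            n++;
    }
    return n;
}
```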
>Page 102, line 46ff. If longjmp() is called from a signal handler, volatile
>objects may have indeterminate values as they cannot always be updated by
>atomic (one machine cycle) operations. It is unrealistic to require an
>implementation to lock interrupts before modifying a volatile object. The
>Standard should note that volatile objects are indeterminate when longjmp()
>is called from an interrupt or signal handler.
I don't know that anything needs to be said about this. The only
object type for which atomic access is guaranteed is sig_atomic_t.
[longjmp() vs. signal handlers was discussed in a previous note]
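A minimal sketch of the sig_atomic_t case, assuming SIGINT is available and that raise() runs the handler before returning. The names got_signal, on_signal and raise_and_check are made up. The handler does nothing except store to a volatile sig_atomic_t object:

```c
#include <signal.h>

/* The only kind of object the handler touches. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Installs the handler, raises SIGINT, restores the default action, and
 * returns the flag value (or -1 if the handler could not be installed). */
int raise_and_check(void)
{
    got_signal = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    raise(SIGINT);
    signal(SIGINT, SIG_DFL);
    return got_signal;
}
```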
>Page 128, line 7. Is one character of pushback guaranteed even before
>anything has been read from the stream or after end of file or error?
>The standard should be clarified on this point. (I don't care either way,
>but would prefer permitting one character pushback at any time.)
Yes, since this is not specifically excepted it is required.
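A sketch that exercises both cases asked about, assuming tmpfile() can create a stream on the system at hand. The function name is hypothetical. It pushes a character back before anything has been read, and again after end of file:

```c
#include <stdio.h>

/* Returns 1 if one character of pushback works both before any read and
 * after end of file on a fresh temporary stream, 0 if it does not, and
 * -1 if the temporary file could not be created. */
int pushback_works(void)
{
    FILE *fp = tmpfile();
    int ok;

    if (fp == NULL)
        return -1;

    /* Push back before anything has been read, then read it. */
    ok = ungetc('x', fp) == 'x' && getc(fp) == 'x';

    /* The file itself is empty, so the next read reaches end of file. */
    ok = ok && getc(fp) == EOF && feof(fp) != 0;

    /* Push back after end of file, then read it. */
    ok = ok && ungetc('y', fp) == 'y' && getc(fp) == 'y';

    fclose(fp);
    return ok;
}
```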
>Page 140, line 21ff. Predefined values for "successful termination"
>and "unsuccessful termination" (arguments to exit()) should be provided.
Done at last week's meeting, via a compromise solution that
requires that 0 also always be taken to mean success.
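A sketch of how such values might be used. The macro names EXIT_SUCCESS and EXIT_FAILURE are the ones found in <stdlib.h> in the standard as later published; that they are exactly what the meeting adopted is an assumption here, and status_for is a hypothetical helper:

```c
#include <stdlib.h>

/* Maps an error count to a termination status suitable for exit() or
 * for returning from main(). */
int status_for(int error_count)
{
    return error_count == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Under the compromise described above, passing a plain 0 to exit() also means success.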
>Page 142, line 16ff. An unsigned division function analogous to
>div() would be useful.
This keeps getting proposed and defeated. Basically, the only
reason div() etc. are defined is because we didn't want to
insist that / and % work "correctly"; that's not an issue for
unsigned integers. (It's also nice that both the quotient and
remainder are returned simultaneously; this can be exploited by
some implementations to improve efficiency in the frequent
situation where both values are needed.)
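A small sketch of the div() case, with a hypothetical helper name. It gets the quotient and remainder from one call, and the quotient truncates toward zero for negative operands regardless of what / and % do on a given implementation:

```c
#include <stdlib.h>

/* Splits a (possibly negative) number of seconds into minutes and
 * seconds with a single div() call. */
void split_seconds(int total, int *minutes, int *seconds)
{
    div_t r = div(total, 60);

    *minutes = r.quot;
    *seconds = r.rem;
}
```

For unsigned operands the plain / and % operators are already well defined, which is the reason given above for not adding an unsigned counterpart.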
A lot of proposals for new features have been rejected in an
attempt to keep the size of the language and its environment
relatively small. (This attempt hasn't been totally successful,
but it's certainly a worthwhile goal.) Therefore, please don't
interpret failure to adopt a suggestion as necessarily implying
that there is something wrong with the idea, although often
there is (in which case the response should point out what).
Reminder: Current public review period ends 07-Mar-1986.
There WILL be another, 2-month, public review, since X3J11
has decided to make substantive changes to the current draft
[as reported in another note].