Comments on UNIX command option syntax
Gary Perlman
perlman at wanginst.UUCP
Thu Oct 31 04:03:01 AEST 1985
I tried to post this to mod.std.unix, but it got bounced by the mailer. Sigh.
------------------
Proposed Syntax Standard
For UNIX* System Commands
RULE 1: Command names must be between 2 and 9 characters.
RULE 2: Command names must include lower case letters and
digits only.
RULE 3: Option names must be a single character in length.
RULE 4: All options must be delimited by ``-''.
RULE 5: Options with no arguments may be grouped behind
one delimiter.
RULE 6: The first option-argument following an option
must be preceded by white space.
RULE 7: Option arguments cannot be optional.
RULE 8: Groups of option-arguments following an option must be
separated by commas or separated by white space and quoted.
RULE 9: All options precede operands on the command line.
RULE 10: ``--'' may be used to delimit the end of the options.
RULE 11: The order of options relative to one another
should not matter.
RULE 12: The order of operands may matter and position-related
interpretations should be determined on a
command-specific basis.
RULE 13: ``-'' preceded and followed by white space should be used
only to mean the standard input.
November 1983
*UNIX is a trademark of AT&T Bell Laboratories
--------------------------------------------------------------------
The above is a direct quote of the quick reference card
handed out in conjunction with a talk at the 1984 Winter
Usenix conference by K. Hemenway & H. Armitage. This set of
rules is sometimes called the H&A standard. Any proposal of
a standard is going to cause controversy, and this is no
exception. Although I at first was opposed to the standard,
I came to appreciate the thought that went into it. In this
commentary, I hope to convey that to you.
General comments: The H&A standard tries to maintain as much
compatibility with existing programs as possible while
improving the consistency of UNIX command line syntax. This
is much harder than designing a command line syntax from scratch.
It is important to understand the rationale behind the whole
set of conventions before making judgements about them
individually.
H&A recorded the syntax for all the commands in UNIX (at
least System V UNIX). They tried to come up with a standard
that was as close to most of the existing commands as
possible. Their analysis, summarized in their USENIX paper,
but much better covered in an unavailable internal Bell Labs
tech report, is an excellent example of backing statements
with facts. The most common example of an objection to the
standard is of the form, "I don't like RULE X. What about
the zz command?" to which H&A could say, "That exception
happens in only N (few) commands." Here are my comments
about the rules. They contain my reaction to the rules and
some of H&A's reasons for the rules.
I want to start by saying that this standard is much better
than no standard. If I know that a command follows the
standard, then there are no surprises about how options are
requested, and that makes life easier for me. I don't have
to worry about inconsistency, and that benefit outweighs the
quirks of the standard.
RULE 1:
I see no reason for not allowing single character
command names like e, f, w, and S, but there are
not many of these. There is not much mnemonic
value to single character commands, nor to 2
character commands, but there are a lot of those.
RULE 2:
The restriction to lower case letters only is for
case insensitive systems. One notable exception
is a.out, but that is not really a command name.
Not allowing special characters like underscore
simplifies the rules.
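
As a rough illustration of RULES 1 and 2 taken together, here is a minimal sketch in Python. The function name and the regular expression are my own assumptions drawn from the wording of the two rules; nothing like this appears on the H&A card.

```python
import re

# Hypothetical checker for RULES 1 and 2. The pattern is an assumption
# based on the wording of the rules: 2 to 9 characters, lower case
# letters and digits only.
NAME_RE = re.compile(r"[a-z0-9]{2,9}")

def follows_name_rules(name: str) -> bool:
    """Return True if name satisfies RULES 1 and 2 as read above."""
    return NAME_RE.fullmatch(name) is not None

for name in ["ls", "uucp", "w", "S", "a.out"]:
    print(name, follows_name_rules(name))
```

Under this reading, ls and uucp pass, while w, S, and a.out do not.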
RULE 3:
Single character option names are not very
mnemonic, but they are necessary to be able to
bundle options. They are also used in most of the
commands. Their lack of mnemonicity is
compensated somewhat when on-line help is readily
available, which unfortunately is not common.
RULE 4:
The convention of preceding options with - started
as a way to distinguish options from file names. Some
commands that do not take operands like files or
expressions do not require the - sign. My
experience is that this is an extra rule to
explain to new users that is not worth saving a
keystroke here and there.
RULE 5:
Bundling of options was a rule demanded by UNIX
fans inside Bell Labs. Once you accept this rule,
you can't have multiple character options, and
this is unfortunate. Still, I would not like to
have to type: ls -l -t -r.
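
To make the bundling rule concrete, here is a small sketch using Python's standard getopt module, which follows the same single character convention. The option letters are borrowed from the ls example; the choice of Python is mine, not H&A's.

```python
import getopt

# "ltr" declares three single character options that take no arguments.
bundled, _ = getopt.getopt(["-ltr"], "ltr")
separate, _ = getopt.getopt(["-l", "-t", "-r"], "ltr")

# Both spellings parse to the same list of (option, argument) pairs.
print(bundled == separate)
```

Because each option name is exactly one character, the parser can split -ltr without any ambiguity, which is the link between RULE 3 and RULE 5.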
RULE 6:
Many programs require that an option argument
immediately follow the option (e.g., cc -lm, nroff
-man) while some require a space (e.g., cc -o
pgm). This is the inconsistency that causes the
most problems for me, especially when there are
inconsistencies inside a command (cf. cc, which
passes the tightly grouped option-arguments to
other programs). Rather than deciding on
no-space, a space is required in the H&A standard.
This is to make sure that filename expansion works
properly. For example, if the argument to an
option is a file like "extralongname", then the
option -fextra* would not work, while having a
space in there would. You could make the syntax
"space-optional" but that would require that the
documentation cover more than one case, which I
argue would make the syntax harder to learn.
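
The filename expansion argument can be sketched as follows. The expand function is a hypothetical, simplified stand-in for what the shell does (it ignores details such as leading dots and multiple directories); it only shows why the pattern fails when the option letter is glued to it.

```python
import fnmatch

def expand(word, names):
    """Roughly mimic sh: replace word by the names it matches,
    or leave it unchanged when nothing matches."""
    matches = sorted(n for n in names if fnmatch.fnmatchcase(n, word))
    return matches or [word]

files = ["extralongname", "other"]

# Glued form: the pattern includes "-f", so no file name matches it.
print(expand("-fextra*", files))

# Separated form: "-f" is its own word and the pattern expands.
print(["-f"] + expand("extra*", files))
```

In the glued case the program receives the literal string -fextra*, while in the separated case it receives -f followed by extralongname.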
RULE 7:
Because option arguments must be separated from
options, there is no way to make an option
argument optional, except in the special case
where the option comes at the end of a command
line with no operands (but I think this rare
exception would be hard to explain). There are
few commands that allow optional option-arguments
(e.g., pr -h), and supplying a null argument
(i.e., "") works as well.
RULE 8:
This rule does not allow for syntax like:
pgm -i file1 file2 file3 -o file4 file5
but this is not very common. Placing quotes
around the files is not too bad.
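
Under RULE 8 the example above would be written with commas (or with quoted, space-separated lists). A minimal sketch of how a program might take such groups apart, assuming Python's getopt module and a hypothetical split_group helper:

```python
import getopt
import re

def split_group(value):
    """Split one option-argument on commas or white space (RULE 8)."""
    return [part for part in re.split(r"[,\s]+", value) if part]

# pgm -i file1,file2,file3 -o "file4 file5"
argv = ["-i", "file1,file2,file3", "-o", "file4 file5"]
opts, operands = getopt.getopt(argv, "i:o:")
groups = {flag: split_group(value) for flag, value in opts}
print(groups)
```

Each option still has exactly one argument as far as the parser is concerned; the grouping is handled afterwards by the program.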
RULE 9:
When options must precede operands (e.g., files)
several practices are not supported. One is
choosing a set of options for one file and then
some options for another. Instead of this, two or
more command lines are needed, but this is not a
serious penalty for most commands, and not a
common need. The second unsupported practice is
that of thinking of options after typing most of
the command line; if options must precede
operands, then they must be inserted. While this
can be awkward for some primitive shells, it is
best handled with command line editing, such as
that in ksh.
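
For what it is worth, Python's getopt.getopt behaves the way RULE 9 asks: it stops scanning at the first operand. The sketch below uses made-up option letters and file names purely for illustration.

```python
import getopt

# The -t comes after an operand, so it is not treated as an option.
opts, operands = getopt.getopt(["-l", "file1", "-t", "file2"], "lt")
print(opts)
print(operands)
```

Only -l is recognized as an option here; -t is handed back as an operand along with the two file names.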
RULE 10:
You really need -- to delimit the end of options
so that files or expressions that begin with - can
be processed. The string -- was chosen because getopt
already used it. This is not a strong motivation,
because at the time of the standard, only about 40
commands used getopt. Still, it seems as good a
delimiter as any.
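
A short sketch of the delimiter in action, again assuming Python's getopt module; the file name -odd is a made-up example of an operand that begins with -.

```python
import getopt

# Without "--", the operand "-odd" would be parsed as bundled options.
opts, operands = getopt.getopt(["-l", "--", "-odd"], "l")
print(opts)
print(operands)
```

The -- itself is consumed by the parser, and everything after it is returned untouched as operands.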
RULE 11:
I do not know why the order of options should not
matter. It does matter in commands like cc (i.e.,
ld), which require a special ordering of libraries.
RULE 12:
This rule says that the programmer can assign any
meaning to what follows the options. Makes sense
to me.
RULE 13:
There is some tradition and a definite need to be
able to insert the standard input into a list of
files. The - has been used in a few commands, and
there were no likely contenders.
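
One way a program might honor RULE 13 is sketched below. The open_operand helper is hypothetical; the getopt call only shows that a lone - is left among the operands rather than being treated as an option.

```python
import getopt
import sys

def open_operand(name):
    """Return the standard input for "-", otherwise open the named file."""
    return sys.stdin if name == "-" else open(name)

# A lone "-" ends option scanning and stays in the operand list.
opts, operands = getopt.getopt(["-l", "-", "notes"], "l")
print(operands)
```

A command written this way can read the standard input at any position in its list of files, for example between two named files.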
My impression is that the H&A standard is one we can live
with. It is not the sort of syntax that someone might
design from scratch, but there is a need for compatibility
with old syntax, not just for user comfort, but also to
avoid breaking thousands of shell scripts and system(3)
calls to UNIX command lines. Yes, there is some more typing
required, but I think it is not a high price to pay for a
set of conventions you can fit on a small card. To get all
the time-savers we like, the syntax gets much more
complicated, which I think is one reason for the bad
reputation UNIX has earned.
What about existing commands? Last I heard, the plan was to
first work on the easy cases, the commands that were very
close to the standard. Some commands would not be changed
but be replaced by new programs that would phase out the
old. Some examples of commands with extremely difficult
syntax are "pr" and "sort". The "test" command is finessed
by saying that it does not use options, but a special
expression language. The "find" command could be dealt with
similarly. The "dd" command, with name-value format
options, originally designed as a parody of the IBM DD,
would not change.
--
Gary Perlman Wang Institute Tyngsboro, MA 01879 (617) 649-9731
UUCP: decvax!wanginst!perlman CSNET: perlman at wanginst