The Answer to All Man's Problems (part 4 of 6)
Tom Christiansen
tchrist at convex.COM
Tue Jan 8 09:22:02 AEST 1991
Xsection would be recognized, a
X.B manq
Xdirectory would not be, and while a
X.B man3f
Xdirectory would be recognized, a
X.B man3x11
Xdirectory would not be.
X.PP
XLikewise, the possible subsections
Xfor a man page were also embedded in the source code, so
Xa man page named something like
X.I /usr/man/man3/XmLabel.3x11
Xwould not be found because
X.B 3x11
Xwas not in the hard-coded list of viable subsections.
XSome systems install all man pages stripped of subsection
Xcomponents in the file name. This situation is less than optimal
Xbecause it proves useful to be able
Xto supply both a
X.M getc 3f
Xand a
X.M getc 3s .
XDistinguishing between subsections is
Xparticularly convenient with the ``intro'' man pages;
Xa vendor could supply
X.M intro 3 ,
X.M intro 3a ,
X.M intro 3c ,
X.M intro 3f ,
X.M intro 3m ,
X.M intro 3n ,
X.M intro 3r ,
X.M intro 3s ,
Xand
X.M intro 3x
Xas introductory man pages for the various libraries.
XHowever, the task of running
X.M access 2
Xon all possible subsections is slow and tedious, requiring
Xrecompilation whenever a new subsection is invented.
X.NH
XReferences in the Filesystem
X.PP
XThe existing man system had no elegant way to handle
Xman pages containing more than one entry. For example,
X.M string 3
Xcontains references to
X.M strcat 3 ,
X.M strcpy 3 ,
Xamongst others. Because the \fIman\fP program looks for
Xentries only in the file system, these extra references must be
Xrepresented as files that reference the base man page. The most
Xcommon practice is to have a file consisting of
Xa single line
Xtelling
X.I troff
Xto source the other man page.
XThis file would read something like:
X.sp
X.ti 5
X.CW
X\&.so man3/string.3
X.CE
X.sp
XOccasionally,
Xextra references are created with a link in the file
Xsystem (either a hard link or a symbolic one). Except when
Xusing
Xhard links, this method wastes
Xdisk blocks and inodes. In any case,
Xthe directory gains more entries, slowing
Xdown accesses to files in those directories. Logic
Xmust be built into the \fIman\fP program to
Xdetect these extra references.
XIf not, when man pages are reformatted into their
Xcat directories, separate formatted man pages are stored
Xon disk, wasting substantial amounts of disk space
Xon duplicate information.
XOn systems with numerous man pages, the directories can grow
Xso large that all man
Xpages for a given section cannot be listed on the command line
Xat one time because of kernel restrictions on the total length of the
Xarguments to
X.M exec 2 .
XBecause of the need to store reference information
Xin the file system, the problem is only made worse.
XThis often happens in
Xsection 3 after the man pages for the X
Xlibrary have been installed, but
Xcan occur in other sections as well.
X.PP
XThe
X.M makewhatis 8
Xprogram is a Bourne shell script that generates the
X.I /usr/lib/whatis
Xindex, and is used by
X.M apropos 1
Xand
X.M whatis 1
Xto provide one-line summaries of man pages. These
Xprograms are part of the
X.I man
Xsystem
Xand are often links to each other and sometimes to
X.I man
Xitself.
XIf any of
Xthe man subdirectories contain more files than the shell
Xcan successfully expand on the command line, the
X.I makewhatis
Xscript fails
Xand no index is generated. When this occurs,
X.I whatis
Xand
X.I apropos
Xstop working. The
X.M catman 8
Xprogram, used to pre-format raw man pages, suffers
Xfrom the same problem.
X.PP
XOf course,
X.I makewhatis
Xwasn't working all that well, anyway.
XIt was a wrapper around many calls to little programs
Xthat each did a small piece of the work, making it
Xrun slowly.
XIt, too, had a hard-coded pathname for where man pages resided
Xon disk and which sections were permitted.
X.I Makewhatis
Xdidn't always extract the proper information
Xfrom the man page's \s-1NAME\s0
Xsection. When it did, this information was sometimes
Xgarbled due to embedded
X.I troff
Xformatting information.
XBut even garbled information was better
Xthan none at all.
XEven so, these programs left some things to be desired.
X.I Apropos
Xdidn't understand regular expression searches, and both
Xit and
X.I whatis
Xpreferred to do their own lookups using basic, unoptimized C functions
Xlike
X.M index 3
Xrather than using a general-purpose optimized string search program
Xlike
X.M egrep 1 .
X.NH
XThe Solution
X.NH 2
XA Real Database
X.PP
XThe problem in all these cases appeared to be that the filesystem
Xwas being used as a database, and that this paradigm did not hold
Xup well to expansion. Therefore the solution was to move
Xthis information into a database for more rapid access.
XUsing this database,
X.I man
Xand
X.I whatis
Xneed no longer call
X.M access 2
Xto test all possible locations for the desired man page.
XTo solve the other problems,
X.M makewhatis 8
Xwould be recoded so it didn't rely on the shell
Xfor looking at directories.
X.NH 2
XCoding in Perl
X.PP
XWhen the project was first contemplated, the
Xperl programming language by Larry Wall was rapidly
Xgaining popularity as an alternative to C for tasks that
Xwere either too slow when written as shell scripts, or
Xsimply exceeded the shells' somewhat limited capabilities.
XSince perl was
Xoptimized for parsing text, had convenient
X.M dbm 3x
Xsupport built into it, and the task really didn't seem complex
Xenough to merit a full-blown treatment in C or C++,
Xperl was selected as the language of choice.
XHaving all code written in perl would also help support
Xheterogeneous environments because the resulting scripts could
Xbe copied and run on any hardware or software platform supporting
Xperl. No recompilation would be required.
X.PP
XSome concern existed about choosing
Xan interpreted language when one of the issues to address was
Xthat of speed. It was decided to do the prototype in perl
Xand, if necessary, translate this into C should performance
Xprove unacceptable.
X.PP
XThe first task was to recode
X.M makewhatis 8
Xto generate the new
X.I whatis
Xdatabase using \fIdbm\fP. The
X.M directory 3
Xroutines were used rather than shell globbing to circumvent
Xthe problem of large directories breaking shell wildcard
Xexpansions. Perl proved to be an appropriate choice for this
Xtype of text processing (see Figure 1).
X.BF "\fImakewhatis\fP excerpt #1"
Xs/\e\ef([PBIR]|\e(..)//g; # kill font changes
Xs/\e\es[+-]?\ed+//g; # kill point changes
Xs/\e\e&//g; # and \e&
Xs/\e\e\e((ru|ul)/_/g; # xlate to '_'
Xs/\e\e\e((mi|hy|em)/-/g; # xlate to '-'
Xs/\e\e\e*\e(..//g && # no troff strings
X print STDERR "trimmed troff string macro in NAME section of $FILE\en";
Xs/\e\e//g; # kill all remaining backslashes
Xs/^\e.\e\e"\es*//; # kill comments
Xif (!/\es+-+\es+/) {
X # ^ otherwise L-devices would be L
X print STDERR "$FILE: no separated dash in $_\en";
X $needcmdlist = 1; # forgive their braindamage
X s/.*-//;
X $desc = $_;
X} else {
X ($cmdlist, $desc) = ( $`, $' );
X $cmdlist =~ s/^\es+//;
X}
X.EF
X.NH 2
XDatabase Format
X.PP
XThe database entries themselves are conveniently
Xaccessed as arrays from perl. To save space and
Xaccommodate man pages with multiple references, two
Xkinds of database entries exist: direct and indirect.
XIndirect entries are simply references to direct entries.
XFor example, indirect entries for
X.M getc 3s ,
X.M getchar 3s ,
X.M fgetc 3s ,
Xand
X.M getw 3s
Xall point to the real entry, which is
X.M getc 3s .
XIndirect entries are created for multiple entries in
Xthe \s-1NAME\s0 section, for symbolic and hard links, and
Xfor
X.B \&.so
Xreferences. Using the \s-1NAME\s0 section is the preferred
Xmethod; the others are supported for backwards compatibility.
X.PP
X.ne 4
XAssuming that the \s-1WHATIS\s0 array has been bound to the
Xappropriate
X.I dbm
Xfile, storing indirect entries is trivial:
X.sp
X.CW
X.ti 1i
X$WHATIS{'fgetc'} = 'getc.3s';
X.sp
X.CE
XWhen a program encounters an indirect entry, such as
Xfor \fIfgetc\fP, it must make another lookup based on
Xthe return value of the first lookup (stripped of its
Xtrailing extension) until it finds a direct entry. The
Xtrailing extension is kept so that an indirect reference
Xto
X.M gtty 3c
Xdoesn't accidentally pull out
X.M stty 1
Xwhen it really wanted
X.M stty 3c .
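X.PP
XThe lookup just described can be sketched as follows. This is a minimal
Xillustration in Python rather than the perl used by the actual programs.
XThe dictionary \fIwhatis\fP stands in for the bound
X.I dbm
Xarray; the function name \fIresolve\fP, and the assumption that the
Xsecond field of a direct entry looks like \fIman3/stty.3c\fP, are
Xhypothetical and not taken from the real scripts.

```python
SEP_FIELD = "\001"   # control-A separates the four fields of a direct entry
SEP_ENTRY = "\002"   # control-B separates several entries stored under one key


def resolve(whatis, topic):
    """Follow indirect entries for `topic` until direct entries are reached.

    Returns a list of direct entries, each a list of four fields.
    """
    results = []
    seen = set()                      # guards against reference loops
    pending = [(topic, None)]         # (key, wanted "name.ext" or None)
    while pending:
        key, wanted = pending.pop()
        if (key, wanted) in seen:
            continue
        seen.add((key, wanted))
        for entry in whatis.get(key, "").split(SEP_ENTRY):
            if not entry:
                continue
            fields = entry.split(SEP_FIELD)
            if len(fields) == 4:                       # direct entry
                # Assumption: field 2 is a path such as "man3/stty.3c".
                basename = fields[1].rsplit("/", 1)[-1]
                if (wanted is None or basename == wanted) and fields not in results:
                    results.append(fields)
            else:                                      # indirect: "name.ext"
                base = entry.rsplit(".", 1)[0]         # strip trailing extension
                pending.append((base, entry))          # keep it to pick the right page
    return results


# Hypothetical sample data mirroring the gtty/stty case in the text.
sample = {
    "gtty": "stty.3c",
    "stty": SEP_ENTRY.join([
        SEP_FIELD.join(["stty", "man1/stty.1", "1", "set terminal options"]),
        SEP_FIELD.join(["stty, gtty", "man3/stty.3c", "3", "set and get terminal state"]),
    ]),
}
```

XCalling \fIresolve(sample, "gtty")\fP follows the indirect entry to the
X\fIstty\fP key and keeps only the entry whose file name matches the
Xretained extension.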
X.PP
XThe format of a direct entry is more complicated, because
Xit needs to encode the description to be used by
X.M whatis 1
Xas well as the section and subsection information.
XIt can be distinguished from an indirect entry because
Xit contains four fields delimited by control-A's (\s-1ASCII 001\s0),
Xwhich are themselves prohibited from being in any
Xof the fields. The fields are as follows:
X.br
X.in +5n
X.IP 1
XList of references that point to this man page; this
Xis usually everything to the left of the hyphen
Xin the \s-1NAME\s0 section.
X.IP 2
XRelative pathname of the file the man page is kept in;
Xthis is stored for the indirect entries.
X.IP 3
XTrailing component of the directory in which the
Xman page can be found, such as
X.B 3
Xfor \fBman3\fP.
X.IP 4
XDescription of the man page for use by
Xthe
X.I whatis
Xand
X.I apropos
Xprograms; basically everything to the right of the hyphen in the
X\s-1NAME\s0 section.
X.in -5n
X.PP
XAt first glance, the third field would
Xseem redundant. It would appear that you could
Xderive it from the character after the dot in the second field.
XHowever, to support arbitrary subdirectories like
X.B man3f
Xor
X\fBman3x11\fP, you must also know the name of the
Xdirectory so you don't look in
X.B man3
Xinstead. Additionally, a long-standing tradition exists
Xof using the
X.B mano
Xsection
Xto store old man pages from arbitrary sections.
XFurthermore, man pages are sometimes installed in the
Xwrong section. To support these scenarios, restrictions
Xregarding the format of filenames used for man pages were
Xrelaxed in \fIman\fR,
X\fImakewhatis\fR, and \fIcatman\fR,
Xbut warnings would be issued by
X.I makewhatis
Xfor man pages installed in directories that don't have
Xthe same suffix as the man pages.
X.NH 2
XMultiple References to the Same Topic
X.PP
XA problem arises from the fact that the same topic
Xmay exist in more than one section of the manual.
XWhen a lookup is performed on a topic,
Xyou want to retrieve all possible man page locations
Xfor that topic. The
X.I whatis
Xprogram wants to display them all to the user, while
Xthe
X.I man
Xprogram will either show all the man pages
X(if the
X.B \-a
Xflag is given) or
Xsort what it has retrieved according to a particular section and
Xsubsection precedence, by default showing entries from section
X1 before those from section 2, and so forth. Therefore,
Xeach lookup may actually return a list of direct and
Xindirect lookups. This list is delimited by control-B's
X(\s-1ASCII 002\s0), which are stripped from the data fields, should
Xthey somehow contain any. The code for storing a direct entry
Xin the
X.I whatis
Xdatabase is featured in Figure 2.
X.BF "\fImakewhatis\fP excerpt #2"
Xsub store_direct {
X local($cmd, $list, $page, $section, $desc) = @_; # args
X local($datum);
X
X $datum = join("\e001", $list, $page, $section, $desc);
X
X if (defined $WHATIS{$cmd}) {
X if (length($WHATIS{$cmd}) + length($datum) + 1 > $MAXDATUM) {
X print STDERR "can't store $page -- would break DBM\en";
X return;
X }
X $WHATIS{$cmd} .= "\e002"; # append separator
X }
X $WHATIS{$cmd} .= $datum; # append entry
X}
X.EF
X.KE
X.PP
XNotice the check of the new datum's
Xlength against the value of \s-1MAXDATUM.\s0 This is because of the
Xinherent limitations in the implementation of the
X.M dbm 3x
Xroutines. This is 1k for
X.I dbm
Xand 4k for
X.I ndbm .
XThis restriction will be relaxed
Xif a \fIdbm\fR-compatible set of routines is written without
Xthese size limitations. The \s-1GNU\s0
X.I gdbm
Xroutines hold promise, but they were released after the
Xwriting of these programs and haven't been investigated yet.
XIn practice, these limits are seldom if ever reached, especially
Xwhen
X.I ndbm
Xis used.
X.NH
XOther Problems, Other Solutions
X.PP
XThe rewrite of
X.I makewhatis ,
X.I catman ,
Xand
X.I man
Xto understand multiple man trees and to use a database
Xfor topic-to-pathname mapping
Xdid much to alleviate the most important problems
Xin the existing man system, but several minor problems
Xremained. Since this was a complete rewrite of the entire
Xsystem, it seemed an appropriate time to address these as well.
X.NH 2
XIndexing Long Pages
X.PP
XSeveral of the most frequently consulted man pages on the system
Xhave grown beyond the scope of a quick reference guide,
Xinstead filling the function of a detailed user manual.
XMan pages of this sort include those for shells, window
Xmanagers,
Xgeneral purpose
Xutilities such as awk and perl,
Xand the \s-1X11\s0 man pages.
XAlthough these man pages
Xare internally organized into sections and subsections that
Xare easily visible on a hard-copy printout, the on-line
Xman system could not recognize these internal
Xsections. Instead, the user was forced to search through pages
Xof output looking for the section of the man page containing
Xthe desired information.
X.PP
XTo alleviate this time-consuming tedium, the man program
Xwas taught to parse the
X.I nroff
Xsource for man pages in order to build up an index of these sections
Xand present them to the user on demand.
XSee Figure 3 for an excerpt from the
X.M ksh 1
Xindex page, displayable via the new
X.B \-i
Xswitch.
X.BF "\fIksh\fP index excerpt"
XIdx Subsections in ksh.1 Lines
X 1 NAME 3
X 2 SYNOPSIS 22
X 3 DESCRIPTION 15
X 4 Definitions. 43
X 5 Commands. 338
X 6 Comments. 6
X 7 Aliasing. 107
X 8 Tilde Substitution. 47
X 9 Command Substitution. 28
X10 Process Substitution. 49
X11 Parameter Substitution. 645
X12 Blank Interpretation. 15
X13 File Name Generation. 87
X.EF
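X.PP
XOne way such an index could be built is sketched below in Python (not the
Xperl of the real program). It assumes that subsections are introduced by
X.B .SH
Xand
X.B .SS
Xrequests and, for simplicity, counts source lines; the figures shown by
Xthe real
X.B \-i
Xswitch may well count formatted lines instead. The name
X\fIbuild_index\fP is hypothetical.

```python
def build_index(source_lines):
    """Return [title, line_count] pairs for each .SH/.SS subsection."""
    index = []
    for line in source_lines:
        if line.startswith((".SH", ".SS")):
            # The title is the rest of the request, minus optional quotes.
            title = line[3:].strip().strip('"')
            index.append([title, 0])
        elif index:
            # Attribute every other line to the most recent subsection.
            index[-1][1] += 1
    return index
```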
X.PP
XThe
X.I /usr/man/idx*/
Xdirectories
Xserve the
Xsame function for saved indices
Xas
X.I /usr/man/cat*/
Xdirectories do for saved formatted man pages.
XThese are regenerated as needed according to
Xthe same criteria used to regenerate the cat pages.
XThey can be used to index into a given man page or
Xto list a man page's subsections.
XTo begin at a given subsection, the user appends
Xthe desired subsection to the name of the man page
Xon the command line,
Xusing a forward slash as a delimiter. Alternatively,
Xthe user can just supply a trailing slash on the man page
Xname, in which case they are presented with the index listing
Xlike the one the
X.B \-i
Xswitch provides, then prompted for the section
Xin which they are interested. A double slash indicates
Xan arbitrary regular expression, not a section name.
XThis is merely a short-hand notation for first running
Xman and then typing
X.CW
X/expr
X.CE
Xfrom within the user's pager.
XSee Figure 4
Xfor example usages of the indexing features.
X.BF "Index Examples"
Xman -i ksh # show sections
Xman ksh/ # show sections, prompt for which one
X
Xman ksh/tilde
Xman ksh/8 # equivalent to preceding line
X
Xman ksh/file
Xman ksh/generat # equivalent to preceding line
Xman ksh/13 # so is this
X
Xman ksh//hangup # start at this string
X.EF
X.PP
XThis indexing scheme is implemented by searching for the requested
Xsubsection in the index, which is read from
X.I /usr/man/idx1/ksh.1
Xif that file exists and generated dynamically otherwise.
XA numeric subsection is
Xeasily handled. For strings, a case-insensitive
Xpattern match is first
Xmade anchored to the front of the string, then \(em failing
Xthat \(em anywhere in the section description. This way
Xthe user doesn't need to type the full section title.
XThe
X.I man
Xprogram starts up the pager with a
Xleading argument to begin at that section. Both
X.M more 1
Xand
X.M less 1
Xunderstand this particular notation.
XFor the ``man ksh/tilde''
Xexample given above, this would be
X.sp
X.CW
X.ti +.5i
Xless '+/^[ \et]*Tilde Substitution' /usr/man/cat1/ksh.1
X.sp
X.CE
X.PP
XOnce again, perl proved
Xuseful for coding this algorithm concisely. The
Xsubroutine for doing this is given in
XFigure 5. Given an expression such as ``5''
Xor ``tilde'' or ``file'' and a pathname of the man
Xpage,
X.I man
Xloads
Xan array of subsection
Xindex titles and quickly retrieves the proper
Xheader to pass on to the pager. Perl's built-in
X.B grep
Xroutine for selecting from arrays those elements
Xconforming to certain criteria made the coding easy.
X.BF "Locate Subsection by Index"
Xsub find_index {
X local($expr, $path) = @_; # subroutine args
X local(@matches, @ssindex);
X @ssindex = &load_index($path);
X
X if ($expr > 0) { # test for numeric section
X return $ssindex[$expr];
X } else {
X if (@matches = grep (/^$expr/i, @ssindex)) {
X return $matches[0];
X } elsif (@matches = grep (/$expr/i, @ssindex)) {
X return $matches[0];
X } else {
X return '';
X }
X }
X}
X.EF
X.NH 2
XConditional Tbl and Eqn Inclusion
X.PP
XSeveral other relatively minor enhancements were made
Xto the man system in the course of its rewrite.
XOne of these
Xwas to include calls to
X.M eqn 1
Xand
X.M tbl 1
Xwhere appropriate. For instance, the \s-1X11\s0 man pages use
X.I tbl
Xdirectives to construct a number of tables.
XIt was not sufficient to supply
Xthese extra filters for all man pages. Besides the
Xslight performance degradation this would incur, a
Xmore serious problem exists: some systems have man pages that
Xcontain embedded
X.LB .TS
Xand
X.LB .TE
Xdirectives; however, the data between them was not
X.I tbl
Xinput, but rather its output. They have already
Xbeen pre-processed in the unformatted versions.
XTo do so again causes
X.I tbl
Xto complain bitterly, so heuristics to check for this condition
Xwere built into the function that determines which filters
Xare needed.
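X.PP
XThe article does not spell out the heuristics, so the following Python
Xsketch is only a guess at one plausible form. It assumes that genuine
X.I tbl
Xinput never has a troff request (a line starting with a dot) directly
Xafter
X.B .TS ,
Xwhereas already-processed output does. The function name
X\fIneeded_filters\fP is hypothetical.

```python
def needed_filters(source_lines):
    """Return the preprocessors a page seems to need, tbl before eqn."""
    lines = list(source_lines)
    need_tbl = need_eqn = False
    for i, line in enumerate(lines):
        if line.startswith(".TS"):
            following = lines[i + 1] if i + 1 < len(lines) else ""
            # Assumption: raw tbl input continues with options or a format
            # line; a leading dot suggests the table was already expanded.
            if not following.startswith("."):
                need_tbl = True
        elif line.startswith(".EQ"):
            need_eqn = True
    return [name for name, need in (("tbl", need_tbl), ("eqn", need_eqn)) if need]
```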
X.PP
XTo support tables and equations in man pages when viewed on-line,
Xthe output must be run through
X.M col 1
Xto be legible. Unfortunately, this strips the man pages
Xof any bold font changes, which is undesirable because it is
Xoften important to distinguish between bold and italics for
Xclarity. Therefore, before the formatted man page is fed to
X\fIcol\fP, all text in bold (between escape sequences)
Xis converted to character-backspace-character combinations. These
Xcombinations
Xcan be recognized by the user's pager as a character in
Xa bold font, just as underbar-backspace-character is recognized
Xas an italic (or underlined) one. Unfortunately, while
X.I less
Xdoes recognize this convention,
X.I more
Xdoes not. By storing the formatted versions with all escape-sequences
Xremoved, the user's pager can be invoked without a pipe to
X.I ul
Xor
X.I col
Xto fix the reverse line motion directives. This provides the pager with
Xa handle on the pathname of the cat page, allowing users to back up
Xto the start of man pages, even exceptionally long ones, without exiting the
X.I man
Xprogram. This would not be feasible if the pager were being fed
Xfrom a pipe.
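X.PP
XThe bold conversion can be illustrated with a short Python sketch (again,
Xnot the perl of the real program). The escape sequences that mark bold
Xtext depend on the output device and are not given in this paper, so they
Xare passed in as the hypothetical parameters \fIbold_on\fP and
X\fIbold_off\fP.

```python
import re


def overstrike_bold(text, bold_on, bold_off):
    """Rewrite text between the bold markers as char-backspace-char."""
    def embolden(match):
        # Whitespace is left alone; every other character is overstruck.
        return "".join(
            c if c.isspace() else c + "\b" + c
            for c in match.group(1)
        )

    pattern = re.escape(bold_on) + "(.*?)" + re.escape(bold_off)
    return re.sub(pattern, embolden, text, flags=re.DOTALL)
```

XThe markers themselves are dropped from the result, which is what lets
Xthe stored cat page be handed straight to the pager.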
X.NH 2
XTroffing and Previewing Man Pages
X.PP
XNow that many sites have high-quality laser printers
Xand bit-mapped displays, it seemed desirable for
X.I man
Xto understand how to direct
X.I troff
Xoutput to these. A new option, \fB-t\fR,
Xwas added to mean that
X.I troff
Xshould be used instead of
X\fInroff\fR.
XThis way users can easily get pretty-printed versions of
Xtheir man pages.
X.PP
XFor workstation or X-terminal users,
X.I man
Xwill recognize
Xa \s-1TROFF\s0 environment variable or
Xcommand line argument to indicate an
Xalternate program to use for typesetting.
X(This presumes that the program recognizes
X.I troff
Xoptions.) This method often produces more legible output
Xthan
X.I nroff
Xwould, allows the user to stay in their office, and saves
Xtrees as well.
X.NH 2
XSection Ordering
X.PP
XThe same topic can occur in more than one section of
Xthe manual, but
Xnot all users on the system want the same default
Xsection ordering that
X.I man
Xuses to sort these possible pages.
XFor instance,
XC programmers who want to look up the man page for
X.M sleep 3
Xor
X.M stty 3
Xfind that by default,
X.I man
Xgives them
X.M sleep 1
Xand
X.M stty 1
Xinstead. A \s-1FORTRAN\s0 programmer may want to see
X.M system 3f ,
Xbut instead gets
X.M system 3 .
XTo accommodate these needs, the
X.I man
Xprogram will honor a \s-1MANSECT\s0 environment
Xvariable (or a
X.B \-S
Xcommand line switch) containing a list of section suffixes.
XIf subsection or multi-character section ordering
Xis desired, this string should be colon-delimited.
XThe default ordering is ``ln16823457po''.
XA C programmer might set his \s-1MANSECT\s0 to be ``231'' instead to access
Xsubroutines and system calls before commands of the same name.
XA \s-1FORTRAN\s0 programmer might prefer ``3f:2:3:1'' to get
Xat the \s-1FORTRAN\s0 versions of subroutines before the standard
XC versions.
XSections absent from the \s-1MANSECT\s0 have a sorting priority
Xlower than any that are present.
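X.PP
XThe ordering rule can be sketched in Python as follows. The function
Xnames are hypothetical, and the fallback from a subsection such as
X\fB3c\fP to its leading character \fB3\fP is an assumption about how
Xunlisted subsections are ranked, not something stated above.

```python
def parse_mansect(mansect):
    """Split a MANSECT string into an ordered list of section suffixes."""
    if ":" in mansect:
        return [s for s in mansect.split(":") if s]
    return list(mansect)              # no colons: one character per section


def section_rank(section, order):
    """Return the sort priority of `section`; lower values sort first."""
    if section in order:              # exact match, e.g. "3f"
        return order.index(section)
    if section[:1] in order:          # assumption: "3c" falls back to "3"
        return order.index(section[:1])
    return len(order)                 # absent sections sort after all others
```

XSorting candidate sections with
X\fIsorted(sections, key=lambda s: section_rank(s, order))\fP
Xthen yields the precedence in which pages would be offered.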
X.NH 2
XCompressed Man Pages
X.PP
XBecause man pages are \s-1ASCII\s0 text files, they stand to benefit from
Xbeing run through the
X.M compress 1
Xprogram.
XCompressing man pages
Xtypically yields disk space savings of around 60%.
XThe start-up time for decompressing the man page when
Xviewing is not enough to be bothersome. However, running
X.I makewhatis
Xacross compressed man pages takes significantly longer
Xthan running it over uncompressed ones, so some sites may wish to
Xkeep only the formatted pages compressed, not the unformatted
Xones.
X.PP
XTwo different
Xways of indicating compressed man pages seem to exist
Xtoday. One is where the man page itself has an attached
X.B .Z
Xsuffix, yielding pathnames like
X\fI/usr/man/man1/who.1.Z\fR.
XThe other way is to have
Xthe section directory contain the
X.B .Z
Xsuffix
Xand have the files named normally, as in
X\fI/usr/man/man1.Z/who.1\fR.
XEither strategy is supported to ease porting
Xthe program to other systems.
XAll programs dealing with man pages have been updated to
Xunderstand man pages stored in compressed form.
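X.PP
XThe two naming conventions can be handled along these lines. This is a
XPython sketch with hypothetical function names, not the code of the real
Xprograms.

```python
import posixpath


def candidate_paths(root, section, name, ext):
    """List where a raw man page might live, compressed or not."""
    filename = name + "." + ext
    plain = posixpath.join(root, "man" + section, filename)
    return [
        plain,                                                # man1/who.1
        plain + ".Z",                                         # man1/who.1.Z
        posixpath.join(root, "man" + section + ".Z", filename),  # man1.Z/who.1
    ]


def is_compressed(path):
    """True if either the file or its directory carries a .Z suffix."""
    directory, filename = posixpath.split(path)
    return filename.endswith(".Z") or directory.endswith(".Z")
```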
X.NH 2
XAutomated Consistency Checking
X.PP
XAfter receiving a half-dozen or so bug reports regarding
Xnon-existent man pages referenced in \s-1SEE\s0 \s-1ALSO\s0 sections,
Xit became apparent that the only way to verify that all
Xbugs of this nature had really been expurgated would be to automate the process.
XThe
X.I cfman
Xprogram
Xverifies that man pages
Xare mutually consistent in their \s-1SEE\s0 \s-1ALSO\s0 references. It
Xalso reports man pages whose
X.LB .TH
Xline claims the man page is in
Xa different place than
X.I cfman
Xfound it.
X.I Cfman
Xcan locate man pages
Xthat are improperly referenced rather than merely missing. It
Xcan be run on an entire man tree, or on individual files as
Xan aid to developers writing new man pages.
X.BF "Sample \fIcfman\fP run"
Xat.1: cron(8) really in cron(1)
Xbinmail.1: xsend(1) missing
Xdbadd.1: dbm(3) really in dbm(3x)
Xksh.1: exec(2) missing
Xksh.1: signal(2) missing
Xksh.1: ulimit(2) missing
Xksh.1: rand(3) really in rand(3c)
Xksh.1: profile(5) missing
Xld.1: fc(1) really in fc(1f)
Xsccstorcs.1: thinks it's in ci(1)
Xuuencode.1c: atob(n) missing
Xyppasswd.1: mkpasswd(5) missing
Xfstream.3: thinks it's in fstream(3c++)
Xftpd.8c: syslog(8) missing
Xnfmail.8: delivermail(8) missing
Xversatec.8: vpr(1) missing
X.EF
X.PP
XThe amount of output produced by
X.I cfman
Xis startling.
XA portion of the output of a sample run
Xis seen in Figure 6.
XSome of its complaints are relatively harmless, such as
X.I dbm
Xbeing in section
X.B 3x
Xrather than section
X\fB3\fR, because the
X.I man
Xprogram can find entries with the subsection left off.
XHaving inconsistent
X.LB .TH
Xheaders is also harmless, although the printed
Xman pages will have headers that do not reflect their
Xfilenames on the disk.
XHowever, entries that refer to pages that are truly absent, like
X.M exec 2
Xor
X.M delivermail 8 ,
Xmerit closer attention.
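X.PP
XThe core of such a check might look like the Python sketch below. It
Xassumes the \s-1SEE\s0 \s-1ALSO\s0 text has already been reduced to plain
X``name(section)'' references and that \fIknown\fP maps each topic to the
Xset of sections in which it was found; both the function name and that
Xdata layout are hypothetical and not taken from
X.I cfman
Xitself.

```python
import re

# Matches references such as cron(8), dbm(3x) or fstream(3c++).
REF = re.compile(r"([A-Za-z0-9_.+-]+)\s*\(([0-9a-z][0-9A-Za-z+]*)\)")


def check_refs(page, see_also, known):
    """Return complaints about SEE ALSO references in one page."""
    complaints = []
    for name, sect in REF.findall(see_also):
        sections = known.get(name)
        if not sections:
            complaints.append("%s: %s(%s) missing" % (page, name, sect))
        elif sect not in sections:
            actual = sorted(sections)[0]      # report the first place found
            complaints.append(
                "%s: %s(%s) really in %s(%s)" % (page, name, sect, name, actual)
            )
    return complaints
```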
X.NH 2
XMultiple Architecture Support
X.PP
XAs mentioned in the discussion of the need for a \s-1MANPATH\s0,
Xa site may for various reasons wish to maintain several
Xcomplete sets of man pages on the same machine. Of course,
Xa user could know to specify the full pathname of the
Xalternate tree on the command line
Xor set up their environment appropriately, but this is
Xinconvenient. Instead, it is preferable
Xto specify the machine type on the command line and let
Xthe system worry about pathnames.
X.ne 5
XConsider these examples:
X.br
X.CW
X.nf
X.na
X.in +.5i
Xman vax csh
Xapropos sun rpc
Xwhatis tahoe man
X.in -.5i
X.CE
X.ad
X.fi
X.PP
XTo implement this,
Xwhen presented with more than one argument,
X.I man
X(in any of its three guises)
Xchecks to see whether the first non-switch argument
Xis a directory beneath
X.I /usr/man .
XIf so, it automatically adjusts its \s-1MANPATH\s0 to that subdirectory.
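X.PP
XIn outline, the test could be written like this. It is a Python sketch
Xwith hypothetical names; \fIargs\fP is assumed to hold only the
Xnon-switch arguments, and the \fIisdir\fP parameter exists merely so the
Xcheck can be exercised without a real
X.I /usr/man
Xtree.

```python
import os
import posixpath


def adjust_for_arch(args, man_root="/usr/man", isdir=os.path.isdir):
    """Return (manpath_override, remaining_args).

    The override is None when the first argument is not a machine type.
    """
    if len(args) > 1:                         # need a topic after the machine
        candidate = posixpath.join(man_root, args[0])
        if isdir(candidate):
            return candidate, args[1:]
    return None, args
```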
X.PP
XNot all vendors use precisely the same set of
X.M man 7
Xmacros for formatting their man pages. Furthermore, it's
Xhelpful to see in the header of the man page which manual
Xit came from. The
X.I man
Xprogram therefore looks for a local
X.I tmac.an
Xfile in the root of the current man tree for alternate macro
Xdefinitions. If this file exists, it will be used rather than
Xthe system defaults for passing to
X.I nroff
Xor
X.I troff
Xwhen reformatting.
X.NH
XPerformance Analysis
X.PP
XThe
X.I man
Xprogram is one that is often used on the system,
Xso users are sensitive to any significant degradation
Xin response time. Because it is written in perl (an
Xinterpreted language) this was cause for concern.
XOn a \s-1CONVEX C2\s0, the C version runs faster when only
Xone element is present in the \s-1MANPATH\s0.
XHowever, when the \s-1MANPATH\s0 contains four
Xelements, the C version bogs down considerably because of
Xthe large number of
X.M access 2
Xcalls it must make.
X.PP
XThe start-up time on the parsing
Xof the script, now just over 1300 lines long, is around
X0.6 seconds. This time can be reduced by dumping the
Xparse tree that perl generates to disk and executing that instead.
XThe expense of this action is disk space, as the current implementation
Xrequires that the whole perl interpreter be included in the
Xnew executable, not just the parse tree. This method
Xyields performance superior to that of the C version,
Xirrespective of the number of components in the user's \s-1MANPATH\s0,
Xexcept occasionally on the initial run. This is because the
Xprogram needs to be loaded
Xinto memory the first time. If perl itself is installed ``sticky''
Xso it is memory resident, start-up time improves considerably.
XIn any case, the
Xtotal variance (on a \s-1CONVEX\s0) is
Xless than two seconds in the worst case (and often
Xunder one second), so it was deemed acceptable, particularly
Xconsidering the additional functionality the perl version offers.
X.PP
XNothing in the algorithms employed in the
X.I man
Xprogram requires that it be written in perl;
Xit was just easier this way. It could be rewritten in C
Xusing
X.M dbm 3x
Xroutines, although the development time would probably
Xbe much longer.
X.PP
XThe
X.I makewhatis
Xprogram was originally a conglomeration of many calls to various individual
Xutilities such as
X\fIsed\fP,
X\fIexpand\fP,
X\fIsort\fP, and others. The perl rewrite runs in less than half the time
Xof the original, and does a much better job. There are two
Xreasons for the speed increase. The first is the cost of the numerous
X.M exec 2
Xcalls made via the shell script used by the old version of
X.I makewhatis .
XThe second is that
Xperl is optimized for text processing, which is most of what
X.I makewhatis
Xis doing.
X.PP
XTotal development time was only a few weeks,
Xwhich was much shorter than originally anticipated. The short
Xdevelopment cycle was chiefly attributable to
Xthe ease of text processing in perl, the many built-in
Xroutines for doing things that in C would have required
Xextensive library development, and, last but not at all least,
Xthe omission of the compilation stage in the normal edit-compile-test
Xcycle of development when working with non-interpreted languages.
X.NH
XConclusions
X.PP
XThe system described above has been in operation for the last
Xsix months on a large local network consisting of three dozen
X\s-1CONVEX\s0 machines, a token \s-1VAX\s0, quite a few \s-1HP\s0 workstations
Xand servers, and innumerable Sun workstations, all running different
Xflavors of \s-1UNIX\s0. Despite this heterogeneity,
Xthe same code runs on all systems without alterations.
XFew problems have been seen, and those that did arise were quickly
Xfixed in the scripts, which could be immediately redistributed
Xto the network. The principal project goals of improved functionality,
Xextensibility, and execution time were adequately met, and the
Xexperience of rewriting a set of standard \s-1UNIX\s0 utilities
Xin perl was an educational one.
XMan pages stand a much better chance of being internally consistent
Xwith each other.
XResponse from the user and development community has
Xbeen favorable. They have
Xbeen relieved by the many bug fixes and pleasantly surprised
Xby the new functionality. The suite of man programs will replace
Xthe old man system in the next release of \s-1CONVEX\s0 utilities.
X.\" Should be .BB here but that seems to mutilate my last BF figure
X.sp 3
X.QP
X.I
X.SM
XTom Christiansen left the University of Wisconsin in 1987 with an \s-1MS-CS\s0,
Xafter six years there as a system administrator, to join
X\s-1CONVEX\s0
XComputer Corporation in Richardson, Texas.
XHe is a software development engineer
Xin the Internal Tools Group there, designing software tools
Xto streamline software development and systems administration
Xand to improve overall system security.
X.BE
SHAR_EOF
if test 34978 -ne "`wc -c < 'man.ms'`"
then
echo shar: "error transmitting 'man.ms'" '(should have been 34978 characters)'
fi
chmod 664 'man.ms'
fi
echo shar: "extracting 'COPYING'" '(151 characters)'
if test -f 'COPYING'
then
echo shar: "will not over-write existing file 'COPYING'"
else
sed 's/^ X//' << \SHAR_EOF > 'COPYING'
X# You are free to use, modify, and redistribute these scripts
X# as you wish for non-commercial purposes provided that this
X# notice remains intact.
SHAR_EOF
if test 151 -ne "`wc -c < 'COPYING'`"
then
echo shar: "error transmitting 'COPYING'" '(should have been 151 characters)'
fi
chmod 664 'COPYING'
fi
echo shar: "extracting 'man'" '(39119 characters)'
if test -f 'man'
then
echo shar: "will not over-write existing file 'man'"
else
sed 's/^ X//' << \SHAR_EOF > 'man'
X#!/usr/local/bin/perl
X#
X# man - perl rewrite of man system
X# tom christiansen <tchrist at convex.com>
X#
X# Copyright 1990 Convex Computer Corporation.
X# All rights reserved.
X#
X# --------------------------------------------------------------------------
X# begin configuration section
X#
X# this should be adequate for CONVEX systems. if you copy this script
X# to non-CONVEX systems, or have a particularly outre local setup, you may
X# wish to alter some of the defaults.
X# --------------------------------------------------------------------------
X
X$PAGER = $ENV{'PAGER'} || 'more';
X
X# assume "less" pagers want -sf flags, all others must accept -s.
X# note: some less's prefer -r to -f. you might also add -i if supported.
X#
X$is_less = $PAGER =~ /^\S*less(\s+-\S.*)?$/;
X$PAGER .= $is_less ? ' -si' : ' -s'; # add -f if using "ul"
X
X# man roots to look in; you would really rather use a separate tree than
X# manl and mann! see %SECTIONS and $MANALT if you do.
X$MANPATH = &config_path;
X
X# default section precedence
X$MANSECT = $ENV{'MANSECT'} || 'ln16823457po';
X
X# colons optional unless you have multi-char section names
X# note that HP systems want this:
X# $MANSECT = $ENV{'MANSECT'} || '1:1m:6:8:2:3:4:5:7';
X
X# alternate architecture man pages in