DVIDOC, a postprocessor for DVI files (Part 1 of 2)

SKY ken at rochester.ARPA
Fri Dec 26 08:59:28 AEST 1986


DVIDOC is a postprocessor for TeX DVI files that generates output
for character devices like terminals and line printers.

- It isn't really a previewer because the output will not have page
breaks in the same place as the high resolution version.  It however
does get interword spacing right. It might be useful for documents that
need to be generated in two versions, for high-resolution devices and
for line printers. Briefly, you put \input docmac at the top of the
file before TeXing.  (The macro file tells TeX to use a fixed width
font everywhere.) You remove this command before you do the high
resolution version.

- It was modified from its last incarnation as a Pyramid version.  The
comments indicate that it originated from OSU in 1983. The original is
based on Dvitype.

- It has only been tested in 4.2 BSD and Sun 2.0 systems. It should not
be difficult to figure out what changes need to be made for your local
environment.  You will need the following programs: tangle, pxp (if
your Pascal doesn't have an otherwise clause for case statements), pc,
pltotf (to generate the tfm file. Yes this is probably ancient but we
still have Unix TeX 1.3), weave and tex (if you want to read the
documentation).

- Since this was a learning experience in WEB for me, I won't be good
at answering questions about the innards. But I will receive bug
reports and fixes gratefully.

- The program is based largely on the work of Don Knuth, David Fuchs
and Howard Trickey. If anything broke, it was probably my hack.

	Ken
	ken at rochester.arpa or ken at cs.rochester.edu
	..!rochester!ken
	January 1987

The distribution consists of two parts to stay under the 64k limit.
Paste the two parts together before unsharing.

And now some questions for TeXperts:

How can I get TeX to use the doc font everywhere? My LaTeX dvi output
still references some high resolution fonts.

How can I get the vertical spacing correct?

#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create:
#	README
#	Makefile
#	dvidoc.1
#	dvidoc.web
#	dvityext.c
#	dvityext.h
#	texpaths.h
#	doc.pl
# This archive created: Thu Dec 25 17:36:15 1986
# By:	SKY ()
export PATH; PATH=/bin:/usr/bin:$PATH
echo shar: "extracting 'README'" '(1600 characters)'
if test -f 'README'
then
	echo shar: "will not over-write existing file 'README'"
else
cat << \SHAR_EOF > 'README'
DVIDOC is a postprocessor for TeX DVI files that generates output
for character devices like terminals and line printers.

- It isn't really a previewer because the output will not have page
breaks in the same place as the high resolution version.  It however
does get interword spacing right. It might be useful for documents that
need to be generated in two versions, for high-resolution devices and
for line printers. Briefly, you put \input docmac at the top of the
file before TeXing.  (The macro file tells TeX to use a fixed width
font everywhere.) You remove this command before you do the high
resolution version.

- It was modified from its last incarnation as a Pyramid version.  The
comments indicate that it originated from OSU in 1983. The original is
based on Dvitype.

- It has only been tested in 4.2 BSD and Sun 2.0 systems. It should not
be difficult to figure out what changes need to be made for your local
environment.  You will need the following programs: tangle, pxp (if
your Pascal doesn't have an otherwise clause for case statements), pc,
pltotf (to generate the tfm file. Yes this is probably ancient but we
still have Unix TeX 1.3), weave and tex (if you want to read the
documentation).

- Since this was a learning experience in WEB for me, I won't be good
at answering questions about the innards. But I will receive bug
reports and fixes gratefully.

- The program is based largely on the work of Don Knuth, David Fuchs
and Howard Trickey. If anything broke, it was probably my hack.

	Ken
	ken at rochester.arpa or ken at cs.rochester.edu
	..!rochester!ken
	January 1987
SHAR_EOF
if test 1600 -ne "`wc -c < 'README'`"
then
	echo shar: "error transmitting 'README'" '(should have been 1600 characters)'
fi
fi
echo shar: "extracting 'Makefile'" '(610 characters)'
if test -f 'Makefile'
then
	echo shar: "will not over-write existing file 'Makefile'"
else
cat << \SHAR_EOF > 'Makefile'
DEST=/usr/local/bin
LIB=/usr/lib/tex/fonts
MACROS=/usr/lib/tex/macros
CFLAGS=-g
PFLAGS=-g
LDFLAGS=-g

.SUFFIXES: .web .o .dvi .pl .tfm
.web.o:
	tangle $<
	pxp -O -f $*.p > tmp
	mv tmp $*.p
	$(PC) -c $(PFLAGS) $*.p
.web.dvi:
	weave $<
	tex $*.tex
	rm $*.tex
.pl.tfm:
	pltotf $< $*.tfm

all: dvidoc doc.tfm dvidoc.dvi

dvidoc: dvidoc.o dvityext.o
	$(PC) -o $@ $(LDFLAGS) dvidoc.o dvityext.o

install: all
	install -m 644 doc.tfm $(LIB)
	install -m 755 dvidoc $(DEST)
	install -m 644 -c docmac.tex $(MACROS)

clean:
	-rm -f *.o dvidoc.p dvidoc dvidoc.log dvidoc.tex dvidoc.pool \
	CONTENTS.tex dvidoc.dvi doc.tfm
SHAR_EOF
if test 610 -ne "`wc -c < 'Makefile'`"
then
	echo shar: "error transmitting 'Makefile'" '(should have been 610 characters)'
fi
fi
echo shar: "extracting 'dvidoc.1'" '(1292 characters)'
if test -f 'dvidoc.1'
then
	echo shar: "will not over-write existing file 'dvidoc.1'"
else
cat << \SHAR_EOF > 'dvidoc.1'
.\" $Header
.if t .ds TX T\h'-.1667m'\v'.22m'E\h'-.125m'\v'-.22m'X
.if n .ds TX TeX
.if t .ds LX L\v'-.22m'a\v'.22m'T\h'-.1667m'\v'.22m'E\h'-.125m'\v'-.22m'X
.if n .ds LX LaTeX
.TH DVIDOC 1
.UC
.SH NAME
dvidoc \- print TeX DVI files out on a file
.SH SYNOPSIS
dvidoc
.I dvifile
.I outputfile
.SH DESCRIPTION
.IR Dvidoc (1)
attempts to print the contents of a DVI file on a regular ASCII file.
The output file will not be an exact representation of the DVI file
since \*(TX output is designed for a high-resolution device.
The output will be an approximate representation of paragraph breaks
and it provides a simple quick means to preview output at your terminal.
Math mode and tables are not handled well.
Also, page breaks will be different when output to a high-resolution device.
.PP
To use 
.I dvidoc
you must 
.I \input
the macro file
.I docmac.tex
in the beginning of your \*(TX or \*(LX document.
This file redefines the width of the roman, italic, boldface and typewriter
fonts.
When you wish to output the final document to the laser printer or other
high resolution device you should remove the include file
.I docmac.tex
and re\*(TX your document.
.SH FILES
.nf
.ta \w'/usr/lib/tex/macros/docmac.tex   'u
/usr/lib/tex/macros/docmac.tex	Input macro package
.fi
.SH SEE ALSO
tex(1)
SHAR_EOF
if test 1292 -ne "`wc -c < 'dvidoc.1'`"
then
	echo shar: "error transmitting 'dvidoc.1'" '(should have been 1292 characters)'
fi
fi
echo shar: "extracting 'dvidoc.web'" '(91548 characters)'
if test -f 'dvidoc.web'
then
	echo shar: "will not over-write existing file 'dvidoc.web'"
else
cat << \SHAR_EOF > 'dvidoc.web'
% This is DVIDOC, a TeX device driver for text files.  It was written
% at OSU in April, 1983, by modifying the TeX utility DVItype.
% Modified to work under 4.3 BSD by Ken Yap (ken at rochester)

% Here is TeX material that gets inserted after \input webmac
\def\hang{\hangindent 3em\indent\ignorespaces}
\font\ninerm=amr9
\let\mc=\ninerm % medium caps for names like SAIL
\def\PASCAL{Pascal}

\def\(#1){} % this is used to make section names sort themselves better
\def\9#1{} % this is used for sort keys in the index

\def\title{DVIDOC}
\def\contentspagenumber{1}
\def\topofcontents{\null
  \def\titlepage{F} % include headline on the contents page
  \def\rheader{\mainfont\hfil \contentspagenumber}
  \vfill
  \centerline{\titlefont The {\ttitlefont DVIDOC} processor}
  \vskip 15pt
  \centerline{(Version 1, January 1987)}
  \vfill}
\def\botofcontents{\vfill
  \centerline{\hsize 5in\baselineskip9pt
    \vbox{\ninerm\noindent
   `\TeX' is a
    trademark of the American Mathematical Society.}}}
\pageno=\contentspagenumber \advance\pageno by 1

@* Introduction.
The \.{DVIDOC} utility program reads binary device-independent (``\.{DVI}'')
files that are produced by document compilers such as \TeX, and 
approximates the intended document as a text file suitable for typing at
a terminal or on a line printer.

This program is based on the program \.{DVItype}, which was written by
Donald Knuth and David Fuchs.
It contained a great deal of code checking for malformed \.{DVI}
files.  Most of that code remains in \.{DVIDOC}, not because it is
important (we trust \TeX) to produce correct \.{DVI} files), but
because is was easier not to disturb the logic in modifying
\.{DVItype} to produce \.{DVIDOC}.

The |banner| string defined here should be changed whenever \.{DVIDOC}
gets modified.

@d banner=='This is DVIDOC, Version 1 for 4.3 BSD' {printed when the program starts}

@ This program is written in standard \PASCAL, except where it is necessary
to use extensions; for example, \.{DVIDOC} must read files whose names
are dynamically specified, and that would be impossible in pure \PASCAL.
All places where nonstandard constructions are used have been listed in
the index under ``system dependencies.''
@!@^system dependencies@>

One of the extensions to standard \PASCAL\ that we shall deal with is the
ability to move to a random place in a binary file.

Another extension is to use a default |case| as in \.{TANGLE}, \.{WEAVE},
etc.

@d othercases == others: {default for cases not listed explicitly}
@d endcases == @+end {follows the default case in an extended |case| statement}
@f othercases == else
@f endcases == end

@ The binary input comes from |dvi_file|, and the symbolic output is written
on |doc_file| file. The term |print| is used instead of
|write| when this program writes on |output|, so that all such output
could easily be redirected if desired.

@d print(#)==write(#)
@d print_ln(#)==write_ln(#)
@d error(#)==message(' ',#)

@d term_in==input
@d term_out==output

@d eof(#)==eof_dvi_file

@p program DVIDOC(input, output, @!dvi_file,@!doc_file);
label @<Labels in the outer block@>@/
const @<Constants in the outer block@>@/
type @<Types in the outer block@>@/
var@?@<Globals in the outer block@>@/
@\@=#include "dvityext.h"@>@\ {declarations for external C procedures}
procedure initialize; {this procedure gets things started properly}
var i:integer;
  begin @/
  setpaths;
  @<Set initial values@>@/
  end;

@ If the program has to stop prematurely, it goes to the
`|final_end|'. Another label, |done|, is used when stopping normally.

@d final_end=9999 {label for the end of it all}
@d done=30 {go here when finished with a subtask}

@<Labels...@>=final_end,done;

@ The following parameters can be changed at compile time to extend or
reduce \.{DVIDOC}'s capacity.

@<Constants...@>=
@!max_fonts=100; {maximum number of distinct fonts per \.{DVI} file}
@!max_widths=10000; {maximum number of different characters among all fonts}
@!terminal_line_length=150; {maximum number of characters input in a single
  line of input from the terminal}
@!stack_size=100; {\.{DVI} files shouldn't |push| beyond this depth}
@!name_size=1000; {total length of all font file names}
@!name_length=100; {a file name shouldn't be longer than this}
@!page_width_max=132; {maximum number of characters per line in the document}
@!page_length_max=88; {maximum number of lines per page in the document}

@ Here are some macros for common programming idioms.

@d incr(#) == #:=#+1 {increase a variable by unity}
@d decr(#) == #:=#-1 {decrease a variable by unity}
@d do_nothing == {empty statement}

@ If the \.{DVI} file is badly malformed, the whole process must be aborted;
\.{DVIDOC} will give up, after issuing an error message about the symptoms
that were noticed.

Such errors might be discovered inside of subroutines inside of subroutines,
so a procedure called |jump_out| has been introduced. This procedure, which
simply transfers control to the label |final_end| at the end of the program,
contains the only non-local |goto| statement in \.{DVIDOC}.
@^system dependencies@>

@d abort(#)==begin message(' ',#); jump_out;
    end
@d bad_dvi(#)==abort('Bad DVI file: ',#,'!')
@.Bad DVI file@>

@p procedure jump_out;
begin goto final_end;
end;

@* The character set.
Like all programs written with the  \.{WEB} system, \.{DVIDOC} can be
used with any character set. But it uses ASCII code internally, because
the programming for portable input-output is easier when a fixed internal
code is used, and because \.{DVI} files use ASCII code for file names
and certain other strings.

The next few modules of \.{DVIDOC} have therefore been copied from the
analogous ones in the \.{WEB} system routines. They have been considerably
simplified, since \.{DVIDOC} need not deal with the controversial
ASCII codes less than @'40. If such codes appear in the \.{DVI} file,
they will be printed as question marks.

@<Types...@>=
@!ASCII_code=" ".."~"; {a subrange of the integers}

@ The original \PASCAL\ compiler was designed in the late 60s, when six-bit
character sets were common, so it did not make provision for lower case
letters. Nowadays, of course, we need to deal with both upper and lower case
alphabets in a convenient way, especially in a program like \.{DVIDOC}.
So we shall assume that the \PASCAL\ system being used for \.{DVIDOC}
has a character set containing at least the standard visible characters
of ASCII code (|"!"| through |"~"|).

Some \PASCAL\ compilers use the original name |char| for the data type
associated with the characters in text files, while other \PASCAL s
consider |char| to be a 64-element subrange of a larger data type that has
some other name.  In order to accommodate this difference, we shall use
the name |text_char| to stand for the data type of the characters in the
output file.  We shall also assume that |text_char| consists of
the elements |chr(first_text_char)| through |chr(last_text_char)|,
inclusive. The following definitions should be adjusted if necessary.
@^system dependencies@>

@d text_char == char {the data type of characters in text files}
@d first_text_char=0 {ordinal number of the smallest element of |text_char|}
@d last_text_char=127 {ordinal number of the largest element of |text_char|}

@<Types...@>=
@!text_file=packed file of text_char;

@ The \.{DVIDOC} processor converts between ASCII code and
the user's external character set by means of arrays |xord| and |xchr|
that are analogous to \PASCAL's |ord| and |chr| functions.

@<Globals...@>=
@!xord: array [text_char] of ASCII_code;
  {specifies conversion of input characters}
@!xchr: array [0..255] of text_char;
  {specifies conversion of output characters}

@ Under our assumption that the visible characters of standard ASCII are
all present, the following assignment statements initialize the
|xchr| array properly, without needing any system-dependent changes.

@<Set init...@>=
for i:=0 to @'37 do xchr[i]:='?';
xchr[@'40]:=' ';
xchr[@'41]:='!';
xchr[@'42]:='"';
xchr[@'43]:='#';
xchr[@'44]:='$';
xchr[@'45]:='%';
xchr[@'46]:='&';
xchr[@'47]:='''';@/
xchr[@'50]:='(';
xchr[@'51]:=')';
xchr[@'52]:='*';
xchr[@'53]:='+';
xchr[@'54]:=',';
xchr[@'55]:='-';
xchr[@'56]:='.';
xchr[@'57]:='/';@/
xchr[@'60]:='0';
xchr[@'61]:='1';
xchr[@'62]:='2';
xchr[@'63]:='3';
xchr[@'64]:='4';
xchr[@'65]:='5';
xchr[@'66]:='6';
xchr[@'67]:='7';@/
xchr[@'70]:='8';
xchr[@'71]:='9';
xchr[@'72]:=':';
xchr[@'73]:=';';
xchr[@'74]:='<';
xchr[@'75]:='=';
xchr[@'76]:='>';
xchr[@'77]:='?';@/
xchr[@'100]:='@@';
xchr[@'101]:='A';
xchr[@'102]:='B';
xchr[@'103]:='C';
xchr[@'104]:='D';
xchr[@'105]:='E';
xchr[@'106]:='F';
xchr[@'107]:='G';@/
xchr[@'110]:='H';
xchr[@'111]:='I';
xchr[@'112]:='J';
xchr[@'113]:='K';
xchr[@'114]:='L';
xchr[@'115]:='M';
xchr[@'116]:='N';
xchr[@'117]:='O';@/
xchr[@'120]:='P';
xchr[@'121]:='Q';
xchr[@'122]:='R';
xchr[@'123]:='S';
xchr[@'124]:='T';
xchr[@'125]:='U';
xchr[@'126]:='V';
xchr[@'127]:='W';@/
xchr[@'130]:='X';
xchr[@'131]:='Y';
xchr[@'132]:='Z';
xchr[@'133]:='[';
xchr[@'134]:='\';
xchr[@'135]:=']';
xchr[@'136]:='^';
xchr[@'137]:='_';@/
xchr[@'140]:='`';
xchr[@'141]:='a';
xchr[@'142]:='b';
xchr[@'143]:='c';
xchr[@'144]:='d';
xchr[@'145]:='e';
xchr[@'146]:='f';
xchr[@'147]:='g';@/
xchr[@'150]:='h';
xchr[@'151]:='i';
xchr[@'152]:='j';
xchr[@'153]:='k';
xchr[@'154]:='l';
xchr[@'155]:='m';
xchr[@'156]:='n';
xchr[@'157]:='o';@/
xchr[@'160]:='p';
xchr[@'161]:='q';
xchr[@'162]:='r';
xchr[@'163]:='s';
xchr[@'164]:='t';
xchr[@'165]:='u';
xchr[@'166]:='v';
xchr[@'167]:='w';@/
xchr[@'170]:='x';
xchr[@'171]:='y';
xchr[@'172]:='z';
xchr[@'173]:='{';
xchr[@'174]:='|';
xchr[@'175]:='}';
xchr[@'176]:='~';
for i:=@'177 to 255 do xchr[i]:='?';

@ The following system-independent code makes the |xord| array contain a
suitable inverse to the information in |xchr|.

@<Set init...@>=
for i:=first_text_char to last_text_char do xord[chr(i)]:=@'40;
for i:=" " to "~" do xord[xchr[i]]:=i;

@* Device-independent file format.
Before we get into the details of \.{DVItype}, we need to know exactly
what \.{DVI} files are. The form of such files was designed by David R.
@^Fuchs, David Raymond@>
Fuchs in 1979. Almost any reasonable typesetting device can be driven by
a program that takes \.{DVI} files as input, and dozens of such
\.{DVI}-to-whatever programs have been written. Thus, it is possible to
print the output of document compilers like \TeX\ on many different kinds
of equipment.

A \.{DVI} file is a stream of 8-bit bytes, which may be regarded as a
series of commands in a machine-like language. The first byte of each command
is the operation code, and this code is followed by zero or more bytes
that provide parameters to the command. The parameters themselves may consist
of several consecutive bytes; for example, the `|set_rule|' command has two
parameters, each of which is four bytes long. Parameters are usually
regarded as nonnegative integers; but four-byte-long parameters,
and shorter parameters that denote distances, can be
either positive or negative. Such parameters are given in two's complement
notation. For example, a two-byte-long distance parameter has a value between
$-2^{15}$ and $2^{15}-1$.
@.DVI {\rm files}@>

A \.{DVI} file consists of a ``preamble,'' followed by a sequence of one
or more ``pages,'' followed by a ``postamble.'' The preamble is simply a
|pre| command, with its parameters that define the dimensions used in the
file; this must come first.  Each ``page'' consists of a |bop| command,
followed by any number of other commands that tell where characters are to
be placed on a physical page, followed by an |eop| command. The pages
appear in the order that they were generated, not in any particular
numerical order. If we ignore |nop| commands and \\{fnt\_def} commands
(which are allowed between any two commands in the file), each |eop|
command is immediately followed by a |bop| command, or by a |post|
command; in the latter case, there are no more pages in the file, and the
remaining bytes form the postamble.  Further details about the postamble
will be explained later.

Some parameters in \.{DVI} commands are ``pointers.'' These are four-byte
quantities that give the location number of some other byte in the file;
the first byte is number~0, then comes number~1, and so on. For example,
one of the parameters of a |bop| command points to the previous |bop|;
this makes it feasible to read the pages in backwards order, in case the
results are being directed to a device that stacks its output face up.
Suppose the preamble of a \.{DVI} file occupies bytes 0 to 99. Now if the
first page occupies bytes 100 to 999, say, and if the second
page occupies bytes 1000 to 1999, then the |bop| that starts in byte 1000
points to 100 and the |bop| that starts in byte 2000 points to 1000. (The
very first |bop|, i.e., the one that starts in byte 100, has a pointer of $-1$.)

@ The \.{DVI} format is intended to be both compact and easily interpreted
by a machine. Compactness is achieved by making most of the information
implicit instead of explicit. When a \.{DVI}-reading program reads the
commands for a page, it keeps track of several quantities: (a)~The current
font |f| is an integer; this value is changed only
by \\{fnt} and \\{fnt\_num} commands. (b)~The current position on the page
is given by two numbers called the horizontal and vertical coordinates,
|h| and |v|. Both coordinates are zero at the upper left corner of the page;
moving to the right corresponds to increasing the horizontal coordinate, and
moving down corresponds to increasing the vertical coordinate. Thus, the
coordinates are essentially Cartesian, except that vertical directions are
flipped; the Cartesian version of |(h,v)| would be |(h,-v)|.  (c)~The
current spacing amounts are given by four numbers |w|, |x|, |y|, and |z|,
where |w| and~|x| are used for horizontal spacing and where |y| and~|z|
are used for vertical spacing. (d)~There is a stack containing
|(h,v,w,x,y,z)| values; the \.{DVI} commands |push| and |pop| are used to
change the current level of operation. Note that the current font~|f| is
not pushed and popped; the stack contains only information about
positioning.

The values of |h|, |v|, |w|, |x|, |y|, and |z| are signed integers having up
to 32 bits, including the sign. Since they represent physical distances,
there is a small unit of measurement such that increasing |h| by~1 means
moving a certain tiny distance to the right. The actual unit of
measurement is variable, as explained below.

@ Here is a list of all the commands that may appear in a \.{DVI} file. Each
command is specified by its symbolic name (e.g., |bop|), its opcode byte
(e.g., 139), and its parameters (if any). The parameters are followed
by a bracketed number telling how many bytes they occupy; for example,
`|p[4]|' means that parameter |p| is four bytes long.

\yskip\hang|set_char_0| 0. Typeset character number~0 from font~|f|
such that the reference point of the character is at |(h,v)|. Then
increase |h| by the width of that character. Note that a character may
have zero or negative width, so one cannot be sure that |h| will advance
after this command; but |h| usually does increase.

\yskip\hang|set_char_1| through |set_char_127| (opcodes 1 to 127).
Do the operations of |set_char_0|; but use the character whose number
matches the opcode, instead of character~0.

\yskip\hang|set1| 128 |c[1]|. Same as |set_char_0|, except that character
number~|c| is typeset. \TeX82 uses this command for characters in the
range |128<=c<256|.

\yskip\hang|set2| 129 |c[2]|. Same as |set1|, except that |c|~is two
bytes long, so it is in the range |0<=c<65536|. \TeX82 never uses this
command, which is intended for processors that deal with oriental languages;
but \.{DVItype} will allow character codes greater than 255, assuming that
they all have the same width as the character whose code is $c \bmod 256$.
@^oriental characters@>@^Chinese characters@>@^Japanese characters@>

\yskip\hang|set3| 130 |c[3]|. Same as |set1|, except that |c|~is three
bytes long, so it can be as large as $2^{24}-1$.

\yskip\hang|set4| 131 |c[4]|. Same as |set1|, except that |c|~is four
bytes long, possibly even negative. Imagine that.

\yskip\hang|set_rule| 132 |a[4]| |b[4]|. Typeset a solid black rectangle
of height |a| and width |b|, with its bottom left corner at |(h,v)|. Then
set |h:=h+b|. If either |a<=0| or |b<=0|, nothing should be typeset. Note
that if |b<0|, the value of |h| will decrease even though nothing else happens.
Programs that typeset from \.{DVI} files should be careful to make the rules
line up carefully with digitized characters, as explained in connection with
the |rule_pixels| subroutine below.

\yskip\hang|put1| 133 |c[1]|. Typeset character number~|c| from font~|f|
such that the reference point of the character is at |(h,v)|. (The `put'
commands are exactly like the `set' commands, except that they simply put out a
character or a rule without moving the reference point afterwards.)

\yskip\hang|put2| 134 |c[2]|. Same as |set2|, except that |h| is not changed.

\yskip\hang|put3| 135 |c[3]|. Same as |set3|, except that |h| is not changed.

\yskip\hang|put4| 136 |c[4]|. Same as |set4|, except that |h| is not changed.

\yskip\hang|put_rule| 137 |a[4]| |b[4]|. Same as |set_rule|, except that
|h| is not changed.

\yskip\hang|nop| 138. No operation, do nothing. Any number of |nop|'s
may occur between \.{DVI} commands, but a |nop| cannot be inserted between
a command and its parameters or between two parameters.

\yskip\hang|bop| 139 $c_0[4]$ $c_1[4]$ $\ldots$ $c_9[4]$ $p[4]$. Beginning
of a page: Set |(h,v,w,x,y,z):=(0,0,0,0,0,0)| and set the stack empty. Set
the current font |f| to an undefined value.  The ten $c_i$ parameters can
be used to identify pages, if a user wants to print only part of a \.{DVI}
file; \TeX82 gives them the values of \.{\\count0} $\ldots$ \.{\\count9}
at the time \.{\\shipout} was invoked for this page.  The parameter |p|
points to the previous |bop| command in the file, where the first |bop|
has $p=-1$.

\yskip\hang|eop| 140.  End of page: Print what you have read since the
previous |bop|. At this point the stack should be empty. (The \.{DVI}-reading
programs that drive most output devices will have kept a buffer of the
material that appears on the page that has just ended. This material is
largely, but not entirely, in order by |v| coordinate and (for fixed |v|) by
|h|~coordinate; so it usually needs to be sorted into some order that is
appropriate for the device in question. \.{DVItype} does not do such sorting.)

\yskip\hang|push| 141. Push the current values of |(h,v,w,x,y,z)| onto the
top of the stack; do not change any of these values. Note that |f| is
not pushed.

\yskip\hang|pop| 142. Pop the top six values off of the stack and assign
them to |(h,v,w,x,y,z)|. The number of pops should never exceed the number
of pushes, since it would be highly embarrassing if the stack were empty
at the time of a |pop| command.

\yskip\hang|right1| 143 |b[1]|. Set |h:=h+b|, i.e., move right |b| units.
The parameter is a signed number in two's complement notation, |-128<=b<128|;
if |b<0|, the reference point actually moves left.

\yskip\hang|right2| 144 |b[2]|. Same as |right1|, except that |b| is a
two-byte quantity in the range |-32768<=b<32768|.

\yskip\hang|right3| 145 |b[3]|. Same as |right1|, except that |b| is a
three-byte quantity in the range |@t$-2^{23}$@><=b<@t$2^{23}$@>|.

\yskip\hang|right4| 146 |b[4]|. Same as |right1|, except that |b| is a
four-byte quantity in the range |@t$-2^{31}$@><=b<@t$2^{31}$@>|.

\yskip\hang|w0| 147. Set |h:=h+w|; i.e., move right |w| units. With luck,
this parameterless command will usually suffice, because the same kind of motion
will occur several times in succession; the following commands explain how
|w| gets particular values.

\yskip\hang|w1| 148 |b[1]|. Set |w:=b| and |h:=h+b|. The value of |b| is a
signed quantity in two's complement notation, |-128<=b<128|. This command
changes the current |w|~spacing and moves right by |b|.

\yskip\hang|w2| 149 |b[2]|. Same as |w1|, but |b| is a two-byte-long
parameter, |-32768<=b<32768|.

\yskip\hang|w3| 150 |b[3]|. Same as |w1|, but |b| is a three-byte-long
parameter, |@t$-2^{23}$@><=b<@t$2^{23}$@>|.

\yskip\hang|w4| 151 |b[4]|. Same as |w1|, but |b| is a four-byte-long
parameter, |@t$-2^{31}$@><=b<@t$2^{31}$@>|.

\yskip\hang|x0| 152. Set |h:=h+x|; i.e., move right |x| units. The `|x|'
commands are like the `|w|' commands except that they involve |x| instead
of |w|.

\yskip\hang|x1| 153 |b[1]|. Set |x:=b| and |h:=h+b|. The value of |b| is a
signed quantity in two's complement notation, |-128<=b<128|. This command
changes the current |x|~spacing and moves right by |b|.

\yskip\hang|x2| 154 |b[2]|. Same as |x1|, but |b| is a two-byte-long
parameter, |-32768<=b<32768|.

\yskip\hang|x3| 155 |b[3]|. Same as |x1|, but |b| is a three-byte-long
parameter, |@t$-2^{23}$@><=b<@t$2^{23}$@>|.

\yskip\hang|x4| 156 |b[4]|. Same as |x1|, but |b| is a four-byte-long
parameter, |@t$-2^{31}$@><=b<@t$2^{31}$@>|.

\yskip\hang|down1| 157 |a[1]|. Set |v:=v+a|, i.e., move down |a| units.
The parameter is a signed number in two's complement notation, |-128<=a<128|;
if |a<0|, the reference point actually moves up.

\yskip\hang|down2| 158 |a[2]|. Same as |down1|, except that |a| is a
two-byte quantity in the range |-32768<=a<32768|.

\yskip\hang|down3| 159 |a[3]|. Same as |down1|, except that |a| is a
three-byte quantity in the range |@t$-2^{23}$@><=a<@t$2^{23}$@>|.

\yskip\hang|down4| 160 |a[4]|. Same as |down1|, except that |a| is a
four-byte quantity in the range |@t$-2^{31}$@><=a<@t$2^{31}$@>|.

\yskip\hang|y0| 161. Set |v:=v+y|; i.e., move down |y| units. With luck,
this parameterless command will usually suffice, because the same kind of motion
will occur several times in succession; the following commands explain how
|y| gets particular values.

\yskip\hang|y1| 162 |a[1]|. Set |y:=a| and |v:=v+a|. The value of |a| is a
signed quantity in two's complement notation, |-128<=a<128|. This command
changes the current |y|~spacing and moves down by |a|.

\yskip\hang|y2| 163 |a[2]|. Same as |y1|, but |a| is a two-byte-long
parameter, |-32768<=a<32768|.

\yskip\hang|y3| 164 |a[3]|. Same as |y1|, but |a| is a three-byte-long
parameter, |@t$-2^{23}$@><=a<@t$2^{23}$@>|.

\yskip\hang|y4| 165 |a[4]|. Same as |y1|, but |a| is a four-byte-long
parameter, |@t$-2^{31}$@><=a<@t$2^{31}$@>|.

\yskip\hang|z0| 166. Set |v:=v+z|; i.e., move down |z| units. The `|z|' commands
are like the `|y|' commands except that they involve |z| instead of |y|.

\yskip\hang|z1| 167 |a[1]|. Set |z:=a| and |v:=v+a|. The value of |a| is a
signed quantity in two's complement notation, |-128<=a<128|. This command
changes the current |z|~spacing and moves down by |a|.

\yskip\hang|z2| 168 |a[2]|. Same as |z1|, but |a| is a two-byte-long
parameter, |-32768<=a<32768|.

\yskip\hang|z3| 169 |a[3]|. Same as |z1|, but |a| is a three-byte-long
parameter, |@t$-2^{23}$@><=a<@t$2^{23}$@>|.

\yskip\hang|z4| 170 |a[4]|. Same as |z1|, but |a| is a four-byte-long
parameter, |@t$-2^{31}$@><=a<@t$2^{31}$@>|.

\yskip\hang|fnt_num_0| 171. Set |f:=0|. Font 0 must previously have been
defined by a \\{fnt\_def} instruction, as explained below.

\yskip\hang|fnt_num_1| through |fnt_num_63| (opcodes 172 to 234). Set
|f:=1|, \dots, |f:=63|, respectively.

\yskip\hang|fnt1| 235 |k[1]|. Set |f:=k|. \TeX82 uses this command for font
numbers in the range |64<=k<256|.

\yskip\hang|fnt2| 236 |k[2]|. Same as |fnt1|, except that |k|~is two
bytes long, so it is in the range |0<=k<65536|. \TeX82 never generates this
command, but large font numbers may prove useful for specifications of
color or texture, or they may be used for special fonts that have fixed
numbers in some external coding scheme.

\yskip\hang|fnt3| 237 |k[3]|. Same as |fnt1|, except that |k|~is three
bytes long, so it can be as large as $2^{24}-1$.

\yskip\hang|fnt4| 238 |k[4]|. Same as |fnt1|, except that |k|~is four
bytes long; this is for the really big font numbers (and for the negative ones).

\yskip\hang|xxx1| 239 |k[1]| |x[k]|. This command is undefined in
general; it functions as a $(k+2)$-byte |nop| unless special \.{DVI}-reading
programs are being used. \TeX82 generates |xxx1| when a short enough
\.{\\special} appears, setting |k| to the number of bytes being sent. It
is recommended that |x| be a string having the form of a keyword followed
by possible parameters relevant to that keyword.

\yskip\hang|xxx2| 240 |k[2]| |x[k]|. Like |xxx1|, but |0<=k<65536|.

\yskip\hang|xxx3| 241 |k[3]| |x[k]|. Like |xxx1|, but |0<=k<@t$2^{24}$@>|.

\yskip\hang|xxx4| 242 |k[4]| |x[k]|. Like |xxx1|, but |k| can be ridiculously
large. \TeX82 uses |xxx4| when |xxx1| would be incorrect.

\yskip\hang|fnt_def1| 243 |k[1]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.
Define font |k|, where |0<=k<256|; font definitions will be explained shortly.

\yskip\hang|fnt_def2| 244 |k[2]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.
Define font |k|, where |0<=k<65536|.

\yskip\hang|fnt_def3| 245 |k[3]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.
Define font |k|, where |0<=k<@t$2^{24}$@>|.

\yskip\hang|fnt_def4| 246 |k[4]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.
Define font |k|, where |@t$-2^{31}$@><=k<@t$2^{31}$@>|.

\yskip\hang|pre| 247 |i[1]| |num[4]| |den[4]| |mag[4]| |k[1]| |x[k]|.
Beginning of the preamble; this must come at the very beginning of the
file. Parameters |i|, |num|, |den|, |mag|, |k|, and |x| are explained below.

\yskip\hang|post| 248. Beginning of the postamble, see below.

\yskip\hang|post_post| 249. Ending of the postamble, see below.

\yskip\noindent Commands 250--255 are undefined at the present time.

@ @d set_char_0=0 {typeset character 0 and move right}
@d set1=128 {typeset a character and move right}
@d set_rule=132 {typeset a rule and move right}
@d put1=133 {typeset a character}
@d put_rule=137 {typeset a rule}
@d nop=138 {no operation}
@d bop=139 {beginning of page}
@d eop=140 {ending of page}
@d push=141 {save the current positions}
@d pop=142 {restore previous positions}
@d right1=143 {move right}
@d w0=147 {move right by |w|}
@d w1=148 {move right and set |w|}
@d x0=152 {move right by |x|}
@d x1=153 {move right and set |x|}
@d down1=157 {move down}
@d y0=161 {move down by |y|}
@d y1=162 {move down and set |y|}
@d z0=166 {move down by |z|}
@d z1=167 {move down and set |z|}
@d fnt_num_0=171 {set current font to 0}
@d fnt1=235 {set current font}
@d xxx1=239 {extension to \.{DVI} primitives}
@d xxx4=242 {potentially long extension to \.{DVI} primitives}
@d fnt_def1=243 {define the meaning of a font number}
@d pre=247 {preamble}
@d post=248 {postamble beginning}
@d post_post=249 {postamble ending}
@d undefined_commands==250,251,252,253,254,255

@ The preamble contains basic information about the file as a whole. As
stated above, there are six parameters:
$$\hbox{|@!i[1]| |@!num[4]| |@!den[4]| |@!mag[4]| |@!k[1]| |@!x[k]|.}$$
The |i| byte identifies \.{DVI} format; currently this byte is always set
to~2. (Some day we will set |i=3|, when \.{DVI} format makes another
incompatible change---perhaps in 1992.)

The next two parameters, |num| and |den|, are positive integers that define
the units of measurement; they are the numerator and denominator of a
fraction by which all dimensions in the \.{DVI} file could be multiplied
in order to get lengths in units of $10^{-7}$ meters. (For example, there are
exactly 7227 \TeX\ points in 254 centimeters, and \TeX82 works with scaled
points where there are $2^{16}$ sp in a point, so \TeX82 sets |num=25400000|
and $|den|=7227\cdot2^{16}=473628672$.)
@^sp@>

The |mag| parameter is what \TeX82 calls \.{\\mag}, i.e., 1000 times the
desired magnification. The actual fraction by which dimensions are
multiplied is therefore $mn/1000d$. Note that if a \TeX\ source document
does not call for any `\.{true}' dimensions, and if you change it only by
specifying a different \.{\\mag} setting, the \.{DVI} file that \TeX\
creates will be completely unchanged except for the value of |mag| in the
preamble and postamble. (Fancy \.{DVI}-reading programs allow users to
override the |mag|~setting when a \.{DVI} file is being printed.)

Finally, |k| and |x| allow the \.{DVI} writer to include a comment, which is not
interpreted further. The length of comment |x| is |k|, where |0<=k<256|.

@d id_byte=2 {identifies the kind of \.{DVI} files described here}

@ Font definitions for a given font number |k| contain further parameters
$$\hbox{|c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.}$$
The four-byte value |c| is the check sum that \TeX\ (or whatever program
generated the \.{DVI} file) found in the \.{TFM} file for this font;
|c| should match the check sum of the font found by programs that read
this \.{DVI} file.
@^check sum@>

Parameter |s| contains a fixed-point scale factor that is applied to the
character widths in font |k|; font dimensions in \.{TFM} files and other
font files are relative to this quantity, which is always positive and
less than $2^{27}$. It is given in the same units as the other dimensions
of the \.{DVI} file.  Parameter |d| is similar to |s|; it is the ``design
size,'' and it is given in \.{DVI} units that have not been corrected for
the magnification~|mag| found in the preamble.  Thus, font |k| is to be
used at $|mag|\cdot s/1000d$ times its normal size.

The remaining part of a font definition gives the external name of the font,
which is an ASCII string of length |a+l|. The number |a| is the length
of the ``area'' or directory, and |l| is the length of the font name itself;
the standard local system font area is supposed to be used when |a=0|.
The |n| field contains the area in its first |a| bytes.

Font definitions must appear before the first use of a particular font number.
Once font |k| is defined, it must not be defined again; however, we
shall see below that font definitions appear in the postamble as well as
in the pages, so in this sense each font number is defined exactly twice,
if at all. Like |nop| commands and \\{xxx} commands, font definitions can
appear before the first |bop|, or between an |eop| and a |bop|.

@ The last page in a \.{DVI} file is followed by `|post|'; this command
introduces the postamble, which summarizes important facts that \TeX\ has
accumulated about the file, making it possible to print subsets of the data
with reasonable efficiency. The postamble has the form
$$\vbox{\halign{\hbox{#\hfil}\cr
  |post| |p[4]| |num[4]| |den[4]| |mag[4]| |l[4]| |u[4]| |s[2]| |t[2]|\cr
  $\langle\,$font definitions$\,\rangle$\cr
  |post_post| |q[4]| |i[1]| 223's$[{\G}4]$\cr}}$$
Here |p| is a pointer to the final |bop| in the file. The next three
parameters, |num|, |den|, and |mag|, are duplicates of the quantities that
appeared in the preamble.

Parameters |l| and |u| give respectively the height-plus-depth of the tallest
page and the width of the widest page, in the same units as other dimensions
of the file. These numbers might be used by a \.{DVI}-reading program to
position individual ``pages'' on large sheets of film or paper; however,
the standard convention for output on normal size paper is to position each
page so that the upper left-hand corner is exactly one inch from the left
and the top. Experience has shown that it is unwise to design \.{DVI}-to-printer
software that attempts cleverly to center the output; a fixed position of
the upper left corner is easiest for users to understand and to work with.
Therefore |l| and~|u| are often ignored.

Parameter |s| is the maximum stack depth (i.e., the largest excess of
|push| commands over |pop| commands) needed to process this file. Then
comes |t|, the total number of pages (|bop| commands) present.

The postamble continues with font definitions, which are any number of
\\{fnt\_def} commands as described above, possibly interspersed with |nop|
commands. Each font number that is used in the \.{DVI} file must be defined
exactly twice: Once before it is first selected by a \\{fnt} command, and once
in the postamble.

@ The last part of the postamble, following the |post_post| byte that
signifies the end of the font definitions, contains |q|, a pointer to the
|post| command that started the postamble.  An identification byte, |i|,
comes next; this currently equals~2, as in the preamble.

The |i| byte is followed by four or more bytes that are all equal to
the decimal number 223 (i.e., @'337 in octal). \TeX\ puts out four to seven of
these trailing bytes, until the total length of the file is a multiple of
four bytes, since this works out best on machines that pack four bytes per
word; but any number of 223's is allowed, as long as there are at least four
of them. In effect, 223 is a sort of signature that is added at the very end.
@^Fuchs, David Raymond@>

This curious way to finish off a \.{DVI} file makes it feasible for
\.{DVI}-reading programs to find the postamble first, on most computers,
even though \TeX\ wants to write the postamble last. Most operating
systems permit random access to individual words or bytes of a file, so
the \.{DVI} reader can start at the end and skip backwards over the 223's
until finding the identification byte. Then it can back up four bytes, read
|q|, and move to byte |q| of the file. This byte should, of course,
contain the value 248 (|post|); now the postamble can be read, so the
\.{DVI} reader discovers all the information needed for typesetting the
pages. Note that it is also possible to skip through the \.{DVI} file at
reasonably high speed to locate a particular page, if that proves
desirable. This saves a lot of time, since \.{DVI} files used in production
jobs tend to be large.

When \.{DVIDOC} "typesets" a character, it simply puts its ASCII code
into the document file in the proper place according to the rounding of
|h| and |v| to whole character positions.  It may, of course, obliterate
a character previously stored in the same position.  Especially if a
symbol font is being used, the ASCII code may print ultimately as an
entirely different character than the one the document designer originally
intended.  For \.{DVIDOC} to produce more than a rough approximation to 
the intended document, fonts need to be chosen very carefully.

@* Input from binary files.
We have seen that a \.{DVI} file is a sequence of 8-bit bytes. The bytes
appear physically in what is called a `|packed file of 0..255|'
in \PASCAL\ lingo.

Packing is system dependent, and many \PASCAL\ systems fail to implement
such files in a sensible way (at least, from the viewpoint of producing
good production software).  For example, some systems treat all
byte-oriented files as text, looking for end-of-line marks and such
things. Therefore some system-dependent code is often needed to deal with
binary files, even though most of the program in this section of
\.{DVIDOC} is written in standard \PASCAL.
@^system dependencies@>

One common way to solve the problem is to consider files of |integer|
numbers, and to convert an integer in the range $-2^{31}\L x<2^{31}$ to
a sequence of four bytes $(a,b,c,d)$ using the following code, which
avoids the controversial integer division of negative numbers:
$$\vbox{\halign{#\hfil\cr
|if x>=0 then a:=x div @'100000000|\cr
|else begin x:=(x+@'10000000000)+@'10000000000; a:=x div @'100000000+128;|\cr
\quad|end|\cr
|x:=x mod @'100000000;|\cr
|b:=x div @'200000; x:=x mod @'200000;|\cr
|c:=x div @'400; d:=x mod @'400;|\cr}}$$
The four bytes are then kept in a buffer and output one by one. (On 36-bit
computers, an additional division by 16 is necessary at the beginning.
Another way to separate an integer into four bytes is to use/abuse
\PASCAL's variant records, storing an integer and retrieving bytes that are
packed in the same place; {\sl caveat implementor!\/}) It is also desirable
in some cases to read a hundred or so integers at a time, maintaining a
larger buffer.

We shall stick to simple \PASCAL\ in this program, for reasons of clarity,
even if such simplicity is sometimes unrealistic.

@<Types...@>=
@!eight_bits=integer; {unsigned one-byte quantity}
@!byte_file=packed file of char; {files that contain binary data}

@ The program deals with two binary file variables: |dvi_file| is the main
input file that we are translating into symbolic form, and |tfm_file| is
the current font metric file from which character-width information is
being read.

@<Glob...@>=
@!dvi_file:byte_file; {the stuff we are \.{DVI}typing}
@!eof_dvi_file:boolean; {end of file on \.{DVI}file}
@!tfm_file:byte_file; {a font metric file}

@ To prepare these files for input, we |reset| them. An extension of
\PASCAL\ is needed in the case of |tfm_file|, since we want to associate
it with external files whose names are specified dynamically (i.e., not
known at compile time). The following code assumes that `|reset(f,s)|'
does this, when |f| is a file variable and |s| is a string variable that
specifies the file name. If |eof(f)| is true immediately after
|reset(f,s)| has acted, we assume that no file named |s| is accessible.
@^system dependencies@>

@d read_access_mode=4  {``read'' mode for |test_access|}
@d write_access_mode=2 {``write'' mode for |test_access|}

@d no_file_path=0    {no path searching should be done}
@d font_file_path=3  {path specifier for \.{TFM} files}

@p procedure open_tfm_file; {prepares to read packed bytes in |tfm_file|}
begin 
if test_access(read_access_mode, font_file_path) then 
    reset(tfm_file, real_name_of_file) 
else begin
    error('TFM file not found');
    goto done;
    end;
end;
@#

@ If you looked carefully at the preceding code, you probably asked,
``What are |cur_loc| and |cur_name|?'' Good question. They're global
variables: |cur_loc| is the number of the byte about to be read next from
|dvi_file|, and |cur_name| is a string variable that will be set to the
current font metric file name before |open_tfm_file| is called.

@<Glob...@>=
@!cur_loc:integer; {where we are about to look, in |dvi_file|}
@!cur_name,real_name_of_file:
packed array[1..name_length] of char; {external name,
  with no lower case letters}

@ It turns out to be convenient to read four bytes at a time, when we are
inputting from \.{TFM} files. The input goes into global variables
|b0|, |b1|, |b2|, and |b3|, with |b0| getting the first byte and |b3|
the fourth.

@<Glob...@>=
@!b0,@!b1,@!b2,@!b3: eight_bits; {four bytes input at once}

@ The |read_tfm_word| procedure sets |b0| through |b3| to the next
four bytes in the current \.{TFM} file.
@d get_tfm_byte(#) ==
@^system dependencies@>
    read(tfm_file,byte); # := ord(byte);
    if # < 0 then # := # + 256;

@p procedure read_tfm_word;
var byte:char;
begin
    get_tfm_byte(b0); get_tfm_byte(b1); get_tfm_byte(b2); get_tfm_byte(b3);
end;

@ Finally we come to the routines that are used for random reading.
The driver program below needs two such routines: |dvi_length| should
compute the total number of bytes in |dvi_file|, possibly also
causing |eof(dvi_file)| to be true; and |move_to_byte(n)|
should position |dvi_file| so that the next |get_byte| will read byte |n|,
starting with |n=0| for the first byte in the file.
@^system dependencies@>

Such routines are, of course, highly system dependent. They are implemented
here in terms of two assumed system routines called |set_pos| and |cur_pos|.
The call |set_pos(f,n)| moves to item |n| in file |f|, unless |n| is
negative or larger than the total number of items in |f|; in the latter
case, |set_pos(f,n)| moves to the end of file |f|.
The call |cur_pos(f)| gives the total number of items in |f|, if
|eof(f)| is true; we use |cur_pos| only in such a situation.

@p function dvi_length:integer;
begin set_pos(dvi_file,-1); dvi_length:=cur_pos(dvi_file);
end;
@#
procedure move_to_byte(n:integer);
begin set_pos(dvi_file,n); cur_loc:=n;
end;

@* Reading the font information.
\.{DVI} file format does not include information about character widths, since
that would tend to make the files a lot longer. But a program that reads
a \.{DVI} file is supposed to know the widths of the characters that appear
in \\{set\_char} commands. Therefore \.{DVIDOC} looks at the font metric
(\.{TFM}) files for the fonts that are involved.
@.TFM {\rm files}@>

The character-width data appears also in other files (e.g., in \.{GF} files
that specify bit patterns for digitized characters);
thus, it is usually possible for \.{DVI} reading programs to get by with
accessing only one file per font. \.{DVItype} has a comparatively easy
task in this regard, since it needs only a few words of information from
each font; other \.{DVI}-to-printer programs may have to go to some pains to
deal with complications that arise when a large number of large font files
all need to be accessed simultaneously.

@ For purposes of this program, we need to know only two things about a
given character |c| in a given font |f|: (1)~Is |c| a legal character
in~|f|? (2)~If so, what is the width of |c|? We also need to know the
symbolic name of each font, so it can be printed out, and we need to know
the approximate size of inter-word spaces in each font.

The answers to these questions appear implicitly in the following data
structures. The current number of known fonts is |nf|. Each known font has
an internal number |f|, where |0<=f<nf|; the external number of this font,
i.e., its font identification number in the \.{DVI} file, is
|font_num[f]|, and the external name of this font is the string that
occupies positions |font_name[f]| through |font_name[f+1]-1| of the array
|names|. The latter array consists of |ASCII_code| characters, and
|font_name[nf]| is its first unoccupied position.  A horizontal motion
in the range |-4*font_space[f]<h<font_space[f]|
will be treated as a `kern' that is not
indicated in the printouts that \.{DVIDOC} produces between brackets. The
legal characters run from |font_bc[f]| to |font_ec[f]|, inclusive; more
precisely, a given character |c| is valid in font |f| if and only if
|font_bc[f]<=c<=font_ec[f]| and |char_width(f)(c)<>invalid_width|.
Finally, |char_width(f)(c)=width[width_base[f]+c]|, and |width_ptr| is the
first unused position of the |width| array.

@d char_width_end(#)==#]
@d char_width(#)==width[width_base[#]+char_width_end
@d invalid_width==@'17777777777

@<Glob...@>=
@!font_num:array [0..max_fonts] of integer; {external font numbers}
@!font_name:array [0..max_fonts] of 0..name_size; {starting positions
  of external font names}
@!names:array [0..name_size] of ASCII_code; {characters of names}
@!font_check_sum:array [0..max_fonts] of integer; {check sums}
@!font_scaled_size:array [0..max_fonts] of integer; {scale factors}
@!font_design_size:array [0..max_fonts] of integer; {design sizes}
@!font_space:array [0..max_fonts] of integer; {boundary between ``small''
  and ``large'' spaces}
@!font_bc:array [0..max_fonts] of integer; {beginning characters in fonts}
@!font_ec:array [0..max_fonts] of integer; {ending characters in fonts}
@!width_base:array [0..max_fonts] of integer; {index into |width| table}
@!width:array [0..max_widths] of integer; {character widths, in \.{DVI} units}
@!nf:0..max_fonts; {the number of known fonts}
@!width_ptr:0..max_widths; {the number of known character widths}

@ @<Set init...@>=
nf:=0; width_ptr:=0; font_name[0]:=0; font_space[0]:=0;

@ It is, of course, a simple matter to print the name of a given font.

@p procedure print_font(@!f:integer); {|f| is an internal font number}
var k:0..name_size; {index into |names|}
begin if f=nf then print('UNDEFINED!')
@.UNDEFINED@>
else  begin for k:=font_name[f] to font_name[f+1]-1 do
    print(xchr[names[k]]);
  end;
end;

@ An auxiliary array |in_width| is used to hold the widths as they are
input. The global variable |tfm_check_sum| is set to the check sum that
appears in the current \.{TFM} file.

@<Glob...@>=
@!in_width:array[0..255] of integer; {\.{TFM} width data in \.{DVI} units}
@!tfm_check_sum:integer; {check sum found in |tfm_file|}

@ Here is a procedure that absorbs the necessary information from a
\.{TFM} file, assuming that the file has just been successfully reset
so that we are ready to read its first byte. (A complete description of
\.{TFM} file format appears in the documentation of \.{TFtoPL} and will
not be repeated here.) The procedure does not check the \.{TFM} file
for validity, nor does it give explicit information about what is
wrong with a \.{TFM} file that proves to be invalid; \.{DVI}-reading
programs need not do this, since \.{TFM} files are almost always valid,
and since the \.{TFtoPL} utility program has been specifically designed
to diagnose \.{TFM} errors. The procedure simply returns |false| if it
detects anything amiss in the \.{TFM} data.

There is a parameter, |z|, which represents the scaling factor being
used to compute the font dimensions; it must be in the range $0<z<2^{27}$.

@p function in_TFM(@!z:integer):boolean; {input \.{TFM} data or return |false|}
label 9997, {go here when the format is bad}
  9998,  {go here when the information cannot be loaded}
  9999;  {go here to exit}
var k:integer; {index for loops}
@!lh:integer; {length of the header data, in four-byte words}
@!nw:integer; {number of words in the width table}
@!wp:0..max_widths; {new value of |width_ptr| after successful input}
@!alpha,@!beta:integer; {quantities used in the scaling computation}
begin @<Read past the header data; |goto 9997| if there is a problem@>;
@<Store character-width indices at the end of the |width| table@>;
@<Read and convert the width values, setting up the |in_width| table@>;
@<Move the widths from |in_width| to |width|, and append |pixel_width| values@>;
width_ptr:=wp; in_TFM:=true; goto 9999;
9997: print_ln('---not loaded, TFM file is bad');
@.TFM file is bad@>
9998: in_TFM:=false;
9999: end;

@ @<Read past the header...@>=
read_tfm_word; lh:=b2*256+b3;
read_tfm_word; font_bc[nf]:=b0*256+b1; font_ec[nf]:=b2*256+b3;
if font_ec[nf]<font_bc[nf] then font_bc[nf]:=font_ec[nf]+1;
if width_ptr+font_ec[nf]-font_bc[nf]+1>max_widths then
  begin print_ln('---not loaded, DVIDOC needs larger width table');
@.DVIDOC needs larger...@>
    goto 9998;
  end;
wp:=width_ptr+font_ec[nf]-font_bc[nf]+1;
read_tfm_word; nw:=b0*256+b1;
if (nw=0)or(nw>256) then goto 9997;
for k:=1 to 3+lh do
  begin if eof(tfm_file) then goto 9997;
  read_tfm_word;
  if k=4 then
    if b0<128 then tfm_check_sum:=((b0*256+b1)*256+b2)*256+b3
    else tfm_check_sum:=(((b0-256)*256+b1)*256+b2)*256+b3;
  end;

@ @<Store character-width indices...@>=
if wp>0 then for k:=width_ptr to wp-1 do
  begin read_tfm_word;
  if b0>nw then goto 9997;
  width[k]:=b0;
  end;

@ The most important part of |in_TFM| is the width computation, which
involves multiplying the relative widths in the \.{TFM} file by the
scaling factor in the \.{DVI} file. This fixed-point multiplication
must be done with precisely the same accuracy by all \.{DVI}-reading programs,
in order to validate the assumptions made by \.{DVI}-writing programs
like \TeX82.

Let us therefore summarize what needs to be done. Each width in a \.{TFM}
file appears as a four-byte quantity called a |fix_word|.  A |fix_word|
whose respective bytes are $(a,b,c,d)$ represents the number
$$x=\left\{\vcenter{\halign{$#$,\hfil\qquad&if $#$\hfil\cr
b\cdot2^{-4}+c\cdot2^{-12}+d\cdot2^{-20}&a=0;\cr
-16+b\cdot2^{-4}+c\cdot2^{-12}+d\cdot2^{-20}&a=255.\cr}}\right.$$
(No other choices of $a$ are allowed, since the magnitude of a \.{TFM}
dimension must be less than 16.)  We want to multiply this quantity by the
integer~|z|, which is known to be less than $2^{27}$. Let $\alpha=16z$.
If $|z|<2^{23}$, the individual multiplications $b\cdot z$, $c\cdot z$,
$d\cdot z$ cannot overflow; otherwise we will divide |z| by 2, 4, 8, or
16, to obtain a multiplier less than $2^{23}$, and we can compensate for
this later. If |z| has thereby been replaced by $|z|^\prime=|z|/2^e$, let
$\beta=2^{4-e}$; we shall compute
$$\lfloor(b+c\cdot2^{-8}+d\cdot2^{-16})\,z^\prime/\beta\rfloor$$ if $a=0$,
or the same quantity minus $\alpha$ if $a=255$.  This calculation must be
done exactly, for the reasons stated above; the following program does the
job in a system-independent way, assuming that arithmetic is exact on
numbers less than $2^{31}$ in magnitude.

@<Read and convert the width values...@>=
@<Replace |z| by $|z|^\prime$ and compute $\alpha,\beta$@>;
for k:=0 to nw-1 do
  begin read_tfm_word;
  in_width[k]:=(((((b3*z)div@'400)+(b2*z))div@'400)+(b1*z))div beta;
  if b0>0 then if b0<255 then goto 9997
    else in_width[k]:=in_width[k]-alpha;
  end

@ @<Replace |z|...@>=
begin alpha:=16*z; beta:=16;
while z>=@'40000000 do
  begin z:=z div 2; beta:=beta div 2;
  end;
end

@ A \.{DVI}-reading program usually works with font files instead of
\.{TFM} files, so \.{DVIDOC} is atypical in that respect. Font files
should, however, contain exactly the same character width data that is
found in the corresponding \.{TFM}s; check sums are used to help
ensure this. In addition, font files usually also contain the widths of
characters in pixels, since the device-independent character widths of
\.{TFM} files are generally not perfect multiples of pixels.

The |pixel_width| array contains this information; when |width[k]| is the
device-independent width of some character in \.{DVI} units, |pixel_width[k]|
is the corresponding width of that character in an actual font.
The macro |char_pixel_width| is set up to be analogous to |char_width|.

@d char_pixel_width(#)==pixel_width[width_base[#]+char_width_end

@<Glob...@>=
@!pixel_width:array[0..max_widths] of integer; {actual character widths,
  in pixels}
@!horiz_conv:real; {converts \.{DVI} units to horizontal pixels}
@!vert_conv:real; {converts \.{DVI} units to vertical pixels}
@!true_horiz_conv:real; {converts unmagnified \.{DVI} units to pixels}
@!true_vert_conv:real; {converts unmagnified \.{DVI} units to pixels}
@!numerator,@!denominator:integer; {stated conversion ratio}
@!mag:integer; {magnification factor times 1000}

@ The following code computes pixel widths by simply rounding the \.{TFM}
widths to the nearest integer number of pixels, based on the conversion factor
|horiz_conv| that converts \.{DVI} units to pixels. However, such a simple
formula will not be valid for all fonts, and it will often give results that
are off by $\pm1$ when a low-resolution font has been carefully
hand-fitted. For example, a font designer often wants to make the letter `m'
a pixel wider or narrower in order to make the font appear more consistent.
\.{DVI}-to-printer programs should therefore input the correct pixel width
information from font files whenever there is a chance that it may differ.
A warning message may also be desirable in the case that at least one character
is found whose pixel width differs from |conv*width| by more than a full pixel.
@^system dependencies@>

@d horiz_pixel_round(#)==round(horiz_conv*(#))
@d vert_pixel_round(#)==round(vert_conv*(#))

@<Move the widths from |in_width| to |width|, and append |pixel_width| values@>=
width_base[nf]:=width_ptr-font_bc[nf];
if wp>0 then for k:=width_ptr to wp-1 do
  begin width[k]:=in_width[width[k]];
  pixel_width[k]:=horiz_pixel_round(width[k]);
  end

@* Optional modes of output.
\.{DVIDOC} output will vary depending on some
options that the user must specify: The typeout can be confined to a
restricted subset of the pages by specifying the desired starting page and
the maximum number of pages. Furthermore there is an option to specify the
horizontal and vertical
resolution of the printer or display; and there is an option to override the
magnification factor that is stated in the \.{DVI} file.

The starting page is specified by giving a sequence of 1 to 10 numbers or
asterisks separated by dots. For example, the specification `\.{1.*.-5}'
can be used to refer to a page output by \TeX\ when $\.{\\count0}=1$
and $\.{\\count2}=-5$. (Recall that |bop| commands in a \.{DVI} file
are followed by ten `count' values.) An asterisk matches any number,
so the `\.*' in `\.{1.*.-5}' means that \.{\\count1} is ignored when
specifying the first page. If several pages match the given specification,
\.{DVIDOC} will begin with the earliest such page in the file. The
default specification `\.*' (which matches all pages) therefore denotes
the page at the beginning of the file.

When \.{DVIDOC} begins, it engages the user in a brief dialog so that the
options will be specified. This part of \.{DVIDOC} requires nonstandard
\PASCAL\ constructions to handle the online interaction.
@^system dependencies@>

@<Glob...@>=
@!max_pages:integer; {at most this many |bop..eop| pages will be printed}
@!horiz_resolution:real; {pixels per inch}
@!vert_resolution:real; {pixels per inch}
@!new_mag:integer; {if positive, overrides the postamble's magnification}

@ The starting page specification is recorded in two global arrays called
|start_count| and |start_there|. For example, `\.{1.*.-5}' is represented
by |start_there[0]=true|, |start_count[0]=1|, |start_there[1]=false|,
|start_there[2]=true|, |start_count[2]=-5|.
We also set |start_vals=2|, to indicate that count 2 was the last one
mentioned. The other values of |start_count| and |start_there| are not
important, in this example.

@<Glob...@>=
@!start_count:array[0..9] of integer; {count values to select starting page}
@!start_there:array[0..9] of boolean; {is the |start_count| value relevant?}
@!start_vals:0..9; {the last count considered significant}
@!count:array[0..9] of integer; {the count values on the current page}

@ @<Set init...@>=
max_pages:=1000000; start_vals:=0; start_there[0]:=false;

@ Here is a simple subroutine that tests if the current page might be the
starting page.

@p function start_match:boolean; {does |count| match the starting spec?}
var k:0..9;  {loop index}
@!match:boolean; {does everything match so far?}
begin match:=true;
for k:=0 to start_vals do
  if start_there[k]and(start_count[k]<>count[k]) then match:=false;
start_match:=match;
end;

@ The |input_ln| routine waits for the user to type a line at his or her
terminal; then it puts ASCII-code equivalents for the characters on that line
into the |buffer| array. The |term_in| file is used for terminal input,
and |term_out| for terminal output.
@^system dependencies@>

@<Glob...@>=
@!buffer:array[0..terminal_line_length] of ASCII_code;

@ Since the terminal is being used for both input and output, some systems
need a special routine to make sure that the user can see a prompt message
before waiting for input based on that message. (Otherwise the message
may just be sitting in a hidden buffer somewhere, and the user will have
no idea what the program is waiting for.) We shall invoke a system-dependent
subroutine |update_terminal| in order to avoid this problem.
@^system dependencies@>

@d update_terminal == flush(output) {empty the terminal output buffer}

@ During the dialog, \.{DVIDOC} will treat the first blank space in a
line as the end of that line. Therefore |input_ln| makes sure that there
is always at least one blank space in |buffer|.
@^system dependencies@>

@p procedure input_ln; {inputs a line from the terminal}
var k:0..terminal_line_length;
ch:char;
begin update_terminal;
k:=0;
while (k<terminal_line_length)and not eoln(term_in) do
  begin read(term_in,ch); buffer[k]:=xord[ch]; incr(k)
  end;
read_ln(term_in);
buffer[k]:=" ";
end;

@ The global variable |buf_ptr| is used while scanning each line of input;
it points to the first unread character in |buffer|.

@<Glob...@>=
@!buf_ptr:0..terminal_line_length; {the number of characters read}

@ Here is a routine that scans a (possibly signed) integer and computes
the decimal value. If no decimal integer starts at |buf_ptr|, the
value 0 is returned. The integer should be less than $2^{31}$ in
absolute value.

@p function get_integer:integer;
var x:integer; {accumulates the value}
@!negative:boolean; {should the value be negated?}
begin if buffer[buf_ptr]="-" then
  begin negative:=true; incr(buf_ptr);
  end
else negative:=false;
x:=0;
while (buffer[buf_ptr]>="0")and(buffer[buf_ptr]<="9") do
  begin x:=10*x+buffer[buf_ptr]-"0"; incr(buf_ptr);
  end;
if negative then get_integer:=-x @+ else get_integer:=x;
end;

@ The selected options are put into global variables by the |dialog|
procedure, which is called just as \.{DVIDOC} begins.
@^system dependencies@>

@p procedure dialog;
label 1,2,3,4,5;
var k:integer; {loop variable}
begin
write_ln(term_out,banner);
@<Determine the desired |start_count| values@>;
@<Determine the desired |max_pages|@>;
@<Determine the desired |horiz_resolution|@>;
@<Determine the desired |vert_resolution|@>;
@<Determine the desired |new_mag|@>;
end;

@ @<Determine the desired |start...@>=
2: write(term_out,'Starting page (default=*): ');
start_vals:=0; start_there[0]:=false;
input_ln; buf_ptr:=0; k:=0;
if buffer[0]<>" " then
  repeat if buffer[buf_ptr]="*" then
    begin start_there[k]:=false; incr(buf_ptr);
    end
  else  begin start_there[k]:=true; start_count[k]:=get_integer;
    end;
  if (k<9)and(buffer[buf_ptr]=".") then
    begin incr(k); incr(buf_ptr);
    end
  else if buffer[buf_ptr]=" " then start_vals:=k
  else  begin write(term_out,'Type, e.g., 1.*.-5 to specify the ');
    write_ln(term_out,'first page with \count0=1, \count2=-5.');
    goto 2;
    end;
  until start_vals=k

@ @<Determine the desired |max_pages|@>=
3: write(term_out,'Maximum number of pages (default=1000000): ');
max_pages:=1000000; input_ln; buf_ptr:=0;
if buffer[0]<>" " then
  begin max_pages:=get_integer;
  if max_pages<=0 then
    begin write_ln(term_out,'Please type a positive number.');
    goto 3;
    end;
  end

@ @<Determine the desired |horiz_resolution|@>=
1: write(term_out,'Horizontal resolution');
write(term_out,' in pixels per inch (default=10/1): ');
horiz_resolution:=10.0; input_ln; buf_ptr:=0;
if buffer[0]<>" " then
  begin k:=get_integer;
  if (k>0)and(buffer[buf_ptr]="/")and
    (buffer[buf_ptr+1]>"0")and(buffer[buf_ptr+1]<="9") then
    begin incr(buf_ptr); horiz_resolution:=k/get_integer;
    end
  else  begin print('Type a ratio of positive integers;');
    print_ln(' (1 character per mm would be 254/10).');
    goto 1;
    end;
  end

@ @<Determine the desired |vert_resolution|@>=
4: print('Vertical resolution');
print(' in lines per inch (default=6/1): ');
vert_resolution:=6.0; input_ln; buf_ptr:=0;
if buffer[0]<>" " then
  begin k:=get_integer;



More information about the Comp.sources.unix mailing list