hdiff: - source file compare program
sources-request at panda.UUCP
Mon Feb 10 23:22:42 AEST 1986
Mod.sources: Volume 3, Issue 117
Submitted by: Dennis Bednar <talcott!seismo!rlgvax!dennis>
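As a rough illustration of the matching idea that the help text and source comments in the archive below attribute to Heckel, here is a minimal sketch of the linking passes. It is not code from the archive. The names heckel_link, count_of, link_pair, UNLINKED, o2n and n2o are hypothetical. The sketch counts lines by brute force instead of using a hashed symbol table, and it leaves out the virtual 'begin' and 'end' lines that hdiff.c links at each end of the files.

```c
#include <assert.h>
#include <string.h>

#define UNLINKED (-1)   /* hypothetical marker: no partner in the other file */

/* Count how many of the n lines in v are identical to s. */
static int count_of(const char *s, const char **v, int n)
{
    int i, c = 0;

    for (i = 0; i < n; ++i)
        if (strcmp(s, v[i]) == 0)
            ++c;
    return c;
}

/* Record that old line i and new line j are the same line. */
static void link_pair(int *o2n, int *n2o, int i, int j)
{
    o2n[i] = j;
    n2o[j] = i;
}

/*
 * Fill o2n[0..nold-1] with the matching new index for each old line
 * (or UNLINKED), and n2o[0..nnew-1] with the reverse mapping.
 */
void heckel_link(const char **oldv, int nold, const char **newv, int nnew,
                 int *o2n, int *n2o)
{
    int i, j;

    assert(nold >= 0 && nnew >= 0);
    for (i = 0; i < nold; ++i)
        o2n[i] = UNLINKED;
    for (j = 0; j < nnew; ++j)
        n2o[j] = UNLINKED;

    /* Link lines that occur exactly once in each file. */
    for (j = 0; j < nnew; ++j) {
        if (count_of(newv[j], newv, nnew) != 1 ||
            count_of(newv[j], oldv, nold) != 1)
            continue;
        for (i = 0; i < nold; ++i)
            if (strcmp(oldv[i], newv[j]) == 0) {
                link_pair(o2n, n2o, i, j);
                break;
            }
    }

    /* Ascending sweep: if a pair is linked and the lines just after
     * it are identical and still unlinked, link those too. */
    for (j = 0; j + 1 < nnew; ++j) {
        i = n2o[j];
        if (i != UNLINKED && i + 1 < nold &&
            n2o[j + 1] == UNLINKED && o2n[i + 1] == UNLINKED &&
            strcmp(oldv[i + 1], newv[j + 1]) == 0)
            link_pair(o2n, n2o, i + 1, j + 1);
    }

    /* Descending sweep: the same rule applied to the lines just before. */
    for (j = nnew - 1; j > 0; --j) {
        i = n2o[j];
        if (i != UNLINKED && i > 0 &&
            n2o[j - 1] == UNLINKED && o2n[i - 1] == UNLINKED &&
            strcmp(oldv[i - 1], newv[j - 1]) == 0)
            link_pair(o2n, n2o, i - 1, j - 1);
    }
}
```

For old lines a, b, x, x, c and new lines c, a, b, x, x, the unique lines a, b and c are linked first. The ascending sweep then links both x lines, so o2n ends up as {1, 2, 3, 4, 0}. A block whose linked partner sits out of sequence, such as c here, is what a move report would be built from.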
#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create the files:
# hdiff.hlp
# Makefile
# hdiff.c
# remwhite.c
# stripnl.c
# stripnl.h
# This archive created: Sat Feb 8 07:42:17 1986
export PATH; PATH=/bin:$PATH
echo shar: extracting "'hdiff.hlp'" '(2433 characters)'
if test -f 'hdiff.hlp'
then
echo shar: will not over-write existing file "'hdiff.hlp'"
else
cat << \SHAR_EOF > 'hdiff.hlp'
hdiff [-cdmvw] oldfile newfile
Source file compare program.
Yet another source compare program like diff. This one reports moved lines,
not delete/insert as the UNIX diff does. The h is in honor of Paul Heckel,
the guy who first wrote about this algorithm in CACM April 1978.
One of -c, -d, or -m should be used to adjust the internal algorithm.
Currently I am playing with the algorithm.
Switches
-c = use a "count between the start of the other move block and
the first line in the other file which matched this move
block" to determine moved blocks [DEFAULT]
-d = use a "drop" or relative slope to determine moved blocks
-m = use "monotonically increasing by one" to determine moved blocks
-v = verbose (debugging)
-w = compress white space only on each line before comparison,
and remove leading white space (remwhite -a option).
(see CACM, April 78, "A Technique for Isolating Differences Between Files",
by Paul Heckel).
Output:
The output is identical in meaning to the output from UNIX diff,
except that a "move" command is present here, but not in diff.
DELETES
-------
old d new // Single line delete - Old line number 'old' is
// deleted after new line numbered 'new'
startold,endold d new // Block line delete - Old block of lines 'startold'
// to 'endold' are deleted after new line number 'new'
INSERTS
-------
old a new // After old line number 'old' is new line number 'new'
old a startnew,endnew // After old line number 'old', new lines numbered
// 'startnew' to 'endnew'
CHANGES
-------
old c new // Change one line to one new line. The old line
// numbered 'old' becomes new line numbered 'new'
old c startnew,endnew // Change one line to a block of lines. Old line
// numbered 'old' becomes the new set of lines.
startold,endold c new // Change a block of old lines to one new line.
startold,endold c startnew,endnew // Change a block of lines to
// a different block of lines.
MOVES
-----
old m new // Old line number 'old' is moved to new line
// number 'new'
startold,endold m startnew,endnew // The old block of lines has been moved
// and the old line numbers have changed.
For DELETES, INSERTS, and CHANGES (but not MOVES) the old line and new lines
are displayed as follows (same as the UNIX diff):
< old line
> new line
BUGS:
Hdiff is limited to files with at most 5000 lines per file.
To fix, recompile hdiff.c with a larger MAXLINES #define.
SHAR_EOF
if test 2433 -ne "`wc -c < 'hdiff.hlp'`"
then
echo shar: error transmitting "'hdiff.hlp'" '(should have been 2433 characters)'
fi
fi
echo shar: extracting "'Makefile'" '(1034 characters)'
if test -f 'Makefile'
then
echo shar: will not over-write existing file "'Makefile'"
else
cat << \SHAR_EOF > 'Makefile'
SRC = hdiff.c remwhite.c stripnl.c stripnl.h hdiff.mk
# also the hdiff help file is source but it is renamed on the copy
# change this for your site
INSTALLDIR = .
hdiff: hdiff.o remwhite.o stripnl.o
cc -O hdiff.o remwhite.o stripnl.o
mv a.out hdiff
hdiff.o: stripnl.h
cc -O -c hdiff.c
remwhite.o: stripnl.h
cc -O -USTAND -c remwhite.c
stripnl.o: stripnl.h
cc -O -c stripnl.c
clean:
rm -f hdiff.o remwhite.o stripnl.o hdiff
install: hdiff
cp hdiff $(INSTALLDIR)
# distribute hdiff. personal for dennis only.
dist:
rm -rf /tmp/dpb
mkdir /tmp/dpb
cp $(SRC) /tmp/dpb
cp ../help/hdiff /tmp/dpb/hdiff.hlp # help file
(cd /tmp/dpb; make -f hdiff.mk makeshar)
makeshar:
splitfiles * # split source files into little bundles
for i in list.* ; \
do \
makeshar `cat $$i` > shar.$$i ; \
done
# you must run make -f hdiff.mk makeshar first
# sends shar files to mod.sources
# hardcoded for 2 bundles
sendtonet:
for i in 1 2 ; \
do \
Mail < shar.list.$$i -s "hdiff: - part $$i of 2" sources@panda.uucp; \
done
SHAR_EOF
if test 1034 -ne "`wc -c < 'Makefile'`"
then
echo shar: error transmitting "'Makefile'" '(should have been 1034 characters)'
fi
fi
echo shar: extracting "'hdiff.c'" '(34748 characters)'
if test -f 'hdiff.c'
then
echo shar: will not over-write existing file "'hdiff.c'"
else
cat << \SHAR_EOF > 'hdiff.c'
/*
* f=hdiff.c (In honor of Mr Heckel, the guy who thought up this
* algorithm).
*
* author - dennis bednar 8 22 84
* Source file comparison program similar to UNIX diff, except this
* version outputs moved blocks whereas UNIX diff reports it as
* delete/add blocks.
*
* Algorithm from "A Technique for Isolating Differences Between Files"
* CACM, April 1978, by Paul Heckel.
* Some ideas for pass 6 were borrowed from "What's The Diff? -- A File
* Comparator for CP/M", Dr. Dobb's Journal, August 1984, by D.E. Cortesi.
* The borrowed idea was that when you are at the beginning of two
* blocks of lines which link with lines in the other file, but don't
* match, then the smaller ascending block is the "moved block".
*
* The pass5a () is coded directly from his pass5 Pascal procedure.
* It doesn't work if the old line being looked up in the symbol table
* isn't unique!!! You would have to change the format of the linerec
* record to make it work. That's why it's commented out.
*
* Output
DELETES
-------
old d new // Single line delete - Old line number 'old' is
// deleted after new line numbered 'new'
startold,endold d new // Block line delete - Old block of lines 'startold'
// to 'endold' are deleted after new line number 'new'
INSERTS
-------
old a new // After old line number 'old' is new line number 'new'
old a startnew,endnew // After old line number 'old', new lines numbered
// 'startnew' to 'endnew'
CHANGES
-------
old c new // Change old line numbered 'old' to new line numbered
// 'new'
old c startnew,endnew // Change one line to a block of lines. Old line
// numbered 'old' becomes the new set of lines.
startold,endold c new // Change a block of old lines to one new line.
startold,endold c startnew,endnew // Change a block of lines to
// a different block of lines.
MOVES
-----
old m new // Old line number 'old' is moved to new line
// number 'new'
startold,endold m startnew,endnew // The old block of lines has been moved
// and the old line numbers have changed.
For DELETES, INSERTS, and CHANGES (but not MOVES) the old line and new lines
are displayed as follows:
< old line
> new line
*/
#include <stdio.h>
#include "stripnl.h"
/* choose SKIPCOUNT and BIGPRIME as follows:
* BIGPRIME > 2*MAXLINES in case both files have all unique lines
* such that SKIPCOUNT*tablesize probes will hit every slot in hash tbl
* (hint: use the hashtbl a.out to help you)
*/
#define MAXLINES 5000 /* max lines per file that can handle */
#define BIGPRIME 10007 /* a prime number > 2*MAXLINES for hash into symtbl */
#define _MAXLINE1 (MAXLINES+2) /* two extra for first, last sentinel */
#define LINESIZE 1024 /* max chars per line */
#define _LINESIZE (LINESIZE+2) /* two extra for '\n', '\0' fgets rtns */
#define SKIPCOUNT 4 /* reprobe into hash symbol table */
#define CHANGESEP "---\n" /* line separator for change block */
#define ZAPFLAG 1 /* 1 = yes, remove leading white space */
/* structure for old, new files. One structure per line.
* An array oa for the old file, and an array na for the new file.
* If the line 'oldi' in the old file has NOT been matched to a line in
* the new file then oa[oldi].flag == L_SYMINDX.
* Similarly if the line 'newi' in the new file has NOT been matched to
* a line in the old file, then na[newi].flag == L_SYMINDX.
* When line 'oldi' in the old file has been matched to line 'newi'
* in the new file, then oa[oldi].flag == L_LINENUM,
* na[newi].flag == L_LINENUM, and each line points to the other, e.g.,
* oa[oldi].lineloc == newi, and na[newi].lineloc == oldi.
*/
struct linerec {
int lineloc; /* line number in other file or symtbl index */
int flag; /* tells which as defined below */
};
#define L_LINENUM 0
#define L_SYMINDX 1
/* symbol table structure. One per unique line in both files. */
/* the line is keyed into the array via a hash code */
struct symrec {
char *stashline; /* saved line in malloc'ed memory */
int ocount; /* count of lines in old file, 0,1,many */
int ncount; /* count of lines in new file, 0,1,many */
int olinenum; /* line number in the old file */
};
/*globals */
char *cmd, /* name of this command */
errbuf [_LINESIZE], /* build err msg for perror */
linebuf [_LINESIZE], /* read line from either file into here */
runflag [26], /* array of options flags set */
*oldfile, /* name of old file */
*newfile; /* name of new file */
FILE *oldfp, /* stream for old file */
*newfp; /* stream for new file */
struct linerec
oa [_MAXLINE1], /* for old file */
na [_MAXLINE1]; /* for new file */
struct symrec
symtbl [BIGPRIME]; /* symbol table of all lines seen */
int lastnew, /* total lines read from new file */
lastold, /* total lines in old file */
numsymbols, /* number of entries in symbol table */
debug; /* debug flag set by -v flag */
/* external non-integer functions */
FILE *fopen(); /* fopen (3) */
char *malloc (); /* malloc (3) */
char *remwhite (); /* remove white space from string */
/* internal non-integer functions */
unsigned int hashline ();
main (argc, argv)
int argc;
char **argv;
{
int usage; /* TRUE iff there is a usage error */
/* name of command */
cmd = argv [0];
/* check arguments and assign global file names */
usage = 0; /* assume no usage error */
debug = 0; /* assume no debugging wanted */
if (argc == 3)
{
oldfile = argv [1];
newfile = argv [2];
}
else if (argc == 4 && *argv[1] == '-')
{
register char *cp;
for (cp = argv[1]+1; *cp; ++cp)
if ('a' <= *cp && *cp <= 'z')
switch (*cp) {
case 'v': /* verbose - debug */
case 'd': /* drop */
case 'm': /* monotonical by 1 */
case 'c': /* count */
case 'w': /* white space */
if (*cp == 'v')
debug = 1;
runflag [*cp - 'a'] = 1;
break;
default:
usage = 1;
break;
}
else
usage = 1;
oldfile = argv[2];
newfile = argv[3];
}
else
usage = 1;
if (usage)
{
fprintf (stderr, "usage: %s [-cdmvw] oldfile newfile\n", cmd);
exit (1);
}
/* open both files */
openfiles ();
initvars ();
src_compare ();
/* close both files */
closefiles ();
}
initvars ()
{
numsymbols = 0;
}
/*
* compare the 2 files in 6 easy passes
*/
src_compare ()
{
pass1 (); /* read, store new file */
pass2 (); /* read, store old file */
pass3 (); /* match up lines which occur only once */
pass4 (); /* apply rule 2 working toward end */
pass5 (); /* apply rule 2 working toward beginning */
/* pass5a (); /* convert block-moves to insert/deletes */
/* dumparray (); /* see if internal tables are built correctly */
pass6 (); /* print out differences */
if (debug)
printf ("debug: symbol table: occupied = %d, total = %d\n",
numsymbols, BIGPRIME);
}
/*
* pass 1
* read in new file.
*/
pass1 ()
{
int
linenum, /* which line we are on */
stat, /* result from stripnl */
sindex; /* index into symbol table */
char *cp; /* char pointer into line */
/* read each line of new file in a loop. */
/* stop if eof or can't handle that many lines */
for (linenum = 0;
fgets (linebuf, sizeof(linebuf), newfp) != (char *)NULL &&
++linenum <= MAXLINES; )
{
/* strip newline at end and make sure line wasn't too long */
stat = stripnl (linebuf, sizeof(linebuf) );
if (stat == L_SUCCESS)
;
else if (stat == L_BADFORM)
fprintf (stderr, "%s: Warning, line %d in file %s not terminated with newline\n",
cmd, linenum, newfile);
else
{
fprintf (stderr, "%s: line %d longer than %d chars in file %s\n",
cmd, linenum, LINESIZE, newfile);
exit (1);
}
/* if compressing white space, then compress into line buf */
if (runflag ['w' - 'a'] )
{
char *cp;
cp = remwhite (linebuf, ZAPFLAG);
strcpy (linebuf, cp);
}
sindex = addsymbol (linebuf); /* put line into symbol tbl */
if (symtbl [sindex].ncount < 2)
++symtbl [sindex].ncount;
na [linenum].lineloc = sindex;
na [linenum].flag = L_SYMINDX;
}
/* see if new file was empty */
if (linenum == 0)
{
fprintf (stderr, "%s: New file %s is empty. Diff is delete entire old file.\n", cmd, newfile);
exit (1);
}
if (linenum > MAXLINES)
{
fprintf (stderr, "%s: New file %s is too big. Last line read was %d.\n", cmd, newfile, MAXLINES);
exit (1);
}
/* assign global number of new lines */
lastnew = linenum;
}
/*
* pass 2
* read in old file.
*/
pass2 ()
{
int
linenum, /* which linenum we are on */
stat, /* result from stripnl */
sindex; /* index into symbol table */
char *cp; /* char pointer into line */
/* read each line of old file in a loop. */
/* stop if eof or can't handle that many lines */
for (linenum = 0;
fgets (linebuf, sizeof(linebuf), oldfp) != (char *)NULL &&
++linenum <= MAXLINES; )
{
#if 0 /* old way */
/* strip newline at end and make sure line wasn't too long */
cp = &linebuf[strlen(linebuf)-1];
if (*cp == '\n')
*cp = '\0';
else if (strlen(linebuf) < sizeof(linebuf)-1)
fprintf (stderr, "%s: Warning, line %d in file %s not terminated with newline\n", cmd, linenum, oldfile);
else
{
fprintf (stderr, "%s: line %d longer than %d chars in file %s\n",
cmd, linenum, LINESIZE, oldfile);
exit (1);
}
#endif
/* strip newline at end and make sure line wasn't too long */
stat = stripnl (linebuf, sizeof(linebuf) );
if (stat == L_SUCCESS)
;
else if (stat == L_BADFORM)
fprintf (stderr, "%s: Warning, line %d in file %s not terminated with newline\n",
cmd, linenum, oldfile);
else
{
fprintf (stderr, "%s: line %d longer than %d chars in file %s\n",
cmd, linenum, LINESIZE, oldfile);
exit (1);
}
/* if compressing white space, then compress into line buf */
if (runflag ['w' - 'a'] )
{
char *cp;
cp = remwhite (linebuf, ZAPFLAG);
strcpy (linebuf, cp);
}
sindex = addsymbol (linebuf); /* put line into symbol tbl */
if (symtbl [sindex].ocount < 2)
++symtbl [sindex].ocount;
symtbl[sindex].olinenum = linenum;
oa [linenum].lineloc = sindex;
oa [linenum].flag = L_SYMINDX;
}
/* see if old file was empty */
if (linenum == 0)
{
fprintf (stderr, "%s: Old file %s is empty. Diff is add entire new file.\n", cmd, oldfile);
exit (1);
}
if (linenum > MAXLINES)
{
fprintf (stderr, "%s: Old file %s is too big. Last line read was %d.\n", cmd, oldfile, MAXLINES);
exit (1);
}
/* assign number of lines in old file */
lastold = linenum;
}
/*
* Add a line to the symbol table.
* The hash function determines the initial probe into the symbol table.
* If the line is new, then core is malloc'ed for storage, and
* the 'value' is the pointer to the line (empty slot can be found).
* If the line is old, the old index is returned.
*
* returns
* index into symbol table
* updated global 'numsymbols' for debugging.
*
*/
addsymbol (line)
char *line;
{
int sindex, /* symbol table index */
firstprobe; /* index of first probe into symbol tbl */
char *cp; /* malloc char pointer */
sindex = firstprobe = hashline (line, BIGPRIME);
/* keep looping until empty slot found or table is full. */
/* table cannot be full. */
while (1)
{
/* if find an empty slot, add the line there. */
if (symtbl [sindex].stashline == (char *)NULL)
{
cp = malloc (strlen (line) + 1);
if (cp == (char *)NULL)
{
fprintf (stderr, "%s: out of space in function addsymbol\n", cmd);
exit (1);
}
/* copy the line to storage area */
strcpy (cp, line);
/* save the line in the symbol table entry */
symtbl [sindex].stashline = cp;
++numsymbols;
break;
}
/* if the line is already in the table, then done */
else if (strcmp (symtbl [sindex].stashline, line) == 0)
break;
/* else it's a collision with a different line, so
* let's do linear open addressing. That is, reprobe
* SKIPCOUNT slots below, modulo the table size.
* SKIPCOUNT == 1 means primary clustering.
* SKIPCOUNT > 1 means secondary clustering.
* Check for overflow.
*/
else
{
sindex += SKIPCOUNT;
if (sindex >= BIGPRIME)
sindex -= BIGPRIME; /* wraparound, modulo the table size */
if (sindex == firstprobe)
{
fprintf (stderr, "%s: symtbl overflow!\n", cmd);
exit (1);
}
}
}
return sindex;
}
/*
* pass 3.
* Process NA array and link lines up which appear only once
* in both files.
* Link virtual line pointer at front and end of both files.
*/
pass3 ()
{
int newi, /* loop counter thru na array */
oldi, /* index into oa array */
symi; /* index into symbol table array */
/* loop thru all new lines. Those that occur once are linked to
* each other instead of to the symbol table as they used to be.
*/
for (newi = 1; newi <= lastnew; ++newi)
{
symi = na[newi].lineloc; /* get symbol tbl index */
if ((symtbl[symi].ocount == 1) && (symtbl[symi].ncount == 1))
{
oldi = symtbl[symi].olinenum;
linkup (oldi, newi);
}
}
/* link the virtual line 'begin' of each file to each other */
linkup (0, 0);
/* link the virtual line 'end' of each file to each other */
linkup (lastold+1, lastnew+1);
}
/*
* pass 4
* loop ascending thru new lines in na array.
* If line newi in new file is linked to line oldi in old file,
* but the next lines of each contain the same symbol table
* pointer, then link na[newi+1] to oa[oldi+1].
*/
pass4 ()
{
int newi,
oldi;
/* process 'begin', all lines in new file, but NOT virtual 'end' */
for (newi = 0; newi <= lastnew; ++newi)
{
if (na[newi].flag == L_LINENUM)
{
oldi = na[newi].lineloc;
if ((na[newi+1].flag == L_SYMINDX) &&
(oa[oldi+1].flag == L_SYMINDX) &&
(na[newi+1].lineloc == oa[oldi+1].lineloc))
linkup (oldi+1, newi+1);
}
}
}
/*
* pass 5
* Process new file in descending order.
* Similar to pass 4, but in reverse order.
*/
pass5 ()
{
int newi,
oldi;
/* process 'end', all lines in new file, but NOT virtual 'begin' */
for (newi = lastnew+1; newi > 0; --newi)
{
if (na[newi].flag == L_LINENUM)
{
oldi = na[newi].lineloc;
if ((na[newi-1].flag == L_SYMINDX) &&
(oa[oldi-1].flag == L_SYMINDX) &&
(na[newi-1].lineloc == oa[oldi-1].lineloc))
linkup (oldi-1, newi-1);
}
}
}
/*
* pass 6
* output the differences.
*/
pass6 ()
{
int oldi, /* current line number in old file */
newi, /* current line number in new file */
oldmatch, /* true iff old line matches SOME line in new file */
newmatch, /* true iff new line matches SOME line in old file */
omatcount, /* match count in old file */
nmatcount, /* match count in new file */
odrop, /* old file "relative drop" of new line */
ndrop, /* new file "relative drop" of old line */
ocomp, /* comparison result for old file */
ncomp, /* comparison result for new file */
ospan, /* size of old monotonical block */
nspan; /* size of new monotonical block */
for (oldi = 0, newi = 0; oldi <= lastold+1 && newi <= lastnew+1;)
{
/* set flags to indicate if each line is linked
* to another line in the other file.
*/
/* is old line linked to another line in the new file? */
oldmatch = (oa[oldi].flag == L_LINENUM);
/* is new line linked to another line in the old file? */
newmatch = (na[newi].flag == L_LINENUM);
/* if old line moved from old file up (toward begin) in
* new file, then we have already processed the block
* move in the new file. Just skip the line in the old file.
*/
if (oa[oldi].lineloc < newi && oldmatch)
{
++oldi;
continue;
}
/* if old line moved from old file down (toward end) in
* new file, then we have already processed the block
* move in the old file. Just skip the line in the new file.
*/
else if (na[newi].lineloc < oldi && newmatch)
{
++newi;
continue;
}
/* there are four combinations of booleans oldmatch and newmatch */
/* if both lines are linked to SOME line in the other file */
if (oldmatch && newmatch)
{
/* if both lines match each other, then go
* to next line in both files
*/
if (oa[oldi].lineloc == newi)
{
if (debug)
printf ("debug: Same old line %d and new line %d\n", oldi, newi);
++oldi;
++newi;
continue;
}
/* blocks not linked to each other.
* If there are more lines in oldfile that
* are monotonically increasing by one in
* the new file, then this is normal, and the
* new block is moved down in the old file.
* If there are more lines in newfile that
* are monotonically increasing by one in
* the old file, then this is normal, and
* the old block is moved down in the new file.
*/
/* get size of old block which is monotonically
* increasing by one in the new file.
*/
ospan = oldmon (oldi);
/* get size of new block which is monotonically
* increasing by one in the old file.
*/
nspan = newmon (newi);
/* count the number of lines in the old file between
* the current line and the line which corresponds
* to the first new moved line.
*/
omatcount = gomatch (oldi, na[newi].lineloc);
/* count the number of lines in the new file between
* the current line and the old moved line which
* match.
*/
nmatcount = gnmatch (newi, oa[oldi].lineloc);
/* get number of old lines the new block drops down
* into the old file. If old drop is < new drop
* then the old block has moved down further into
* the new file than the new block has moved down
* into the old file. The relative steepness of the
* slope of the line from oldi to its matched line
* is more than the steepness of the other slope,
* and since lines don't usually move far, assume
* that the new to old move is a match, and the
* old to new is an old block moved down.
*/
odrop = na [newi].lineloc - oldi;
/* same for new file */
ndrop = oa [oldi].lineloc - newi;
/* if 'd' flag is set, use 'drop' for match */
if (runflag ['d' - 'a'])
{
ocomp = odrop;
ncomp = ndrop;
}
/* if 'm' flag set, use bigger monotonically increasing
* by 1 span
*/
else if (runflag ['m' - 'a'])
{
ocomp = ospan;
ncomp = nspan;
}
/* if 'c' flag then use match count */
else if (runflag ['c' - 'a'])
{
ocomp = omatcount;
ncomp = nmatcount;
}
/* default - best one */
else
{
ocomp = omatcount;
ncomp = nmatcount;
}
if (ocomp < ncomp) /* old block moved down */
eatold (&oldi, ospan); /* eat old mv'ed blk */
else /* new block moved down */
eatnew (&newi, nspan); /* eat new mv'ed blk */
}
/* new lines added (inserted) into new file */
else if (oldmatch && !newmatch)
newinsert (oldi, &newi); /* skip new insert blk */
/* old lines deleted from old file */
else if (!oldmatch && newmatch)
olddelete (&oldi, newi); /* skip old delete block */
/* old lines changed into new file */
else /* !oldmatch && !newmatch */
oldchange (&oldi, &newi); /* skip old delete block and new insert block */
}
}
/*
* block of old lines beginning at linenum 'oldi' are matched with
* a block of new lines. Count the number of lines in the old block
* which are monotonically increasing by one in the new file.
* returns -
* size of old block
*/
oldmon (oldi)
int oldi;
{
int osize, /* size of old block */
expnew, /* line number expected in new file */
curoldnum; /* current line number in old file */
curoldnum = oldi; /* save old line number so can tell
* how big the old block is at the end
*/
do
{
expnew = oa[curoldnum].lineloc + 1;
++curoldnum;
}
while ((curoldnum <= lastold+1) /* in bounds */
&& (expnew == oa[curoldnum].lineloc) /* monotonical by 1 */
&& (oa[curoldnum].flag == L_LINENUM)); /* match */
osize = curoldnum - oldi;
if (debug)
printf ("debug: lines %d-%d of old file are monotonical\n", oldi, curoldnum-1);
return osize;
}
newmon (newi)
int newi;
{
int nsize, /* size of new block */
expold, /* line number expected in old file */
curnewnum; /* current line number in new file */
curnewnum = newi; /* save new line number so can tell
* how big the new block is at the end
*/
do
{
expold = na[curnewnum].lineloc + 1;
++curnewnum;
}
while ((curnewnum <= lastnew+1) /* in bounds */
&& (expold == na[curnewnum].lineloc) /* monotonical by 1 */
&& (na[curnewnum].flag == L_LINENUM)); /* match */
nsize = curnewnum - newi;
if (debug)
printf ("debug: lines %d-%d of new file are monotonical\n", newi, curnewnum-1);
return nsize;
}
/*
* eat old block beginning with old line number *'oldi' and 'ospan' lines
* These old lines are moved down into the new file.
* The output format is
5,6m7,8 // old lines 5-6 are moved to new lines 7-8
5m7 // old line 5 is moved to new line 7
* returns
* *oldi = updated old line number
*/
eatold (oldi, ospan)
int *oldi,
ospan;
{
int newi;
/* get starting line number of new block */
newi = oa[*oldi].lineloc;
/* how many lines in the old block - one or more? */
if (ospan == 1)
printf ("%dm%d\n", *oldi, newi);
else if (ospan > 1)
printf ("%d,%dm%d,%d\n", *oldi, *oldi+ospan-1, newi, newi+ospan-1);
else
{
fprintf (stderr, "%s: unexpected negative ospan = %d in function eatold\n", cmd, ospan);
exit (1);
}
/* return old line number after the old moved block */
*oldi += ospan;
}
/*
* eat the new block
* Like eatold, but works on the new block.
*/
eatnew (newi, nspan)
int *newi,
nspan;
{
int oldi;
/* get starting line number in the old file */
oldi = na[*newi].lineloc;
if (nspan == 1)
printf ("%dm%d\n", oldi, *newi);
else if (nspan > 1)
printf ("%d,%dm%d,%d\n", oldi, oldi+nspan-1, *newi, *newi+nspan-1);
else
{
fprintf (stderr, "%s: unexpected negative nspan = %d in function eatnew\n", cmd, nspan);
exit (1);
}
*newi += nspan;
}
/*
* Print new lines inserted in old file to transform into new file
* input
* oldi - one after the current old line number !!!!!!!
* *newi - first line number in new file of inserted line.
* output
* *newi - next line number in new file after inserted block.
* printed lines of the form
*
5a7 // after old line 5 add one new line as new line 7
> new line 7
* or
5a7,8 // after old line 5 add two new lines as new lines 7 and 8
> new line 7
> new line 8
*/
newinsert (oldi, newi)
int oldi,
*newi;
{
int nsize, /* number of lines inserted in new file */
csize, /* current number of lines in a loop */
curnewline; /* current line number in new file */
if (debug)
printf ("debug: Found New Insert. Old line %d. New line %d.\n", oldi, *newi);
/* first compute size so we know which format to use (single line
* insert or multiline insert)
* Get the number of lines in the insert block.
*/
nsize = inssize (*newi);
if (nsize == 1)
printf ("%da%d\n", oldi-1, *newi);
else if (nsize > 1)
printf ("%da%d,%d\n", oldi-1, *newi, *newi+nsize-1);
else
{
fprintf (stderr, "%s: unexpected new block size = %d in function newinsert\n", cmd, nsize);
exit (1);
}
/* print the new inserted lines */
for (curnewline = *newi, csize = nsize; csize > 0; ++curnewline, --csize)
printf ("> %s\n", symtbl [ na[curnewline].lineloc ].stashline);
/* return the new line */
*newi += nsize;
}
/*
* Print that a group of lines in the old file was deleted
* input -
* *oldi - line number in old file where the group of lines begins
* newi - line number + 1 in new file after which the delete occurs.
* output
* *oldi - updated with the next old line after the delete block
5d7 // old line 5 deleted after new line 7
< old line 5
5,6d7 // old lines 5 and 6 deleted after new line 7
< old line 5
< old line 6
*
*/
olddelete (oldi, newi)
int *oldi,
newi;
{
int osize, /* number of lines deleted in old file */
csize, /* current size in the loop */
curoldline; /* current line number in old file */
if (debug)
printf ("debug: Found Old Delete. Old line %d. New line %d.\n", *oldi, newi);
/* first compute size so we know which format to use (single line
* delete or multiline delete)
* Get the number of lines in the old delete block.
*/
osize = delsize (*oldi);
/* print the header for the line deletes */
if (osize == 1)
printf ("%dd%d\n", *oldi, newi-1);
else if (osize > 1)
printf ("%d,%dd%d\n", *oldi, *oldi+osize-1, newi-1);
else
{
fprintf (stderr, "%s: unexpected old block size = %d in function olddelete\n", cmd, osize);
exit (1);
}
/* print the old deleted lines */
for (curoldline = *oldi, csize = osize; csize > 0; ++curoldline, --csize)
printf ("< %s\n", symtbl [ oa[curoldline].lineloc ].stashline);
/* return the old line after the delete block */
*oldi += osize;
}
/*
* Print that a group of old lines has been updated to a group of new lines.
* input
* *oldi = line number of start of old lines
* *newi = line number of start of new lines
* output
* *oldi = line number after block deleted in the old file
* *newi = line number after block inserted in the new file
* print statements according to one of 4 forms:
startold c startnew
startold c startnew,endnew
startold,endold c startnew
startold,endold c startnew,endnew
* Examples
5c7 // old line 5 has been replaced by new line 7
< old line 5
---
> new line 7
5,6c7,8 // old lines 5 and 6 have been replaced by new lines 7 and 8
< old line 5
< old line 6
---
> new line 7
> new line 8
*/
oldchange (oldi, newi)
int *oldi,
*newi;
{
int curold, /* current line number in old file */
curnew, /* current line number in new file */
osize, /* size of block changed in old file */
nsize, /* size of block changed in new file */
csize, /* current block size for the loop */
curoldline, /* current line number in old file */
curnewline; /* current line number in new file */
if (debug)
printf ("debug: Found Changed Lines. Old line %d. New line %d\n", *oldi, *newi);
/* get the size of the old deleted block and the new inserted block */
osize = delsize (*oldi);
nsize = inssize (*newi);
/* print the header for the old deleted block */
if (osize == 1)
printf ("%dc", *oldi);
else if (osize > 1)
printf ("%d,%dc", *oldi, (*oldi)+osize-1);
else
{
fprintf (stderr, "%s: unexpected old block size = %d in function oldchange\n", cmd, osize);
exit (1);
}
/* print the header for the new inserted block */
if (nsize == 1)
printf ("%d\n", *newi);
else if (nsize > 1)
printf ("%d,%d\n", *newi, (*newi)+nsize-1);
else
{
fprintf (stderr, "%s: unexpected new block size = %d in function oldchange\n", cmd, nsize);
exit (1);
}
/* Now print the old changed (delete) lines */
for (curoldline = *oldi, csize = osize; csize > 0; ++curoldline, --csize)
printf ("< %s\n", symtbl [ oa[curoldline].lineloc ].stashline);
/* print line change separator */
printf (CHANGESEP);
/* Now print the new changed (inserted) lines */
for (curnewline = *newi, csize = nsize; csize > 0; ++curnewline, --csize)
printf ("> %s\n", symtbl [ na[curnewline].lineloc ].stashline);
/* return the updated old and new line numbers */
*oldi += osize;
*newi += nsize;
}
/*
* these next two routines are called upon by functions
* olddelete, newinsert, and oldchange
*/
/*
* starting with line number 'newi' in the new file, count the number of
* lines in the insert block.
*
* Loop thru the insert block. The block is a series of new lines
* that aren't linked to lines in the old file. That is, each
* inserted line points to a symbol table entry.
*/
inssize (newi)
int newi;
{
register
int curnewline, /* current line number in new file */
nsize; /* number of new lines in insert block */
curnewline = newi;
while (curnewline <= lastnew+1 && na[curnewline].flag == L_SYMINDX)
++curnewline;
/* curnewline is now the new line number AFTER the insert block */
/* compute the number of lines in the insert block */
nsize = curnewline - newi;
if (debug)
printf ("debug: at new line %d, the insert block has %d lines\n", newi, nsize);
return nsize;
}
/*
* compute the number of lines in the old delete block beginning
* with old line number 'oldi' and return it.
*
* Loop thru the delete block. The block is a series of old lines
* that aren't linked to lines in the new file. That is, each
* deleted line points to a symbol table entry.
*/
delsize (oldi)
int oldi;
{
register
int curoldline, /* current line number in old file */
osize; /* number of old lines in delete block */
curoldline = oldi;
while (curoldline <= lastold+1 && oa[curoldline].flag == L_SYMINDX)
++curoldline;
/* curoldline is now the old line number AFTER the delete block */
/* compute the number of lines in the delete block */
osize = curoldline - oldi;
if (debug)
printf ("debug: at old line %d, the delete block has %d lines\n", oldi, osize);
/* return the delete block size */
return osize;
}
/*
* dump the oa and na internal arrays which is the crudest way
* to print the file comparisons.
* This routine may be called in lieu of pass 6 as a debugging tool.
*/
dumparray ()
{
int oldi,
newi;
/* an old line with no link into the new file was deleted */
for (oldi = 0; oldi <= lastold+1; ++oldi)
if (oa[oldi].flag == L_LINENUM)
printf ("oa[%d] = %d\n", oldi, oa[oldi].lineloc);
else
printf ("oa[%d] DELETED.\n", oldi);
/* a new line with no link into the old file was inserted */
for (newi = 0; newi <= lastnew+1; ++newi)
if (na[newi].flag == L_LINENUM)
printf ("na[%d] = %d\n", newi, na[newi].lineloc);
else
printf ("na[%d] INSERTED.\n", newi);
}
/* borrowed from hashname.c
* eventually will be removed.
*/
unsigned int hashline (name,modval)
register char *name;
register unsigned int modval;
{
register unsigned int i;
i=0;
while (*name != '\0'){
i=((i<<2)+(*name&~040))%modval;
name++;
}
return(i);
}
/*
* link line 'oldi' in the old file to line 'newi' in the new file
*/
linkup (oldi, newi)
int oldi,
newi;
{
oa[oldi].lineloc = newi;
oa[oldi].flag = L_LINENUM;
na[newi].lineloc = oldi;
na[newi].flag = L_LINENUM;
}
/*
* pass 5a converts block moves to delete/insert pairs
*/
pass5a ()
{
int oldi,
newi;
for (oldi = 1, newi = 1; oldi <= lastold+1; )
{
/* test the bounds first so we never index past the sentinel entry */
while (oldi <= lastold+1 && oa[oldi].flag == L_SYMINDX)
++oldi; /* skip deletes in old file */
while (newi <= lastnew+1 && na[newi].flag == L_SYMINDX)
++newi; /* skip inserts in new file */
if (oldi > lastold+1)
break;
if (newi > lastnew+1)
break;
if (oa[oldi].lineloc == newi) /* begin matching lines */
{
oldi++;
newi++;
}
else /* discontinuity ?*/
{
if (oa[oldi].lineloc != lastnew+1)
resolve (oldi, newi); /* yes */
else
; /* no, sentinel */
}
}
}
/*
* resolve a discontinuity found by pass5a at old line 'oldi' and
* new line 'newi'. Measure the block starting at each of the two lines,
* then unlink the shorter block by pointing its old and new lines back
* at their symbol table entry, so that it is reported as a
* delete/insert pair instead of a move.
*/
resolve (oldi, newi)
int oldi,
newi;
{
int xo, xn;
int t, ospan, nspan;
int symi;
/* measure block starting at oa[oldi] */
xo = oldi;
do
{
t = 1 + oa[xo].lineloc;
xo++;
}
while (t != oa[xo].lineloc);
ospan = xo - oldi;
/* measure block starting at na[newi] */
xn = newi;
do
{
t = 1 + na[xn].lineloc;
xn++;
}
while (t != na[xn].lineloc);
nspan = xn - newi;
if (ospan < nspan)
{
xo = oldi;
xn = oa[oldi].lineloc;
t = ospan;
oldi = oldi + ospan;
}
else
{
xn = newi;
xo = na[newi].lineloc;
t = nspan;
newi = newi + nspan;
}
while (t > 0)
{
symi = 0;
while (symi < BIGPRIME && symtbl[symi].olinenum != xo)
symi++;
if (symi >= BIGPRIME)
{
fprintf (stderr, "%s: can't find old line %d in symtbl\n", cmd, xo);
dumpsym ();
exit (1);
}
/* link the lines to the same symbol tbl entry */
oa[xo].flag = L_SYMINDX;
oa[xo].lineloc = symi;
na[xn].flag = L_SYMINDX;
na[xn].lineloc = symi;
++xo;
++xn;
--t;
}
}
/*
* open global files 'oldfile' and 'newfile'
*/
openfiles ()
{
/* open old source file for reading */
oldfp = fopen (oldfile, "r");
if (oldfp == (FILE *)NULL)
{
sprintf (errbuf, "%s: can't open oldfile %s", cmd, oldfile);
perror (errbuf);
exit (1);
}
/* open new source file for reading */
newfp = fopen (newfile, "r");
if (newfp == (FILE *)NULL)
{
sprintf (errbuf, "%s: can't open newfile %s", cmd, newfile);
perror (errbuf);
exit (2);
}
}
/*
* close both files
*/
closefiles ()
{
fclose (oldfp);
fclose (newfp);
}
/*
* dump contents of symbol table
* For now, only the old line number and the line of text.
*/
dumpsym ()
{
register int i;
printf ("Symbol Table Contents:\n");
printf ("------ ----- --------\n");
printf ("old_line_num: <line of text>\n");
for (i = 0; i < BIGPRIME; ++i)
if (symtbl[i].olinenum != 0)
printf ("%d: %s\n", symtbl[i].olinenum, symtbl[i].stashline);
}
/*
* get number of old lines which match new lines.
*
* return the number of lines n in the old file such that
* startline <= n < endline which are linked to lines in the new
* file, and whose new line numbers are strictly monotonically increasing (they
* don't have to increase by exactly one, but they can't repeat or decrease).
* Once the monotonic increase stops, we stop counting.
* 'startline' is definitely matched to some line in new file.
*/
gomatch (startline, endline)
int startline,
endline;
{
register int count, /* lines counted in old file */
newi, /* current new line number last matched */
oldi; /* current old line number */
/* stop counting lines if we reach endline or find a non-monotonic
* increase in the new file. newi set to zero is virtual line begin.
*/
for (oldi = startline, count = 0, newi = 0;
oldi < endline && oldi <= lastold+1; ++oldi)
if (oa[oldi].flag != L_LINENUM) /* old line doesn't match any new line */
continue; /* so skip old line */
else if (oa[oldi].lineloc > newi) /* monotonic? */
{
newi = oa[oldi].lineloc; /* yes, save new line number */
++count;
}
else
break; /* no, stop counting */
return count;
}
/*
* same idea as gomatch, except count lines in new file which match lines in
* old file and old file lines are strictly monotonically increasing
*/
gnmatch (startline, endline)
int startline,
endline;
{
register int count,
oldi,
newi;
/* stop counting lines if we reach endline or find a non-monotonic
* increase in the old file. oldi set to zero is virtual line begin.
*/
for (newi = startline, count = 0, oldi = 0;
newi < endline && newi <= lastnew+1; ++newi)
if (na[newi].flag != L_LINENUM) /* new line doesn't match any old line */
continue; /* so skip new line */
else if (na[newi].lineloc > oldi) /* monotonic? */
{
oldi = na[newi].lineloc; /* yes, save old line number */
++count; /* bump count */
}
else
break; /* no, stop counting */
return count;
}
SHAR_EOF
if test 34748 -ne "`wc -c < 'hdiff.c'`"
then
echo shar: error transmitting "'hdiff.c'" '(should have been 34748 characters)'
fi
fi
echo shar: extracting "'remwhite.c'" '(5649 characters)'
if test -f 'remwhite.c'
then
echo shar: will not over-write existing file "'remwhite.c'"
else
cat << \SHAR_EOF > 'remwhite.c'
/*
* f=remwhite.c
* author - dennis bednar 8 30 84
*
* library and standalone routine to remove excess white space from a string.
* If there are multiple white spaces together they are transformed into
* one blank character. New lines (if any) in the string are not touched.
* A string can consist of more than one line (ie multiple '\n' newline
* separators in the string), but usually a string will
* consist of one line followed by newline, and then terminated.
*
* White space at the beginning of each line in the string is reduced to
* one blank, if rem1stwhite=0;
* White space at the beginning of each line in the string is reduced to
* no-blanks, if rem1stwhite=1;
* White space between non-white characters within each line is
* reduced to one blank.
* White space at the end of the line in the string is REMOVED.
*/
#include <stdio.h>
#include "stripnl.h"
char *fgets ();
/* max strlen (number of chars) returned by remwhite */
#define MAXSTRING 1024
/* max line length that main can handle */
#define LINESIZE 1024 /* user sees max line length */
#define _LINESIZE (LINESIZE+2) /* declare buffer size room for "\n\0" at end*/
/* last char at the end is for the NULL terminator */
/* this is the output buffer returned by remwhite() */
static char obuffer [MAXSTRING+1];
char *
remwhite (inbuf, rem1stwhite)
char *inbuf;
char rem1stwhite; /* 0 = leave leading white space if any */
/* 1 = remove leading white space if any */
{
register char *src, /* current pointer into input buf */
*dst, /* current pointer into output buf */
*end; /* first char AFTER end of output buffer */
int inwhite; /* true iff in middle of white space */
inwhite = 0;
/* initialize the src and dest ptrs */
src = inbuf;
dst = obuffer;
nextline:
/* skip over leading white space in this 'line' in input buffer */
/* DONT treat newlines as white space, otherwise, if there were
* multiple lines in the input string, then we would remove them,
* and we don't want to.
* src is positioned at the beginning of a line within inbuf.
*/
if (rem1stwhite)
for (; *src; ++src)
if (*src == ' ' || *src == '\t')
continue;
else
break; /* found first non-white input char */
/* this logic is implemented to output the blank char AFTER
* an inwhite to !inwhite state transition.
* The reason is so that white space at the end
* of the string will be removed.
*/
for (end = &obuffer[sizeof(obuffer)]; *src; ++src)
{
if (*src == ' ' || *src == '\t')
{
inwhite = 1; /* don't output anything, just remember we're seeing white space */
continue;
}
if (inwhite) /* transition, time to output the old blank */
{
inwhite = 0;
if (*src != '\n') /* white space just before a newline is trailing, so drop it */
{
*dst++ = ' ';
if (dst >= end)
goto error;
}
}
*dst++ = *src;
/* prevent output buffer overflows */
if (dst >= end)
{
error:
fprintf (stderr, "remwhite: overran buffer. More than %d chars.\n", MAXSTRING);
exit (5);
}
if (*src == '\n') /* found end of a line in the input buffer */
{
++src; /* step past the newline, since the goto skips the loop increment */
goto nextline; /* work on beginning of next line */
}
}
/* terminate the string */
*dst = '\0';
return obuffer;
}
#ifdef STAND
/*
* test out the remwhite function
*/
static char *cmd; /* name of this command */
/*
* usage:
cmd [-a] [file ...] # -a means remove leading white space in each line
# if no file(s) given, use stdin
*/
main (argc, argv)
int argc;
char **argv;
{
register int i;
int striplead = 0; /* default is DONT zap leading blanks */
/* ie default is to keep leading white space */
/* 1 would change " 1 2" to "1 2" */
/* 0 would leave " 1 2" as " 1 2" */
cmd = argv [0];
/* first process possible options */
for (i = 1; i < argc; ++i)
if (strcmp(argv[i], "-a") == 0)
{
striplead = 1;
continue;
}
else
break;
/* i now is index of first file in arg list, if any */
if (i == argc)
dofile ("", striplead); /* read from stdin */
else for ( ;i < argc; ++i) /* read each file */
dofile (argv[i], striplead);
exit (0); /* normal */
}
/*
* process a file
*/
dofile (filename, zapflag)
char *filename;
int zapflag; /* 1 = zap leading white space */
{
FILE *infp, /* input file */
*fopen (); /* fopen (3) */
if (*filename == '\0')
{
filename = "[stdin]"; /* in case we need to print file name */
infp = stdin;
}
else
{
infp = fopen (filename, "r");
if (infp == (FILE *)NULL)
{
sprintf (obuffer, "%s: can't open %s", cmd, filename);
perror (obuffer);
exit (2);
}
}
dolines (infp, filename, zapflag);
fclose (infp);
}
/*
* process the lines
*/
dolines (infp, filename, zapflag)
FILE *infp; /* stream pointer to read the input file */
char *filename; /* name of file we are reading from (for msg) */
int zapflag; /* 1 = zap leading white space */
{
char buffer [_LINESIZE], /* hold line from stdin here */
*cp; /* get pointer to compressed line */
int stat, /* status after stripping newline */
linenum; /* current line number we are reading */
/* read lines from infp and take out the blanks */
/* print them to show the effect */
linenum = 0;
while (fgets (buffer, sizeof(buffer), infp) != (char *) NULL)
{
++linenum;
stat = stripnl (buffer, sizeof(buffer) );
if (stat == L_SUCCESS)
; /* okay */
else if (stat == L_BADFORM)
fprintf (stderr, "%s: Warning, line %d in file %s not terminated by newline.\n", cmd, linenum, filename);
else
{
fprintf (stderr, "%s: Line %d in file %s longer than %d chars\n", cmd, linenum, filename, LINESIZE);
exit (4);
}
cp = remwhite (buffer, zapflag);
printf ("%s\n", cp);
}
}
#endif
SHAR_EOF
if test 5649 -ne "`wc -c < 'remwhite.c'`"
then
echo shar: error transmitting "'remwhite.c'" '(should have been 5649 characters)'
fi
fi
echo shar: extracting "'stripnl.c'" '(1745 characters)'
if test -f 'stripnl.c'
then
echo shar: will not over-write existing file "'stripnl.c'"
else
cat << \SHAR_EOF > 'stripnl.c'
/*
* f=stripnl.c
* author - dennis bednar 8 31 84
*
* this routine is designed to work in conjunction with fgets ().
* In this case, the L_BUFOVER error should never be returned, but
* the check is there anyway!
*
* strip new lines by converting the newline in the string to a null
* returns
* L_BUFOVER = buffer is overfilled. For example if bufsize = 10, then
* the string len could be at most 9 (so the NULL fits)
* L_TOOLONG = no newline was found (line too long) and buffer was filled.
* For a bufsize of 10, buffer[9] is the last char, and it's
* not the expected newline.
* L_BADFORM = no newline was found and buffer not filled, so bad format
* L_SUCCESS = success
*/
#include "stripnl.h"
stripnl (buffer, bufsize)
char *buffer;
int bufsize;
{
char *cp;
int len; /* number of chars in string */
/* get string length */
len = strlen (buffer);
/* make sure it's not already overrun */
if (len >= bufsize)
return (L_BUFOVER);
/* if "" is passed, len is zero, and so 'last char' doesn't exist */
if (len <= 0)
return (L_BADFORM);
/* now the string definitely fits in the buffer */
/* get pointer to last char in string */
cp = &buffer [len - 1];
/* if the last char of the string is a newline, change it to a NULL */
if (*cp == '\n')
{
*cp = '\0';
return L_SUCCESS;
}
/* now the string fits in the buffer, and the last char != '\n' */
/* line too long if the string length is the buffer size minus
* one for the null.
*/
else if (len >= bufsize-1)
return L_TOOLONG;
/* badly formatted if the line fits in the buffer, but
* contains no newline at the end.
* This happens on a file which is corrupted by, say,
* a transmission error.
*/
else
return L_BADFORM;
}
SHAR_EOF
if test 1745 -ne "`wc -c < 'stripnl.c'`"
then
echo shar: error transmitting "'stripnl.c'" '(should have been 1745 characters)'
fi
fi
echo shar: extracting "'stripnl.h'" '(125 characters)'
if test -f 'stripnl.h'
then
echo shar: will not over-write existing file "'stripnl.h'"
else
cat << \SHAR_EOF > 'stripnl.h'
/* returned from the function stripnl */
#define L_BUFOVER -2
#define L_TOOLONG -1
#define L_BADFORM 0
#define L_SUCCESS 1
SHAR_EOF
if test 125 -ne "`wc -c < 'stripnl.h'`"
then
echo shar: error transmitting "'stripnl.h'" '(should have been 125 characters)'
fi
fi
exit 0
# End of shell archive
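
For readers porting stripnl() to a current compiler, the return-code
contract documented in stripnl.c can be sketched in ANSI C roughly as
follows. This is an illustration only and is not part of the archive:
the names strip_newline and STRIP_* are hypothetical, chosen to mirror
L_BUFOVER, L_TOOLONG, L_BADFORM and L_SUCCESS from stripnl.h, and the
use of size_t for the buffer size is an assumption (the original takes
an int).

```c
#include <string.h>

/* Hypothetical names; the values mirror the L_* codes in stripnl.h. */
enum strip_status {
    STRIP_BUFOVER = -2,  /* string already fills or overruns the buffer */
    STRIP_TOOLONG = -1,  /* buffer is full and the last char is not '\n' */
    STRIP_BADFORM = 0,   /* room left in the buffer, but no trailing '\n' */
    STRIP_SUCCESS = 1    /* trailing '\n' found and replaced by '\0' */
};

/* Strip one trailing newline from a buffer that was filled by fgets(). */
static enum strip_status strip_newline(char *buffer, size_t bufsize)
{
    size_t len = strlen(buffer);

    if (len >= bufsize)             /* no room left for the terminator */
        return STRIP_BUFOVER;
    if (len == 0)                   /* empty string has no last char */
        return STRIP_BADFORM;
    if (buffer[len - 1] == '\n') {
        buffer[len - 1] = '\0';
        return STRIP_SUCCESS;
    }
    if (len >= bufsize - 1)         /* fgets() stopped because it ran out of room */
        return STRIP_TOOLONG;
    return STRIP_BADFORM;           /* short line with no newline, e.g. at EOF */
}
```

A caller would typically read a line with fgets (buf, sizeof buf, fp),
pass the same buffer and size to strip_newline, and treat STRIP_BADFORM
as a warning and STRIP_TOOLONG as a fatal error, as dolines() in
remwhite.c does with the original codes.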