Summary of (Re: Recompiling News 2.11 with dbm routines using gcc1.34)

Sun Mar 19 18:04:23 AEST 1989

In article <180 at dms3b1.UUCP> dave at dms3b1.UUCP (Dave Hanna) writes:
|>In article <633 at icus.islp.ny.us> lenny at icus.islp.ny.us (*I*) write:

...[Talk about me trying to get the news up to patch level 17, recompiled
    with the dbm routines, and lastly using gcc]...

|>With your experience and history, I have no doubt that you'll be successful,
|>but I want to wish you good luck, and do let us know how it works out.

Well, thanks for the vote of confidence.  I first want to thank all those
who replied; there are too many to mention here.  I must have gotten about
15 copies of the mdbm patches to use with expire (*sigh*), but thanks
nevertheless.   I'll enclose the patch that lets you use the mdbm
routines in place of the dbm routines; it is needed because the mdbm
routines use different extensions on the indexed files.  With the patch, those
files will be linked, and you'll have no problem.  (Make sure you compile
with the -DDBM and -DMDBM flags.)

Well, here's a quick summary of my trials and tribulations in compiling
this bugger.   I'm finally up to patchlevel 17 of the news, and it is
using the mdbm routines.  This wasn't without some headaches along the
way.  I must have compiled the netnews software at least 5 times this
past week (ugh).  Patchlevel 17 did bring some bugs with it, but after
I accumulated a bunch of patches that were posted to the net shortly after
its release, most, if not all, of the bad bugs were banished
forever (or so I hope).

I did compile the news with the gcc compiler (once) but gave up after
it started core dumping all over the place.  It wasn't something I
really wanted to deal with at 3:00AM.  I recompiled with
the UNIX PC compiler, and installed it ... just in time for my day's worth
of news to be sent.

A few people helpfully pointed out the problems with gcc and the dbm
routines (where the core dumps probably originated).  Brant Cheikes
kindly gave me this information, but I also got it from a few other
sources :-)

|Some of the dbm routines return structs, rather than struct *'s.
|Pcc and Gcc use incompatible struct-passing facilities.  The upshot is
|that if you compile with Gcc, you either must also compile your dbm
|routines with Gcc, *or* you must use the -fpcc-struct-return option
|(check that, I may not have gotten the name just right).
|
|Personally, I have a /usr/lib/libpdbm.a containing the pcc-compiled
|dbm routines, and a libgdbm.a for the Gcc-compiled routines.  Why?
|Because I've heard that the pcc-struct-return option (new in 1.34, I
|believe) is a bit buggy.
|...
|Brant Cheikes
|University of Pennsylvania, Department of Computer and Information Science
|brant at manta.pha.pa.us, brant at linc.cis.upenn.edu, bpa!manta!brant

Gcc was a nice thought, but right now I don't have time to deal with
possibly buggy code for the -fpcc-struct-return option.   Maybe when it's
fixed in 1.35, I'll deal with it again.
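
For anyone wondering what the struct-return issue looks like in code, here
is a minimal sketch.  It only shows the two interface shapes involved; it
does not reproduce the crash.  The names fetch_by_value() and fetch_into()
and the sample key are made up for illustration and are not part of dbm or
netnews.

```c
#include <stddef.h>

/* Same shape as the dbm datum: a pointer plus a length. */
typedef struct {
    char *dptr;
    int   dsize;
} datum;

/* Illustrative data only; not a real history entry. */
static char sample[] = "<633@icus>";

/*
 * Struct returned by value, the way the dbm fetch() does it.  How the
 * result travels back to the caller is chosen by the compiler, so the
 * caller and the library must be built with the same convention.
 */
datum fetch_by_value(void)
{
    datum d;

    d.dptr = sample;
    d.dsize = (int)sizeof(sample);   /* length includes the trailing NUL */
    return d;
}

/*
 * Hypothetical alternative: the caller supplies the storage and only a
 * pointer crosses the call boundary, so there is no struct-return
 * convention to disagree about.
 */
int fetch_into(datum *out)
{
    if (out == NULL)
        return -1;
    out->dptr = sample;
    out->dsize = (int)sizeof(sample);
    return 0;
}
```

Since the dbm interface is fixed as the first form, the practical choices
are the ones Brant lists: build the library and its callers with the same
compiler, or try the -fpcc-struct-return option.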

Robert Granvin (rjg at sialis) pointed out that he is (and has been for a while)
using the dbz.c set of routines, which mimic the dbm routines store()
and fetch() closely enough to satisfy the requirements of netnews.  It's
a short program, so I'll include it here.  All that was necessary was to compile
with -DDBM and link with dbz.o (instead of -ldbm or -lmdbm).  This seems
a little easier, but who knows.
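
The idea behind dbz.c, as its header comment describes it, is that the
index holds only offsets into the history file and the key is checked
against the history text itself.  A rough in-memory sketch of that scheme
follows.  Everything named toy_* is hypothetical, the hash is a simple
stand-in for the table-driven one in dbz.c, and the table lives in memory
rather than in a .pag file.  An empty slot is marked by 0 and offsets are
stored as offset + 1 (dbz.c adds sizeof(long) for the same purpose).

```c
#include <string.h>

#define SLOTS 101              /* small prime for illustration; dbz.c uses 99991 */

static long slots[SLOTS];      /* 0 = empty slot, otherwise offset + 1 */
static const char *history;    /* stands in for the history file contents */

/* Point the table at a block of history text and empty every slot. */
void toy_open(const char *text)
{
    history = text;
    memset(slots, 0, sizeof slots);
}

/* Simple stand-in hash; NOT the hash function used by dbz.c. */
static unsigned long toy_hash(const char *key)
{
    unsigned long h = 0;

    while (*key)
        h = h * 31 + (unsigned char)*key++;
    return h % SLOTS;
}

/* Remember that the history line for `key` starts at `offset`. */
int toy_store(const char *key, long offset)
{
    unsigned long i = toy_hash(key);
    int tries;

    for (tries = 0; tries < SLOTS; tries++) {
        if (slots[i] == 0) {           /* first free slot wins */
            slots[i] = offset + 1;
            return 0;
        }
        i = (i + 1) % SLOTS;           /* linear probe, wrapping around */
    }
    return -1;                         /* table is full */
}

/* Return the offset of the history line that starts with `key`, or -1. */
long toy_fetch(const char *key)
{
    unsigned long i = toy_hash(key);
    size_t len = strlen(key);
    int tries;

    for (tries = 0; tries < SLOTS && slots[i] != 0; tries++) {
        long offset = slots[i] - 1;

        /* The key is not stored in the index; compare against the text. */
        if (strncmp(history + offset, key, len) == 0)
            return offset;
        i = (i + 1) % SLOTS;
    }
    return -1;                         /* hit an empty slot: not present */
}
```

Comparing only the first strlen(key) characters is safe here for the
reason the dbz.c comment gives: message-IDs end with '>', so a full key
never matches just the front part of a different one.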

A good idea that was pointed out to me, unfortunately one day too late, was
a way to save the current history.  I had just removed the history.d/[0-9] files
and then run /usr/lib/news/expire -R to [re]build the history files
as DBM files (note this took over 1.5 hours on my machine with all the news
I had).   Kent Forschmiedt (kent at happym) suggested I do this instead:

|o 2.11.14 does not keep a whole history file, only the ten pieces in 
|   history.d, so cat them together and sort them:
|
|   cat history.d/[0-9] | sort +1.6 -2 +1 >history
|
|o When everything has compiled, run expire -R to create the dbm files.
|
|o If you are using rn, you need to fix it, too.  This is easy.  Run
|  Configure as usual, do anything you need to do to it, and add
|  "#define DBM" to defs.h.

News unbatches *so* much faster now.  It used to take about 12-15 hours
to unbatch 8000-10000 blocks of compressed news.  Now it's done in about
2-4 hours.  A big difference!   I've noticed that expire runs take longer,
but that's because expire has to recreate the history dbm files each time
it is run.  I don't care too much about this, since it's chugging along
at 3:00AM anyway.

For those who care, here's what my current history files look like for 15
days worth of history:

2480 -rw-r--r--  1 news    news  1269622 Mar 19 02:27 /usr/lib/news/history
7980 -rw-r--r--  2 news    news  4085760 Mar 19 02:27 /usr/lib/news/history.dat
   9 -rw-r--r--  2 news    news     4352 Mar 19 02:26 /usr/lib/news/history.dir
   9 -rw-r--r--  2 news    news     4352 Mar 19 02:26 /usr/lib/news/history.map
7980 -rw-r--r--  2 news    news  4085760 Mar 19 02:27 /usr/lib/news/history.pag
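
The link count of 2 and the matching sizes on the .dat/.pag and .dir/.map
pairs suggest that each pair is two names for one file, which fits the
note above about the files being linked.  A small sketch of how to check
that from C is below; same_file() is a made-up helper, not something from
the news sources.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Returns 1 if both paths name the same file (same device and inode),
 * 0 if they are different files, and -1 if either cannot be examined.
 */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

For example, same_file("/usr/lib/news/history.dat",
"/usr/lib/news/history.pag") should return 1 if the two names really are
links to one file.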

If there's anything else I can help with, please let me know.

-Lenny

#! /bin/sh
# This is a shell archive.  Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file".  To overwrite existing
# files, type "sh file -c".  You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g..  If this archive is complete, you
# will see the following message at the end:
#		"End of shell archive."
# Contents:  mdbm.pat dbz.c
# Wrapped by lenny at icus on Sun Mar 19 03:02:47 1989
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f mdbm.pat -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"mdbm.pat\"
else
echo shar: Extracting \"mdbm.pat\" \(1034 characters\)
sed "s/^X//" >mdbm.pat <<'END_OF_mdbm.pat'
X*** expire.c.old	Mon Mar 13 17:11:14 1989
X--- expire.c	Mon Mar 13 17:38:30 1989
X***************
X*** 841,846 ****
X--- 851,864 ----
X  			(void) sprintf(tempname,"%s.dir", ARTFILE);
X  			(void) strcpy(rindex(NARTFILE, '.'), ".dir");
X  			(void) rename(NARTFILE, tempname);
X+ #ifdef	MDBM
X+ 			(void) sprintf(tempname, "%s.dat", ARTFILE);
X+ 			(void) strcpy(rindex(NARTFILE, '.'), ".dat");
X+ 			(void) rename(NARTFILE, tempname);
X+ 			(void) sprintf(tempname, "%s.map", ARTFILE);
X+ 			(void) strcpy(rindex(NARTFILE, '.'), ".map");
X+ 			(void) rename(NARTFILE, tempname);
X+ #endif	/* MDBM */
X  		}
X  #endif
X  	}
X***************
X*** 1251,1256 ****
X--- 1284,1297 ----
X  		(void) UNLINK(tempname);
X  		(void) sprintf(tempname,"%s.dir", NARTFILE);
X  		(void) UNLINK(tempname);
X+ 		
X+ #ifdef	MDBM
X+ 		(void) sprintf(tempname, "%s.dat", NARTFILE);
X+ 		(void) UNLINK(tempname);
X+ 		(void) sprintf(tempname, "%s.map", NARTFILE);
X+ 		(void) UNLINK(tempname);
X+ #endif	/* MDBM */
X+ 
X  #else	/* !DBM */
X  		(void) UNLINK(ARTFILE);
X  #endif	/* !DBM */
END_OF_mdbm.pat
if test 1034 -ne `wc -c <mdbm.pat`; then
    echo shar: \"mdbm.pat\" unpacked with wrong size!
fi
# end of overwriting check
fi
if test -f dbz.c -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"dbz.c\"
else
echo shar: Extracting \"dbz.c\" \(5385 characters\)
sed "s/^X//" >dbz.c <<'END_OF_dbz.c'
X/*
X
Xdbz.c  V1.0 
X
XCopyright 1988 Jon Zeeff (umix!b-tech!zeeff)  
XYou can use this code in any manner, as long as you leave my name on it
Xand don't hold me responsible for any problems with it.
X
XThese routines replace dbm as used by the usenet news software 
X(it's not a full dbm replacement by any means).  It's fast and 
Xsimple.  
X
XBSD sites will notice some savings in disk space.  Sys V sites without 
Xdbm will notice much faster operation.  
X
XThis code relies on the fact that news stores a pointer to the history 
Xfile as the dbm data and that keys always end with > (meaning that a 
Xfull key should never match a partial key and that you don't need to 
Xkeep track of key size).  It doesn't store another copy of the key 
Xlike dbm does so it saves disk space.  All you can do is fetch() and 
Xstore() data.  
X
XJust make news with the dbm option and link with dbz.o.
X
X*/
X
X/* 
X   Set this to the something several times larger than the maximum # of 
X   lines in a history file.  It should be a prime number.
X*/
X
X#define INDEX_SIZE 99991 
X
X#include <stdio.h>
X#include <sys/types.h>
X#include <sys/stat.h>
X#include <fcntl.h>
X#include <string.h>
X
Xlong	lseek();
X
Xtypedef struct {
X	char	*dptr;
X	int	dsize;
X} datum;
X
Xstatic char	buffer[1024];   /* used for fetch returns */
Xstatic int	data_file;
Xstatic FILE *index_file = NULL;
Xstatic long	data_ptr;
X
Xdbminit(name)
Xchar	*name;
X{
X	char	string[1024];
X	FILE * fopen();
X
X	if (index_file != NULL) 
X		return - 1;    /* init already called once */
X
X	data_file = open(name, O_RDWR);
X
X	strcpy(string, name);
X	strcat(string, ".pag");
X	index_file = fopen(string, "r+");
X
X	if (index_file == NULL) 
X		return - 1;
X
X	return 0;
X
X}
X
X
Xdbmclose()
X{
X	if (index_file) {
X		fclose(index_file);
X		index_file = NULL;
X		close(data_file);
X	}
X	return 0;
X}
X
X
Xdatum
Xfetch(key)
Xdatum key;
X{
X	long	index_ptr;
X	long	data_ptr;
X	datum output;
X	char	*strchr();
X	long	hash();
X	long	get_ptr();
X	char	*p;
X
X	for (index_ptr = hash(key.dptr, key.dsize); (data_ptr = get_ptr(index_ptr)) != -1; ++index_ptr) {
X		/* we got a pointer into the history file, go see if it's the right one */
X
X		lseek(data_file, data_ptr, 0);
X		read(data_file, buffer, (unsigned)key.dsize);
X		p = buffer;	/* convert string to lower case, as inews calls us that way */
X		while (*p) {	/* - ahby at bungia.mn.org */
X			*p = tolower(*p);
X			p++;
X		}
X
X		/* we can get away without the length info since we know that a 
X       lhs portion of one key != any full key */
X
X		/* We use size - 1 since key has a null but hist file doesn't */
X
X		if (strncmp(key.dptr, buffer, key.dsize - 1) == 0) {
X			/* we found it */
X			output.dptr = (char *) & data_ptr;
X			/* output.dptr = buffer;  /* replaced per bug fix */
X			/* output.dsize = strchr(buffer,'\n')-buffer + 1; not used by news */
X			return output;
X		}
X	}
X
X	/* we didn't find it */
X
X	output.dptr = NULL;
X	output.dsize = 0;
X	return output;
X
X}
X
X
X/* add an entry to the database */
X
Xstore(key, data)
Xdatum key;
Xdatum data;
X{
X
X	return put_ptr(hash(key.dptr, key.dsize), *((long *)data.dptr));
X
X}
X
X
X/* get a data file pointer from the specified location in the index file */
X
Xstatic long	
Xget_ptr(index_ptr)
Xlong	index_ptr;
X{
X	long	data_ptr = 0;
X	int	count;
X
X	/* seek to where it should be */
X	fseek(index_file, (long)(index_ptr * sizeof(long)), 0);
X
X	/* read it */
X	count = fread((char *) & data_ptr, sizeof(long), 1, index_file);
X
X	if (count != 1 || data_ptr == 0) 
X		return - 1;
X
X	data_ptr -= sizeof(long);
X
X	return data_ptr;
X}
X
X
X/* put a data file pointer into the specified location in the index file */
X/* move down further if slots are full */
X
Xstatic 
Xput_ptr(index_ptr, data_ptr)
Xlong	index_ptr;
Xlong	data_ptr;
X{
X	long	i = INDEX_SIZE;
X
X	/* find an empty slot */
X	while (i-- && get_ptr(index_ptr) != -1) 
X		index_ptr = ++index_ptr % INDEX_SIZE;
X
X	/* seek to spot */
X	fseek(index_file, (long)(index_ptr * sizeof(long)), 0);
X
X	/* write in data */
X	data_ptr += sizeof(long);
X	fwrite((char *) & data_ptr, sizeof(long), 1, index_file);
X
X	if (i > 0) 
X		return 0;
X	return - 1;
X
X}
X
X
X/* 
X   A hash function 
X*/
X
X/* some random # tables */
X
Xstatic int	tab1[16] = {
X	1, 13, 17, 7, 26, 53, 32, 36, 39, 41, 45, 48, 6, 60, 61, 57};
X
X
Xstatic long	tab2[64] = {
X	2839722806L,
X	2810802562L,
X	3339723161L,
X	1058549169L,
X	1994454359L,
X	3332972616L,
X	2024961869L,
X	1295774230L,
X	1401270575L,
X	550463474L,
X	4149832130L,
X	593802817L,
X	19870556L,
X	217796119L,
X	2030513628L,
X	2224253180L,
X	4086624884L,
X	2108878202L,
X	4081713897L,
X	1450431192L,
X	2682324517L,
X	3815655141L,
X	3437116049L,
X	1509200303L,
X	3320765546L,
X	592632684L,
X	4115672507L,
X	891714445L,
X	2101308635L,
X	1818339936L,
X	1641458119L,
X	2428086168L,
X	955496606L,
X	3732124452L,
X	263468725L,
X	4181304149L,
X	1757671992L,
X	4105130351L,
X	415998664L,
X	266182449L,
X	4145924110L,
X	3676915477L,
X	3152392912L,
X	3169100473L,
X	3094692794L,
X	449386310L,
X	279014040L,
X	1239368031L,
X	4072577141L,
X	2519571281L,
X	1138855693L,
X	2110586520L,
X	2544733821L,
X	621089071L,
X	828396835L,
X	901327585L,
X	1528193535L,
X	2448354394L,
X	3531633039L,
X	3272908074L,
X	2376181107L,
X	1827926286L,
X	2717853871L,
X	3969548575L};
X
X
Xstatic long	
Xhash(string, size)
Xchar	*string;
Xint	size;
X{
X	int	value1 = 0;
X	long	value2 = 0;
X
X	register unsigned	c;
X
X	while (size--) {
X		c = *string++;
X		value2 += tab2[(value1 += tab1[c & 15]) & 63];
X		value2 += tab2[(value1 += tab1[(c >> 4) & 15]) & 63];
X	}
X
X	if (value2 < 0) 
X		value2 = -value2;
X
X	return value2 % INDEX_SIZE;
X
X}
X
X
END_OF_dbz.c
if test 5385 -ne `wc -c <dbz.c`; then
    echo shar: \"dbz.c\" unpacked with wrong size!
fi
# end of overwriting check
fi
echo shar: End of shell archive.
exit 0
-- 
Lenny Tropiano             ICUS Software Systems         [w] +1 (516) 582-5525
lenny at icus.islp.ny.us      Telex; 154232428 ICUS         [h] +1 (516) 968-8576
{talcott,decuac,boulder,hombre,pacbell,sbcs}!icus!lenny  attmail!icus!lenny
        ICUS Software Systems -- PO Box 1; Islip Terrace, NY  11752