Kernel mods and RTIngres

Wed Jun 5 10:46:00 AEST 1985

	I have been following with interest the recent discussion on
concurrency management in RTI's INGRES, particularly Doug Gwyn's
increasingly acerbic broadsides.  As someone who works at RTI and is
familiar with the issues involved, I believe I am in a position to
correct some of the misperceptions Doug has been sharing with the net.

	Briefly, Doug's arguments are:  it is reprehensible of us to
require our users to install a pseudo-device driver for INGRES
concurrency control.  We have no right to expect our users to make
kernel mods in order to run INGRES, a mere applications program.
Writing a user-level lock manager is straightforward, albeit different
for different versions of UNIX.  Furthermore, there are always the
"flock" system call of 4.2bsd, and the "lockf" system call of the
/usr/group standard.  Finally, Doug says that INGRES does not even make
use of the lock pseudo-device ("concurrent updates are not supported"),
so that making the kernel mod is a pointless exercise.  Doug clearly
implies that we at RTI don't know what we're doing.  His last word (so
far) is:  "I'm not an expert on database systems (yet), but I recognize
poor software design when I see it."

	I'll dispose of the easiest complaint first by noting that it
is simply incorrect that we do not make use of the lock pseudo-device.
We support concurrent updates, as well as multi-statement
transactions.  We use the locking pseudo-device to ensure that all
concurrent transaction executions are serializable, while still
maximizing concurrent access to shared data -- precisely what
concurrency management in a DBMS is all about.

	I'm certainly not wild about the idea of a lock device driver,
and I don't think anyone else at RTI is, either.  We have been
investigating alternatives, and will continue to do so.  However, there
are serious (though not necessarily disqualifying) disadvantages to all
the alternatives we have found so far.

	The "flock" system call, which locks an entire file at a time,
is inadequate for our purposes; we need a finer granularity of locking
than that.  The "lockf" system call, which permits locking of an
arbitrary, contiguous subsection of a file, is getting closer.  But it
is currently available on few, if any, of the systems on which we offer
INGRES.  Neither call provides more than two lock modes, does any
deadlock detection, or performs any reasonable cleanup after a process
exits abnormally (merely releasing the locks that process holds is NOT
enough!).  These are all serious shortcomings.  Granted, we could make
INGRES use these calls; but we would be offering a product with
significantly reduced functionality compared with what we now offer.
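
	To make the deadlock-detection point concrete, here is a minimal
sketch of the kind of check a lock manager has to run and that neither
"flock" nor "lockf" is described as doing.  It is an illustration only,
not the RTI driver:  the waits-for graph representation, the MAX_PROCS
limit, and every name in it are assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

#define MAX_PROCS 8   /* hypothetical limit, chosen only for the sketch */

/* waits_for[a][b] != 0 means process a is blocked on a lock held by b. */
typedef struct {
    unsigned char waits_for[MAX_PROCS][MAX_PROCS];
} wait_graph;

void wg_init(wait_graph *g)
{
    memset(g, 0, sizeof *g);
}

void wg_add_wait(wait_graph *g, int waiter, int holder)
{
    assert(waiter >= 0 && waiter < MAX_PROCS);
    assert(holder >= 0 && holder < MAX_PROCS);
    g->waits_for[waiter][holder] = 1;
}

/* Depth-first search.  state: 0 = unvisited, 1 = on current path, 2 = done. */
static int visit(const wait_graph *g, int p, int state[])
{
    state[p] = 1;
    for (int q = 0; q < MAX_PROCS; q++) {
        if (!g->waits_for[p][q])
            continue;
        if (state[q] == 1)
            return 1;               /* edge back into the path: a cycle */
        if (state[q] == 0 && visit(g, q, state))
            return 1;
    }
    state[p] = 2;
    return 0;
}

/* Returns 1 if some set of processes is waiting in a cycle, else 0. */
int wg_has_deadlock(const wait_graph *g)
{
    int state[MAX_PROCS] = {0};
    for (int p = 0; p < MAX_PROCS; p++)
        if (state[p] == 0 && visit(g, p, state))
            return 1;
    return 0;
}

/* Small self-check: a chain 0 -> 1 -> 2 is fine; closing it with
 * 2 -> 0 is a deadlock.  Returns 1 if both answers are as expected. */
int wg_demo(void)
{
    wait_graph g;
    int before, after;

    wg_init(&g);
    wg_add_wait(&g, 0, 1);
    wg_add_wait(&g, 1, 2);
    before = wg_has_deadlock(&g);
    wg_add_wait(&g, 2, 0);
    after = wg_has_deadlock(&g);
    return before == 0 && after == 1;
}
```

	A real lock manager would also have to pick a victim and abort
its transaction; the sketch stops at detection.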

	The sad fact is that UNIX offers grossly inadequate concurrency
control for a real DBMS.  Anyone who doubts this might do well to look
at silly, old VAX/VMS; its lock manager puts UNIX to shame.
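
	For comparison, the VMS lock manager is commonly documented as
offering six lock modes rather than two.  The sketch below encodes that
compatibility matrix as a table lookup.  The LK_ names and the functions
are hypothetical, chosen for the example; they are not the VMS system
service interface, and nothing here says which modes INGRES itself
uses.

```c
#include <assert.h>

/* The six modes as usually listed for the VMS lock manager: null,
 * concurrent read, concurrent write, protected read, protected write,
 * exclusive.  The LK_ prefix is an assumption of this sketch. */
enum lock_mode { LK_NL, LK_CR, LK_CW, LK_PR, LK_PW, LK_EX, LK_NMODES };

/* compat[held][requested] is 1 if the two modes can coexist. */
static const unsigned char compat[LK_NMODES][LK_NMODES] = {
    /*          NL CR CW PR PW EX */
    /* NL */  { 1, 1, 1, 1, 1, 1 },
    /* CR */  { 1, 1, 1, 1, 1, 0 },
    /* CW */  { 1, 1, 1, 0, 0, 0 },
    /* PR */  { 1, 1, 0, 1, 0, 0 },
    /* PW */  { 1, 1, 0, 0, 0, 0 },
    /* EX */  { 1, 0, 0, 0, 0, 0 },
};

int modes_compatible(enum lock_mode held, enum lock_mode requested)
{
    assert((int)held >= 0 && held < LK_NMODES);
    assert((int)requested >= 0 && requested < LK_NMODES);
    return compat[held][requested];
}

/* A request can be granted only if it is compatible with every mode
 * already held on the resource by other processes. */
int can_grant(const enum lock_mode *held, int nheld,
              enum lock_mode requested)
{
    for (int i = 0; i < nheld; i++)
        if (!modes_compatible(held[i], requested))
            return 0;
    return 1;
}
```

	With only shared and exclusive modes, a reader and a writer of
the same object always conflict; the intermediate modes let a lock
manager admit combinations (for instance LK_CR alongside LK_PW) that a
two-mode scheme would have to serialize.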

	That leaves us with the possibility of a user-level lock
manager process.  Such a process presumably receives lock request
messages on some sort of named message channel, and sends response
messages back.  Named FIFOs and System V messages come to mind for
System V, and sockets for 4.2bsd.  The three major problems with this
approach are:  (1) how does the lock manager (asynchronously) find out
about abnormal termination of a client?  (2) if every client requires
its own communications channel, how does the lock manager support more
than 20 simultaneous clients, given UNIX's open-file limit?  (Yes, this
IS a requirement!); and (3) a lock request will now take 4 system
calls' worth of overhead, rather than 1 (client writes request, server
reads, server writes response, client reads).
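
	The four-call round trip in point (3) can be sketched as
follows.  This is a toy under stated assumptions:  both ends of the
channel live in one process so the calls can be counted in order, the
channel is a socketpair rather than a named FIFO or message queue, and
the lock_msg layout and the always-grant policy are invented for the
example.  A real lock manager would be a separate process, so each
request would also pay for context switches on top of the calls.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical wire format for a lock request and its reply. */
struct lock_msg {
    int client_id;
    int resource;
    int granted;
};

/* Performs one request/response exchange and returns the number of
 * system calls it took, or -1 on error. */
int lock_round_trip(int client_fd, int server_fd, int resource,
                    int *granted)
{
    struct lock_msg req = { 1, resource, 0 };
    struct lock_msg seen, resp;
    int calls = 0;

    assert(granted != NULL);

    /* 1: client writes request */
    if (write(client_fd, &req, sizeof req) != (ssize_t)sizeof req)
        return -1;
    calls++;

    /* 2: server reads request */
    if (read(server_fd, &seen, sizeof seen) != (ssize_t)sizeof seen)
        return -1;
    calls++;

    seen.granted = 1;   /* toy policy: always grant */

    /* 3: server writes response */
    if (write(server_fd, &seen, sizeof seen) != (ssize_t)sizeof seen)
        return -1;
    calls++;

    /* 4: client reads response */
    if (read(client_fd, &resp, sizeof resp) != (ssize_t)sizeof resp)
        return -1;
    calls++;

    *granted = resp.granted;
    return calls;
}

/* Sets up the channel, runs one exchange, and returns the call count
 * (or -1 if anything failed or the lock was not granted). */
int round_trip_demo(void)
{
    int sv[2], granted = 0, calls;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        return -1;
    calls = lock_round_trip(sv[0], sv[1], 42, &granted);
    close(sv[0]);
    close(sv[1]);
    return (calls == 4 && granted == 1) ? calls : -1;
}
```

	By contrast, a request to a lock pseudo-device is a single call
into the kernel, which is the 4-versus-1 comparison made above.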

	Spiros Triantafyllopoulos has contributed to the discussion by
saying that INGRES is slow, and spends too much time in inter-process
communication.  Spiros appeared to be agreeing with Doug, yet Spiros'
goals are diametrically opposed to Doug's.  Doug brushes aside the
performance issue by saying that a user-mode lock manager will perform
"acceptably"; but will it be acceptable to Spiros?

	The point of all this is not to insist that the pseudo-device
driver is a good idea, or to say that all other solutions are
unworkable.  Rather, I hope I have shown that the issues are far from
cut-and-dried:  that every possible approach, including the
pseudo-device driver, has serious disadvantages.  As I mentioned above,
we are actively investigating alternatives.  We welcome constructive
suggestions, as well as simple expressions of preference, particularly
when they shed more light than heat on this complex subject.

Jim Shankland
..!ucbvax!mtxinu!rtech!jas
..!ihnp4!pegasus!rtech!jas