SCSI hiding geometry

Matt Jacob mjacob at wonky.Sun.COM
Thu Mar 15 17:06:53 AEST 1990


[ Sorry- my machine was down for a couple of days so I am late in responding
  to this.. ]

>...
>> My own personal opinion is that geometry based filesystems are
>> getting to be a bad microoptimization...
>
>But SCSI is not the only interface around, and I think there are some open
>questions about how much device-sensitivity you want in the mid level of
>the file/disk system.  That is, if you've got a more traditional disk
>interface (some of which are pretty high performance) you need to deal with
>geometry.  Do you want to ignore geometry some of the time?  It gets harder
>and harder to know how/where to make the cut.
>
>(My own personal opinion, not necessarily well substantiated, is that SCSI
>was at best premature, and at worst wrong, in trying to hide drive geometry
>from the host system.)
>

Ah, but SCSI wasn't premature- it was/is an extension of the IBM channel
concept to smaller lower-cost machines.

Granted, more 'traditional' disk interfaces need and should allow the
main CPU to know and take advantage of disk geometry. However, the
256-512kb of code to handle the 4.3 filesystem can be considered *wasted*
main CPU cycles if you can offload the processing.

>>...With the coming of SCSI-2
>> multiple command targets, it seems to me that one should just
>> concentrate on getting requests out to the target as quickly
>> as possible and let the microprocessor on the drive figure out
>> the best order to do them in.
>
>This raises a sticky issue of who's in control of the disk system.
>Consider reliability issues.  Two examples come to mind.  First, in a UNIX
>file system, you probably want to have some control over the order of
>operations so that you can have some reasonable assurance that operations
>on inodes, indirect blocks, directories, and data happen in a way that will
>allow you a good chance for recovery if you crash while there are
>operations in the queue.  Second, in a database it is essential that you be
>able to control the sequencing of operations so that commits really commit,
>journaling happens when you expect, etc.

There are quite adequate mechanisms in SCSI to handle this (e.g., the *real*
use of linked commands, which provide means for specifying atomic operations
w.r.t. multiple sets of i/o from a single initiator).
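
To make the linked-command mechanism concrete, here is a minimal sketch of
setting the Link bit in a chain of 6-byte WRITE CDBs. The function names
(build_write6, build_write_chain) are hypothetical, not from any real
driver; the CDB layout follows the standard 6-byte command format (opcode,
21-bit LBA, transfer length, control byte with Link in bit 0). How strictly
a given target treats a linked chain as a unit is device-dependent, so
treat this as an illustration of the encoding only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SCSI_WRITE6   0x0A  /* WRITE(6) operation code */
#define CDB_CTRL_LINK 0x01  /* control byte, bit 0: Link */

/* Fill a 6-byte WRITE(6) CDB (hypothetical helper). A nonzero `link`
 * sets the Link bit, telling the target that another command from this
 * initiator follows in the same chain. */
static void build_write6(uint8_t cdb[6], uint32_t lba, uint8_t nblocks,
                         int link)
{
    assert(lba < (1u << 21));          /* WRITE(6) carries a 21-bit LBA */
    memset(cdb, 0, 6);
    cdb[0] = SCSI_WRITE6;
    cdb[1] = (uint8_t)((lba >> 16) & 0x1F);  /* LBA bits 20..16 (LUN = 0) */
    cdb[2] = (uint8_t)((lba >> 8) & 0xFF);   /* LBA bits 15..8 */
    cdb[3] = (uint8_t)(lba & 0xFF);          /* LBA bits 7..0 */
    cdb[4] = nblocks;                        /* 0 would mean 256 blocks */
    cdb[5] = link ? CDB_CTRL_LINK : 0;
}

/* Build n one-block writes; every CDB except the last has Link set. */
static void build_write_chain(uint8_t cdbs[][6], const uint32_t *lbas, int n)
{
    for (int i = 0; i < n; i++)
        build_write6(cdbs[i], lbas[i], 1, i < n - 1);
}
```

An initiator would send cdbs[0] first; the target then requests the next
command in the chain until it reaches the one with Link clear.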

It is true that Unix itself does not provide good hooks for reliability
or database sequencing, but to criticize SCSI for allowing you to do
things your OS can't handle well to begin with is the tail wagging the
dog.

>
>Frankly, I don't want to trust J Random Microcoder to give a disk-write-
>reordering algorithm that won't screw things up.  Even if I'm assured of
>some sort of "fair" algorithm, trying to sequence things in the kernel to
>compensate for all the possible variants of reordering sounds like a pain.
>(It's also redundant in a perverse way:  You have to write code to un-do
>decisions which are going to be made for you that you don't want.)
>

Now this is a valid point, in a way. I've gone over this issue in several
different contexts (having been a microcoder in my dim past). In the
case where you have more than one decision maker, *one* must make the
decisions as to optimal i/o ordering, etc., else chaos results.

In the case of distributed I/O subsystems (SCSI or otherwise), I have
found that you *have* to do things like *not* disksort on the stub cpu
side of things. If you have the BSD filesystem, you *must* specify
things like 0 rotational delay, etc., in order to *not* have the
filesystem and the i/o subsystem cancel each other out.
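
As a rough sketch of "not disksorting on the host side": a queueing
routine can pass requests through in arrival order when the target does
its own ordering, and sort only when it does not. Everything here is
hypothetical (struct ioreq, struct ioqueue, the target_reorders flag), and
the sorted branch is a plain ascending-block insertion, simpler than the
one-way elevator that the BSD disksort routine implements.

```c
#include <stddef.h>

struct ioreq {
    unsigned long blkno;     /* starting block of the request */
    struct ioreq *next;
};

struct ioqueue {
    struct ioreq *head;
    int target_reorders;     /* hypothetical: nonzero if the drive or
                                controller orders its own queue */
};

/* Add a request to the queue. If the target reorders, append in arrival
 * order (FIFO) and leave scheduling to it; otherwise insert in ascending
 * block order on the host. */
static void enqueue(struct ioqueue *q, struct ioreq *r)
{
    struct ioreq **pp = &q->head;

    if (q->target_reorders) {
        while (*pp != NULL)                      /* walk to the tail */
            pp = &(*pp)->next;
    } else {
        while (*pp != NULL && (*pp)->blkno <= r->blkno)
            pp = &(*pp)->next;                   /* find sorted position */
    }
    r->next = *pp;
    *pp = r;
}
```

The point of the flag is that exactly one side makes the ordering
decision; with both sides sorting, each undoes assumptions the other made.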

Ideally, one would like a filesystem to form requests that have
precedence, priority, and cache-retention parameters. That is, the
filesystem associates with each piece of data it wants transferred loose
statements like:

	"Write this *NOW*"

	"Write this, and hang on to it, 'coz I'll likely ask for it back soon."

	"Write this *before* Reading *that*"

and so on. I feel that we (as in the Unix commercial marketplace) are
very far from that (flame on, everyone!)....
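
One way such request annotations could look, purely as a sketch: a request
structure carrying hint flags, a priority, and an optional ordering
dependency. All names here (struct io_request, IO_WRITE_NOW, IO_RETAIN,
after_id, may_issue) are invented for illustration and do not correspond
to any existing Unix or SCSI interface.

```c
#include <stddef.h>
#include <stdint.h>

enum io_hint {
    IO_WRITE_NOW = 1 << 0,   /* "Write this *NOW*": do not defer */
    IO_RETAIN    = 1 << 1    /* "hang on to it": keep a cached copy */
};

struct io_request {
    uint32_t id;         /* caller-assigned request id, nonzero */
    uint32_t blkno;      /* starting block */
    int      is_write;   /* nonzero for write, zero for read */
    unsigned hints;      /* bitwise OR of enum io_hint values */
    uint32_t after_id;   /* 0 = unordered; otherwise this request must
                            not start until request `after_id` is done */
    uint8_t  priority;   /* larger = more urgent (an assumption) */
};

/* Return nonzero if `r` may be issued, given the ids of the requests
 * that have already completed. */
static int may_issue(const struct io_request *r,
                     const uint32_t *done, size_t ndone)
{
    if (r->after_id == 0)
        return 1;                       /* no ordering constraint */
    for (size_t i = 0; i < ndone; i++)
        if (done[i] == r->after_id)
            return 1;                   /* dependency has completed */
    return 0;
}
```

"Write this *before* Reading *that*" then becomes a read whose after_id
names the write; whichever layer owns the queue is free to reorder
everything else.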


>I think it would make the job of kernel folks a lot easier if they could
>deal with interfaces which just attempt to be fast in a predictable way,
>instead of trying to be smart.

For about two years at Sun, I had posted on my office door
a one-page printout (well, it was small font) entitled
"The Ideal and Perfect Driver". It was for the PDP-11 RK05
removable 2.5mb drive.

Also, I have kicking around at home a 200-odd word pdp-11 assembler
language rm03 driver I wrote for RT-11.

These are *very* simple. Unfortunately, I have not been able to
beg, plead, extort, bribe, or otherwise convince hardware engineers
to take such simple interfaces and run them up to a decent speed.
Ergo, complexity in s/w has been a natural result.

-matt


