Raw disk I/O

Mon Feb 26 01:47:31 AEST 1990

In article <1990Feb23.223030.9851 at ladc.bull.com> fmayhar at hermes.ladc.bull.com writes:
>}All the companies that do databases under Unix use raw partitions,
>}and one of the reasons is throughput is much higher.

And depending upon the application, some of them rarely use the raw
partition itself.  They use the block device to access data
on the disk partition so that they bypass the problems with triple
indirection.  The raw partition might only be used for special parts of 
the database (like backing it up, or some forms of sequential access).

>Well, we don't seem to have _Advanced_Unix_Programming_ around here.  I can
>say that we weren't doing anything special, just blocking I/O.  We got about
>40k-45k/s.  With block-mode I/O, we get from around 90 k/s to as much as
>200 k/s or a little more, depending on the test parameters.  Are you saying
>that we could exceed these figures using raw I/O?

In applications that I have written using raw I/O, I use a very large i/o
buffer (on the order of 300K or so) and have found that the performance
can be on the order of 10 times greater than the i/o performance when using
the same i/o buffer size and the block device.
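
As a rough sketch of the large-buffer approach described above (this is
an illustration in Python, not the code from those applications), the
loop below issues one big read request per call.  The function name, the
default chunk size constant and the path argument are assumptions made
for the example; on a real raw device the request size would also have
to be a multiple of the disk block size, which 300K is for both 512 and
1024 byte blocks.

```python
import os

# Roughly the 300K buffer mentioned above (an assumed, illustrative value).
CHUNK_SIZE = 300 * 1024


def read_in_large_chunks(path, chunk_size=CHUNK_SIZE):
    """Read `path` sequentially, one large request per read() call.

    Returns the total number of bytes read.  `path` may be an ordinary
    file or a device node; nothing here is specific to one driver.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            data = os.read(fd, chunk_size)
            if not data:        # an empty result means end of file/device
                break
            total += len(data)  # a real program would process `data` here
    finally:
        os.close(fd)
    return total

# Hypothetical usage; the device name below is made up:
#   read_in_large_chunks("/dev/rdsk/some_partition")
```

Running the same function against the block device and the raw device
for the same partition, with the same chunk size, is one way to make
the comparison above on a particular system.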

>One of the real problems is that we have to do this for multiple users.  It's
>not easy to justify two processes per user, nor did I want to bottleneck
>everything through a single I/O process.

When reading/writing large amounts of data through the block device the
overall system performance is usually dismal.  However, when the same amount
of i/o goes to the raw device, system performance is much less
affected, since there is no contention for disk buffers.

>I still fail to see how this is faster than letting the block-mode device
>driver do the same thing, though.  Plus, the device driver knows about the
>idiosyncracies of the hardware, and can take advantage of them, where my
>I/O process would have to be written for the lowest common denominator.

The problems with the block driver are:

	1. The data must be copied from disk to kernel memory and then to
	   user memory, adding an extra copy.
	2. The throughput is limited by the contention for free/available
	   block buffers in the kernel.  This is especially apparent for
	   the output side.

The problems with the raw disk driver are:

	1. No implicit sharing of data read from disk.
	2. I/O must be a multiple of disk block size (usually 512 or 1024
	   bytes).
	3. I/O is synchronous, so small writes must wait for the actual disk
	   i/o to complete. 
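
The block-size restriction in item 2 usually means rounding every
request up to a whole number of blocks and padding short writes.  A
minimal sketch in Python follows; the helper names are made up for the
example, and the default of 512 bytes is only one of the usual sizes
mentioned above, so check what your driver actually requires.

```python
def round_up_to_block(nbytes, block_size=512):
    """Return the smallest multiple of block_size that is >= nbytes."""
    if nbytes < 0 or block_size <= 0:
        raise ValueError("nbytes must be >= 0 and block_size must be > 0")
    return ((nbytes + block_size - 1) // block_size) * block_size


def pad_to_block(data, block_size=512):
    """Pad `data` (bytes) with zero bytes up to a whole number of blocks."""
    padded_len = round_up_to_block(len(data), block_size)
    return data + b"\0" * (padded_len - len(data))
```

The caller has to remember the real length of the data somewhere else,
since the padding is indistinguishable from data once it is on disk.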

So, in deciding which type of access to use, you must consider the
amount of data that will be flowing and the direction it will be flowing.

I use raw i/o when reading > 1 MB from a disk in large blocks (200K+).

Another thing to remember is that individual device drivers may or may 
not have a performance gain with raw i/o, but they usually do.
-- 
+-----------------------------------------------------------------------+
| Conor P. Cahill     uunet!virtech!cpcahil      	703-430-9247	!
| Virtual Technologies Inc.,    P. O. Box 876,   Sterling, VA 22170     |
+-----------------------------------------------------------------------+