Raw disk I/O
Conor P. Cahill
cpcahil at virtech.uucp
Mon Feb 26 01:47:31 AEST 1990
In article <1990Feb23.223030.9851 at ladc.bull.com> fmayhar at hermes.ladc.bull.com writes:
>}All the companies that do databases under Unix use raw partitions,
>}and one of the reasons is throughput is much higher.
And depending upon the application, some of them rarely use the raw
partition itself. They use the block device to access data
on the disk partition so that they bypass the file system's problems with
triple indirection. The raw partition might only be used for special parts of
the database (like backing it up, or some forms of sequential access).
>Well, we don't seem to have _Advanced_Unix_Programming_ around here. I can
>say that we weren't doing anything special, just blocking I/O. We got about
>40k-45k/s. With block-mode I/O, we get from around 90 k/s to as much as
>200 k/s or a little more, depending on the test parameters. Are you saying
>that we could exceed these figures using raw I/O?
In applications that I have written using raw I/O, I use a very large i/o
buffer (on the order of 300K or so) and have found that the performance
can be on the order of 10 times greater than the i/o performance when using
the same i/o buffer size and the block device.
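A rough way to check this on a given system is to time a sequential read
with the same large buffer against each device node. What follows is only a
minimal Python sketch, not the code behind the figures above. The function
name read_throughput is made up for illustration, and the device paths in
the note after the code are placeholders, since raw and block device names
differ between systems.

```python
import os
import time


def read_throughput(path, buf_size=300 * 1024):
    """Read `path` sequentially in buf_size chunks.

    Returns (bytes_read, elapsed_seconds). The default buf_size of 300K
    is an assumption taken from the buffer size mentioned above.
    """
    fd = os.open(path, os.O_RDONLY)
    total = 0
    start = time.perf_counter()
    try:
        while True:
            chunk = os.read(fd, buf_size)  # one large request per system call
            if not chunk:                  # empty result means end of data
                break
            total += len(chunk)
    finally:
        os.close(fd)
    elapsed = time.perf_counter() - start
    return total, elapsed
```

Calling it once on a block device and once on the matching raw device (say,
a hypothetical /dev/dsk/0s1 and /dev/rdsk/0s1 pair) with the same buf_size
gives two numbers that can be compared directly. The 300K default is a
multiple of both 512 and 1024, so it suits the usual disk block sizes.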
>One of the real problems is that we have to do this for multiple users. It's
>not easy to justify two processes per user, nor did I want to bottleneck
>everything through a single I/O process.
When reading/writing large amounts of data through the block device, the
overall system performance is usually dismal. However, when the same amount
of i/o goes to the raw device, system performance is much less impacted,
since there is no contention for the kernel's disk buffers.
>I still fail to see how this is faster than letting the block-mode device
>driver do the same thing, though. Plus, the device driver knows about the
>idiosyncrasies of the hardware, and can take advantage of them, where my
>I/O process would have to be written for the lowest common denominator.
The problems with the block driver are:
1. The data must be copied from disk to kernel memory and then to user
memory, adding an additional copy.
2. The throughput is limited by the contention for free/available
block buffers in the kernel. This is especially apparent on
the output side.
The problems with the raw disk driver are:
1. No implicit sharing of data read from disk.
2. I/O must be a multiple of disk block size (usually 512 or 1024
bytes).
3. I/O is synchronous, so small writes must wait for the actual disk
i/o to complete.
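The block-size restriction in item 2 usually means rounding each request up
to the next full block and padding the tail. A small Python sketch of that
arithmetic follows. The helper names round_up_to_block and pad_to_block are
hypothetical, and the 512-byte default is only one of the common sizes
mentioned above; the real value depends on the device.

```python
def round_up_to_block(nbytes, block_size=512):
    """Return the smallest multiple of block_size that is >= nbytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # Ceiling division using integer arithmetic only.
    return -(-nbytes // block_size) * block_size


def pad_to_block(data, block_size=512):
    """Pad a bytes object with zero bytes up to a whole number of blocks."""
    padded_len = round_up_to_block(len(data), block_size)
    return data + b"\0" * (padded_len - len(data))
```

For example, a 513-byte record on a device with 1024-byte blocks has to go
out as one 1024-byte write, and the reader has to know how much of that
block is real data.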
So, in determining which type of access to use you must determine the
amount of data that will be flowing and the direction it will be flowing.
I use raw i/o when reading > 1 MB from a disk in large blocks (200K+).
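Written down as code, that rule of thumb might look like the Python sketch
below. The name prefer_raw and the two constants are made up here; they
simply encode the 1 MB and 200K figures above, plus the block-size
restriction on raw i/o, and are a starting point rather than tuned limits.

```python
MIN_TOTAL_BYTES = 1024 * 1024   # "> 1 MB" of data overall
MIN_CHUNK_BYTES = 200 * 1024    # "large blocks (200K+)" per request


def prefer_raw(total_bytes, chunk_bytes, block_size=512):
    """Heuristic only: True when the raw device looks like the better choice."""
    return (
        total_bytes > MIN_TOTAL_BYTES
        and chunk_bytes >= MIN_CHUNK_BYTES
        and chunk_bytes % block_size == 0  # raw i/o needs whole disk blocks
    )
```

So a 2 MB sequential read in 300K requests qualifies, while the same read
in 8K requests, or a 512K read at any request size, does not.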
Another thing to remember is that individual device drivers may or may
not have a performance gain with raw i/o, but they usually do.
--
+-----------------------------------------------------------------------+
| Conor P. Cahill uunet!virtech!cpcahil 703-430-9247 |
| Virtual Technologies Inc., P. O. Box 876, Sterling, VA 22170 |
+-----------------------------------------------------------------------+
More information about the Comp.unix.wizards
mailing list