DMA (was: Re: *nix performance)
Brian Cuthie
brian at umbc3.UMD.EDU
Fri Oct 28 03:53:33 AEST 1988
[as I slip into my asbestos suit]
I have decided to respond to several postings with this single response
rather than tie up net bandwidth with followups to followups etc.
First, I humbly apologize for two things: 1) for coming off as omniscient
about the AT. I have, in past years, designed several disk controllers for
the PC and written suitable drivers that used DMA. 2) For incorrectly
extrapolating PC expertise to cover the design of the AT. Some of my
statements about DMA on the AT were clearly wrong, as I made some bad
assumptions about that particular part of the AT design.
However, most of my points about DMA are true in the general case. It is
true, as I found after pawing through the AT technical reference manual,
that the AT has some serious deficiencies in its DMA design. This, however,
does not change the fundamental reasons for using DMA in most systems.
The Intel 80286/80386 processors have the unique ability to behave much like
a DMA controller. That is, they can transfer data in single memory cycles
(please note that a memory cycle is not the same as a clock cycle). In this
mode, using the string transfer instructions, the 80*86 is capable of
generating address and timing signals without placing data on the bus. Thus
the peripheral or memory is free to drive the data bus directly to the
recipient of the data (memory for INS instructions, and peripheral for OUTS
instructions). There are some instances when DMA controllers will buffer
data; however, these are rare. Data usually flows between the peripheral and
memory in single memory cycles unless, of course, the peripheral's
controller cannot transfer data at memory speeds (unlikely since most
peripheral controllers have some buffer cache).
Normally, however, a processor does not have this ability. Thus, to transfer
a block from a peripheral to memory requires that the processor read a
byte/word from the peripheral and subsequently write that byte/word to
memory. This operation, even under the best of caching scenarios, requires
at least two memory accesses per byte/word transferred. It can be seen,
then, that a processor lacking this special ability could never be as fast
as a well-designed DMA subsystem.
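
A rough way to see the factor of two is simply to count bus accesses per
word moved. The sketch below is only a counting model under that assumption,
not driver code; the function names are made up for illustration, and
instruction fetches are ignored (which only makes the CPU case look better
than it is).

```c
#include <stddef.h>

/* Programmed I/O: the CPU reads each word from the peripheral and then
 * writes it to memory, so every word costs two bus accesses. */
size_t pio_bus_accesses(size_t words)
{
    return 2 * words;
}

/* Fly-by DMA: the peripheral drives the data bus while memory latches the
 * word in the same cycle, so every word costs one bus access. */
size_t dma_bus_accesses(size_t words)
{
    return words;
}
```

For a 512-byte sector moved as 256 16-bit words (sizes chosen only as an
example), this gives 512 accesses for the CPU loop against 256 for DMA.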
DMA controllers seize the bus by placing the CPU in a HOLD state. In this
state, the CPU is not able to perform any external bus accesses. Instead,
all address and timing information is generated by the DMA controller. When
the DMA controller has placed the CPU into a HOLD state, and has asserted
the appropriate address onto the address bus, it asserts either MEMREAD (for
a transfer from memory to the peripheral) or MEMWRITE (to transfer from a
peripheral to memory). The device which has requested DMA recognizes these
signals in conjunction with the DMA ACK signal, and data is transferred over
the data bus directly between the peripheral and memory with no intermediate
lay-overs.
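
The sequence above, for the peripheral-to-memory direction, can be written
down as a toy model. This is a sketch of the ordering of events only; the
struct, its field names and the function name are hypothetical, and real
hardware does this with signal timing rather than assignments.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the bus signals involved in one DMA cycle. */
struct bus {
    bool hold;       /* CPU is held off the bus */
    bool dack;       /* DMA ACK to the requesting device */
    bool memwrite;   /* MEMWRITE strobe to memory */
    size_t address;  /* address driven by the DMA controller */
};

/* One fly-by transfer from a peripheral to memory. */
void dma_write_cycle(struct bus *b, uint8_t *memory, size_t addr,
                     uint8_t device_data)
{
    b->hold = true;       /* 1. controller places the CPU in HOLD */
    b->address = addr;    /* 2. controller drives the address bus */
    b->dack = true;       /* 3. controller asserts DMA ACK ... */
    b->memwrite = true;   /*    ... together with MEMWRITE */

    /* 4. the device drives the data bus and memory latches it directly;
     *    the data never passes through the CPU or the controller. */
    memory[b->address] = device_data;

    b->memwrite = false;  /* 5. strobes are dropped and the bus released */
    b->dack = false;
    b->hold = false;
}
```

Note that the controller supplies only the address and the strobes; the
data byte goes straight from the device to memory.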
It can be seen that during this transfer, the CPU will remain idle, once it
has completed its current instruction, until it can regain control of the
bus. Therefore, most DMA controllers offer the ability to generate limited
burst DMA transfers. The Intel 8237 is limited to either single transfers or
complete block transfers. Other DMA controllers, such as the Motorola 68445
(I believe that is the correct part number), allow the burst length to be
programmed over a wider range. Limiting the burst length allows some
interleaving of CPU and DMA memory accesses.
Interleaving CPU and DMA access to memory is usually less desirable than
complete block transfers since there is substantial overhead in placing the
CPU into a HOLD state. This problem can be solved by multiported memory
designs. However, since processor speeds outstrip memory speeds (that is,
as CPUs get faster, they spend more time waiting for memory), there is little
advantage to this scheme.
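
The cost of short bursts can be put in rough numbers. The sketch below
assumes a fixed overhead, in bus cycles, each time the CPU is placed into
HOLD, plus one cycle per word moved; the function names and the overhead
figure used in the example afterwards are assumptions, not measurements from
any real machine.

```c
#include <stddef.h>

/* Number of times the controller must seize the bus to move `words`
 * words in bursts of at most `burst_len` words. A burst_len of 0 is
 * treated here as "whole block in one burst". */
size_t dma_bursts(size_t words, size_t burst_len)
{
    if (words == 0)
        return 0;
    if (burst_len == 0)
        burst_len = words;
    return (words + burst_len - 1) / burst_len;  /* round up */
}

/* Total bus cycles: one per word, plus the HOLD overhead per burst. */
size_t dma_total_cycles(size_t words, size_t burst_len,
                        size_t hold_overhead)
{
    return words + dma_bursts(words, burst_len) * hold_overhead;
}
```

With an assumed overhead of 4 cycles per HOLD and a 256-word block, single
transfers cost 1280 cycles, bursts of 16 cost 320, and one complete block
transfer costs 260. The CPU gets at the bus more often with short bursts,
but the total time the bus is tied up goes up sharply.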
In summary, DMA is used primarily because, in a well-designed system, it can
almost always be made to be more than twice as fast as the CPU in doing
peripheral-to-memory transfers. However, memory bandwidth is limited and
thus you must rob Peter to pay Paul, so the idea that DMA allows concurrent
CPU and peripheral access to memory is somewhat misleading.
-brian