Ethernet Performance (UNIX, TCP/IP)
Gunnar Lindberg
lindberg at chalmers.UUCP
Mon Oct 15 23:44:29 AEST 1984
Recently, there has been much discussion about Ethernet performance,
with several suggestions of where to find the bottleneck(s). However,
I have not seen any description of what REALLY happens inside UNIX
when TCP/IP/Ethernet is used. Therefore I tried the following:
    Hosts:    2 VAX 11/780, running UNIX 4.2 BSD, single user mode.
    Ethernet: 3COM, with standard 4.2 BSD driver. No other activity
              on the network.
A program in one host sent 2 Mbytes of data via TCP/IP/Ethernet
to a program in the other host. Both UNIX systems were analyzed
using the kernel profiler (KGMON, GPROF).
From what I've heard before, I expected to find the system spending
lots of time inside the TCP/IP/Ethernet code. However, the systems
were in IDLE most of the time:

    SEND       % of total time      RECV       % of total time
    "idle"          40 %            "idle"          35 %
    ecput           15 %            ecget           25 %
    in_cksum         4 %            in_cksum         4 %

The problem is not the amount of TCP/IP code that has to be run;
it is a matter of end-to-end flow control and lack of parallelism.
The scenario is as follows:

    SEND: write(2Mbytes);               RECV: loop until 2Mbytes
                                              read(2Mbytes);

    SEND fills buffers up to max
    size (2Kbytes), and gives
    that to TCP.

    TCP sends data on the net.
                                        TCP receives data and
                                        returns an acknowledgement.
                                        The RECV program is awakened
                                        with a 2Kbytes buffer.
    TCP receives the ack and
    releases SEND buffer space.

    However, since RECV has not         RECV consumes the data and
    yet consumed the data, SEND         releases buffer space.
    has to wait for buffer space
    to be released at RECV's host.
                                        TCP sends window information,
                                        telling SEND that he may send
                                        more data.
    TCP receives the window info
    and wakes SEND up.

    SEND fills next 2Kbytes...
    etc. etc. etc.

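
To make the cost of this scenario concrete, here is a minimal sketch in C of
the arithmetic behind it. It is only a model, not a measurement: the function
names (seesaw_cycles, seesaw_throughput) and the idea of a single fixed
"cycle time" per buffer are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical model: number of SEND/RECV see-saw cycles needed to move
 * total_bytes when only one buffer of buf_bytes is handed over per cycle.
 * Rounds up, so a final partial buffer still costs one cycle. */
long seesaw_cycles(long total_bytes, long buf_bytes)
{
    assert(total_bytes >= 0 && buf_bytes > 0);
    return (total_bytes + buf_bytes - 1) / buf_bytes;
}

/* Hypothetical model: upper bound on throughput (bytes per second) when
 * each cycle moves buf_bytes and takes cycle_seconds from "SEND fills
 * buffer" to "SEND is woken up again". */
double seesaw_throughput(double buf_bytes, double cycle_seconds)
{
    assert(buf_bytes > 0.0 && cycle_seconds > 0.0);
    return buf_bytes / cycle_seconds;
}
```

Taking 1 Mbyte as 1024 Kbytes, seesaw_cycles(2L * 1024 * 1024, 2048) gives
1024 cycles for the 2 Mbyte transfer, each of which includes the idle waits
shown in the scenario.
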
Of course, less overhead in the TCP/IP/Ethernet code would lead
to this "SEND/RECV see-saw" toggling faster, resulting in a higher
throughput. However, most of the system's active time is spent in
the 3COM interface routines (ecget/ecput), which read and write
memory on the 3COM board and are known to be slow. Using a DMA
interface instead should give better performance.
Unfortunately, performance will NOT be increased by use of a
dedicated network processor. A typical processor for such a task
would be a Motorola MC68000, which is MUCH slower than a VAX 11/780.
Therefore, unless we can reduce the amount of code that must be run
to implement TCP/IP, a network processor will DECREASE performance.
However, a network processor reduces the host processor's load,
which of course is a good thing, and makes it possible for the host
to consume data in one buffer, concurrently with the network processor
fetching data to the next buffer.
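
One rough way to weigh these two effects against each other is sketched below
in C. It assumes a simple two-stage model that is not taken from the
measurements: per buffer, t_net is the time spent on network handling and
t_consume is the time the host program needs to consume the data. All
function names are hypothetical.

```c
#include <assert.h>

/* Per-buffer time when the host does network handling and consuming
 * itself, one after the other. */
double serial_time(double t_net, double t_consume)
{
    assert(t_net >= 0.0 && t_consume >= 0.0);
    return t_net + t_consume;
}

/* Per-buffer time when a network processor fetches the next buffer while
 * the host consumes the current one: the slower stage sets the pace. */
double overlapped_time(double t_net, double t_consume)
{
    assert(t_net >= 0.0 && t_consume >= 0.0);
    return t_net > t_consume ? t_net : t_consume;
}

/* Non-zero if offloading pays off in this model. t_net_host is the
 * network handling time on the host itself, t_net_np the (possibly
 * longer) time on the network processor. */
int overlap_wins(double t_net_host, double t_net_np, double t_consume)
{
    return overlapped_time(t_net_np, t_consume)
         < serial_time(t_net_host, t_consume);
}
```

In this model a slower network processor only helps as long as its
per-buffer time stays below the host's combined time for both jobs; beyond
that point the overlap no longer compensates for the slower processor.
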
Now finally the question: What can we do?
1) An increase of "max buffer size" from 2K to 4/8K should
reduce scheduling overhead etc. This could be implemented
as an "ioctl" function in TCP, to be used by programs such
as "rcp" when both nodes are on the same net. I have not
tried this yet, but I plan to.
2) Design of a new protocol for usage on "reliable" local nets.
The Ethernet interface performs checksumming and drops all
erroneous packets, which means that data in a delivered packet
may be "trusted", i.e. no checksumming needed. Of course, a
LAN protocol does not need any inter-network code, although
using IP addresses would be an advantage (makes it simple to
check "same_net(addr1, addr2)" ).
3) Other suggestions?
Does anyone know which protocol SUN uses for disc transfers
(file system data and swapping)?
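
Suggestions 1) and 2) can be sketched together in C. This is only an
illustration under stated assumptions: same_net() here takes an explicit
netmask argument (the text only names same_net(addr1, addr2)), addresses
are IPv4 in host byte order, and set_tcp_buffers() is a hypothetical helper
that uses the socket options SO_SNDBUF and SO_RCVBUF instead of the ioctl
proposed above. Whether a given system honours the requested size is up to
that system.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: two IPv4 addresses (host byte order) are on the
 * same net if they agree in every bit covered by netmask. */
int same_net(uint32_t addr1, uint32_t addr2, uint32_t netmask)
{
    return ((addr1 ^ addr2) & netmask) == 0;
}

/* Hypothetical helper: ask for send and receive buffers of `bytes` on
 * socket fd. Returns 0 on success, -1 if either request fails. */
int set_tcp_buffers(int fd, int bytes)
{
    assert(bytes > 0);
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof bytes) < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof bytes) < 0)
        return -1;
    return 0;
}
```

A program such as "rcp" could call same_net() on the local and remote
addresses and, only when it returns non-zero, call set_tcp_buffers(fd, 8192)
before starting the transfer.
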
I am sorry for the length of this letter, but I have not seen any
discussion on UNIX-TCP/IP/Ethernet's internals before. If my
observations were obvious to everybody else, I apologize.
Gunnar Lindberg
Department of Computer Science
Chalmers University of Technology
S-412 96 Gothenburg
SWEDEN
..!mcvax!enea!chalmers!lindberg