MBUF problems
Charlie C. Kim
cck%cucca.columbia.arpa at BRL.ARPA
Tue Feb 18 04:58:41 AEST 1986
SYSTEM type: VAX + DEQNA
OPERATING System: 4.2BSD and Ultrix 1.1
GENERAL area: MBUF handling.
PROBLEM: panic: trap type 9 (Protection fault) in ip_output.
DIAGNOSIS: ip_output is called with a bogus mbuf by ip_forward.
ANALYSIS: This results because a basic assumption of the dtom macro is
violated. dtom assumes that mbufs are aligned on 128 byte boundaries
and that all the data is contained in those 128 bytes. For most mbufs
this is true, and it is valid to simply mask off the low order bits.
Unfortunately, for mbufs whose data is in the page pool, the data is
not in the same 128 bytes. (As a matter of fact, the data pointer
points to some virtual address behind the mbuf.)
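
A minimal user-space sketch of the failure described above, not the
kernel source: the struct layout, the static buffers and the helper
names (make_inline_mbuf, make_cluster_mbuf, dtom_roundtrips) are
assumptions made for illustration. Only the 128-byte masking idea and
the offset-based data pointer come from the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSIZE   128     /* mbuf size, as stated in the report */
#define CLBYTES 1024    /* size of one page-pool cluster on the VAX */

struct mbuf;

/* Simplified header; the real kernel layout differs. */
struct m_hdr {
    struct mbuf *m_next;
    intptr_t     m_off;   /* offset from the mbuf's start to its data */
    short        m_len;
    short        m_type;
};

struct mbuf {
    struct m_hdr  m_h;
    unsigned char m_dat[MSIZE - sizeof(struct m_hdr)];
};
_Static_assert(sizeof(struct mbuf) == MSIZE, "mbuf must be MSIZE bytes");

/* mbuf to data, and data back to mbuf by masking the low-order bits. */
#define mtod(m, t) ((t)((uintptr_t)(m) + (uintptr_t)(m)->m_h.m_off))
#define dtom(x)    ((struct mbuf *)((uintptr_t)(x) & ~(uintptr_t)(MSIZE - 1)))

static _Alignas(MSIZE)   struct mbuf   inline_mbuf;
static _Alignas(MSIZE)   struct mbuf   cluster_mbuf;
static _Alignas(CLBYTES) unsigned char cluster[CLBYTES];  /* stand-in pool page */

/* An ordinary mbuf: the data lives inside the same 128 bytes. */
struct mbuf *make_inline_mbuf(void)
{
    struct mbuf *m = &inline_mbuf;
    assert(((uintptr_t)m & (MSIZE - 1)) == 0);  /* dtom's alignment assumption */
    m->m_h.m_off = (intptr_t)offsetof(struct mbuf, m_dat);
    return m;
}

/* A page-pool mbuf: m_off points outside the mbuf, into the cluster. */
struct mbuf *make_cluster_mbuf(void)
{
    struct mbuf *m = &cluster_mbuf;
    m->m_h.m_off = (intptr_t)((uintptr_t)cluster - (uintptr_t)m);
    return m;
}

/* Returns 1 if dtom() recovers m from m's own data pointer, else 0. */
int dtom_roundtrips(struct mbuf *m)
{
    return dtom(mtod(m, unsigned char *)) == m;
}
```

In this model dtom_roundtrips() succeeds for the inline mbuf and fails
for the cluster one, because masking the cluster address yields an
address inside the cluster rather than the mbuf header.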
On a VAX, this would not be seen very often since pages in the pool
generally don't get used until the data size exceeds CLBYTES
(CLSIZE*NBPG = 1024). Another mitigating factor is the fact that the
private page pool is only used "when copying data from a user process
into the kernel, and when bringing data in at the hardware level".
In particular, the Ultrix 1.1 DEQNA driver uses the private page pool.
(The DEUNA and other devices may also be affected.)
CURE: I'm not sure there is an easy cure, but I've outlined some
options below, in what I think is best-to-worst order. Hopefully, this
is fixed in 4.3BSD or Ultrix 1.2.
1. Fix dtom. This may not be easy. Though we can check
whether the data pointer is in the data pool or in the mbuf space, it
may be difficult to trace back pointers in the data pool. Basically,
the problem arises from the fact that copies are made by duplicating
the page entries instead of copying the data. To handle traceback, we
would need some page table which told us which mbuf a page was
associated with; if more than one mbuf could be associated with a data
page, then things would be a real hassle.
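
One way option 1 might look, as a rough sketch only: a per-page owner
table consulted when the data pointer falls inside the pool. All names
here (pool, pool_owner, pool_set_owner, pool_page, demo_mbuf,
dtom_checked) and the pool size are hypothetical, and the sketch
assumes exactly one owning mbuf per page, which is precisely the
assumption that may not hold when pages are shared between copies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSIZE   128
#define CLBYTES 1024
#define NCLUST  4                  /* hypothetical pool size */

/* Opaque 128-byte stand-in for the kernel's struct mbuf. */
struct mbuf { unsigned char m_bytes[MSIZE]; };

static _Alignas(CLBYTES) unsigned char pool[NCLUST][CLBYTES];
static struct mbuf *pool_owner[NCLUST];   /* pool page index -> owning mbuf */
static _Alignas(MSIZE) struct mbuf demo_hdr;

struct mbuf *demo_mbuf(void)  { return &demo_hdr; }
void *pool_page(int i)        { return pool[i]; }
void pool_set_owner(int i, struct mbuf *m) { pool_owner[i] = m; }

/* Is p an address inside the page pool? */
static int in_pool(const void *p)
{
    uintptr_t a = (uintptr_t)p, base = (uintptr_t)pool;
    return a >= base && a < base + sizeof pool;
}

/* dtom with a pool check: table lookup for pool data, masking otherwise. */
struct mbuf *dtom_checked(const void *p)
{
    if (in_pool(p)) {
        size_t idx = ((uintptr_t)p - (uintptr_t)pool) / CLBYTES;
        return pool_owner[idx];    /* NULL if no owner was recorded */
    }
    return (struct mbuf *)((uintptr_t)p & ~(uintptr_t)(MSIZE - 1));
}
```

The table costs one pointer per pool page and must be kept up to date
wherever pages are attached to or released from mbufs; supporting
several mbufs per page would need a list per entry instead.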
2. Drop usage of dtom where necessary. This would require
careful rewrites of portions of the code. This could be done by
dropping all usages of dtom or by tracing back where dtom could be
used. Some of this is easy, but some nontrivial rewriting would
definitely be necessary. For example, in ip_input, the IP reassembly
code would have to be reworked.
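
The general shape of option 2 can be sketched as follows: record the
owning mbuf at the point where it is still known, rather than
recovering it later from the data address. This is an illustration
under assumptions, not the 4.2BSD reassembly code; struct frag, its
fields and the helper functions are hypothetical, and struct mbuf is
reduced to the fields the example needs.

```c
#include <stddef.h>

/* Reduced, hypothetical mbuf: data may live outside the header. */
struct mbuf {
    struct mbuf   *m_next;
    unsigned char *m_data;
    int            m_len;
};

/* Hypothetical queue entry that carries the mbuf beside the data. */
struct frag {
    struct frag   *f_next;
    struct mbuf   *f_m;      /* owner, stored explicitly */
    unsigned char *f_data;   /* what the code would otherwise pass to dtom */
};

/* Link a fragment onto the queue, remembering which mbuf holds it. */
void frag_enqueue(struct frag **head, struct frag *f, struct mbuf *m)
{
    f->f_m    = m;
    f->f_data = m->m_data;
    f->f_next = *head;
    *head     = f;
}

/* Recover the mbuf from the stored pointer; no address masking. */
struct mbuf *frag_mbuf(const struct frag *f)
{
    return f->f_m;
}

/* Returns 1 if the mbuf is recovered even though its data is elsewhere. */
int demo_frag_roundtrip(void)
{
    static unsigned char page[1024];          /* stand-in pool page */
    struct mbuf m = { NULL, page, 64 };
    struct frag f, *queue = NULL;

    frag_enqueue(&queue, &f, &m);
    return queue == &f && frag_mbuf(queue) == &m && queue->f_data == page;
}
```

The cost is an extra pointer per queued entry and a change to every
caller that currently hands around a bare data pointer, which is where
the nontrivial rewriting comes from.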
3. Drop usage of the private page pool. This is undesirable,
though there is no reason why it shouldn't work.
Charlie C. Kim
User Services
Center for Computing Activities
Columbia University
New York, NY 10025