reference about mbufs

Chris Torek chris at mimsy.umd.edu
Sun Jul 22 05:40:02 AEST 1990


In article <1990Jul5.175406.22944 at Neon.Stanford.EDU> david at Neon.Stanford.EDU
(David M. Alexander) writes:
>Does anyone know a book or article that discusses the mbuf structure
>and the functions and macros to manipulate them?

Hmm... do you mean `4.1BSD BBNNET mbufs', `4.2BSD mbufs', `4.3BSD mbufs',
`4.3BSD-tahoe mbufs', or `4.3BSD-reno mbufs'?  They are all different
(and probably differ from various Ultrix mbufs and maybe also 4.1a, b,
and c mbufs, but I never saw 4.1[abc]; now that SunOS has STREAMS one
would hope the kernel group settled on one kind of memory allocator as
well...).

4.3-tahoe mbufs are probably the easiest to explain:

struct mbuf {
	struct	mbuf *m_next;		/* next buffer in chain */

This links together mbufs that make up one (packet/group/whatever) so that
the amount of data in a data-chunk can be bigger than the maximum size of
a single mbuf.

	u_long	m_off;			/* offset of data */

This gives the offset from the base of the mbuf (the address of the entire
`struct mbuf') to the data.  For `normal' mbufs the data are somewhere in
m_dat[].  For `big' mbufs (`mclusters') the data are in a separate `page'
(typically 1Kbyte, i.e., not necessarily a hardware page) and the offset
is large (>= sizeof(struct mbuf)).

	short	m_len;			/* amount of data in this mbuf */

Thus the length of a complete packet is the sum of the lengths of all the
mbufs found via m_next's.

	short	m_type;			/* mbuf type (0 == free) */

One of the magic type codes (the MT_* constants; 0 means the mbuf is free).

	u_char	m_dat[MLEN];		/* data storage */

Up to MLEN (112) bytes of data.

	struct	mbuf *m_act;		/* link in higher-level mbuf list */

Various uses.  Mainly for datagram protocols: several packets are linked
together via m_act pointers.  Conceptually, following m_next pointers
`assembles' each packet, while following m_act pointers `lists' each
packet.  The m_act pointers are set only in the `top' mbufs:

	--------
  socket buffer: so->so_sb.sb_rcv
	--------
	    | sb_mb
	    v
	+-------+ m_act	+-------+ m_act	+-------+ m_act
	| pkt 1	|------>| pkt 2 |------>| pkt 3 |--->nil
	+-------+	+-------+	+-------+
	    | m_next	    | m_next	    | m_next
	    v		    v		    v
	+-------+	+-------+	+-------+
	|	|	|	|	|	|
	+-------+	+-------+	+-------+
	    | m_next	    | m_next	    | m_next
	+-------+	   nil		+-------+
	|	|			|	|
	+-------+			+-------+
	    | m_next			    | m_next
	   nil				   nil

};
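
To make the two kinds of link concrete, here is a small userland sketch
(not kernel code).  It assumes a private copy of the struct above with
MLEN taken as 112; pkt_length and pkt_count are hypothetical helper names,
not anything in the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define MLEN 112	/* from the "112 bytes" figure above */

/* Userland copy of the struct shown above, NOT the kernel header. */
struct mbuf {
	struct mbuf *m_next;
	unsigned long m_off;
	short m_len;
	short m_type;
	unsigned char m_dat[MLEN];
	struct mbuf *m_act;
};

/* pkt_length: hypothetical helper; sums m_len down one m_next chain. */
int pkt_length(const struct mbuf *m)
{
	int len = 0;

	for (; m != NULL; m = m->m_next) {
		assert(m->m_len >= 0);
		len += m->m_len;
	}
	return len;
}

/* pkt_count: hypothetical helper; counts `top' mbufs along m_act. */
int pkt_count(const struct mbuf *top)
{
	int n = 0;

	for (; top != NULL; top = top->m_act)
		n++;
	return n;
}
```

Given the head of a socket buffer's list, pkt_count walks across the top
row of the picture and pkt_length walks down one column.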

functions/macros:

	MGET(m, waitflag, type)
Sets `m' to point to a new mbuf of type `type'.  waitflag is either
M_DONTWAIT (the caller cannot sleep; m may then be set to nil) or M_WAIT
(the caller can sleep; m will never be nil).

	MCLALLOC(m, i)
Gets `i' mclusters (i must be 1).  Never waits; sets m to nil if there are
none.

	M_HASCL(m)
True iff m is an mcluster rather than a regular (tiny) mbuf.

	MTOCL(m)
Gets base of cluster page given an mcluster.

	MCLGET(m)
Changes m from a regular mbuf to an mcluster, if there is space.  If
not, leaves m a regular mbuf.  m->m_len is set to MCLBYTES on success,
or MLEN on failure (so, e.g., `M_HASCL' will tell whether it succeeded).

	MCLFREE(m)
Puts m on the mcluster free list.

	MFREE(m, n)
Puts m on the free list; sets n to what m->m_next used to be.  To free
a chain you could use
	while (m) { MFREE(m, n); m = n; }
Automatically knows when to use MCLFREE.
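
A rough userland picture of what that loop does, assuming the free list
is a simple stack linked through m_next.  The names free_list,
MFREE_SKETCH and free_chain are made up for illustration; the real macro
also deals with mclusters and interrupt priority, which this sketch
leaves out.

```c
#include <assert.h>
#include <stddef.h>

#define MT_FREE 0	/* "0 == free", per the m_type comment */

/* Trimmed userland stand-in: only the fields this sketch touches. */
struct mbuf {
	struct mbuf *m_next;
	short m_type;
};

struct mbuf *free_list;	/* hypothetical name for the free list head */

/* Sketch of MFREE(m, n) for a regular mbuf: n gets the old m_next. */
#define MFREE_SKETCH(m, n) do {			\
	assert((m)->m_type != MT_FREE);		\
	(n) = (m)->m_next;			\
	(m)->m_type = MT_FREE;			\
	(m)->m_next = free_list;		\
	free_list = (m);			\
} while (0)

/* Free a whole chain with the loop shown above; returns mbufs freed. */
int free_chain(struct mbuf *m)
{
	struct mbuf *n;
	int count = 0;

	while (m) {
		MFREE_SKETCH(m, n);
		m = n;
		count++;
	}
	return count;
}
```

Note that n must be fetched before m is pushed on the free list, since
the push overwrites m->m_next.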

	struct mbuf *m_get(int waitflag, int type);
Returns a new mbuf, exactly like MGET except incurring a function call and
using less space.

	struct mbuf *m_getclr(int waitflag, int type);
Returns a new mbuf like m_get, but zeroes out all the data.

	struct mbuf *m_free(struct mbuf *m);
Puts m on the free list like MFREE; returns the old m->m_next.

	struct mbuf *m_more(int waitflag, int type);
Internal use (for MGET).

	struct mbuf *m_copy(struct mbuf *m, int off, int len);
Copies the data from the mbuf chain headed by `m' into new mbufs
(so that it can be modified without affecting other users of the
same data).  Skips the first `off' bytes of data; copies at most
`len' bytes.  Thus, to copy no more than 32 bytes from the chain
headed by `m', after skipping over the first 4 bytes, use
	mcopy = m_copy(m, 4, 32);
A `len' of M_COPYALL means `copy until end of chain'.
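
The skip-then-copy behaviour can be sketched in userland roughly as
follows.  This is an illustration under assumptions, not the kernel
routine: copy_sketch is a hypothetical name, malloc'ed memory stands in
for the mbuf allocator, every mbuf is a regular one (no mclusters), and
M_COPYALL is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MLEN 112	/* assumed, from the "112 bytes" figure */

/* Userland copy of the struct shown earlier, NOT the kernel header. */
struct mbuf {
	struct mbuf *m_next;
	unsigned long m_off;
	short m_len;
	short m_type;
	unsigned char m_dat[MLEN];
	struct mbuf *m_act;
};

#define mtod(m, t)	((t)((char *)(m) + (m)->m_off))

/* Copy at most `len' bytes of chain `m', skipping the first `off'. */
struct mbuf *copy_sketch(struct mbuf *m, int off, int len)
{
	struct mbuf *top = NULL, **tail = &top, *c;
	int n;

	assert(off >= 0 && len >= 0);
	/* Skip mbufs that lie entirely within the first `off' bytes. */
	while (m != NULL && off >= m->m_len) {
		off -= m->m_len;
		m = m->m_next;
	}
	while (m != NULL && len > 0) {
		n = m->m_len - off;		/* bytes left in this mbuf */
		if (n > len)
			n = len;
		if ((c = calloc(1, sizeof *c)) == NULL) {
			while (top != NULL) {	/* undo the partial copy */
				c = top->m_next;
				free(top);
				top = c;
			}
			return NULL;
		}
		c->m_off = offsetof(struct mbuf, m_dat);
		c->m_len = (short)n;
		c->m_type = m->m_type;
		memcpy(mtod(c, char *), mtod(m, char *) + off, (size_t)n);
		*tail = c;			/* append to the new chain */
		tail = &c->m_next;
		len -= n;
		off = 0;			/* only the first mbuf is offset */
		m = m->m_next;
	}
	return top;
}
```

With a chain holding "abcdef" then "ghijkl", copy_sketch(m, 4, 5) yields
two new mbufs holding "ef" and "ghi".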

	struct mbuf *m_pullup(struct mbuf *m, int len);
`Pulls' a minimum of `len' bytes of data into the first mbuf in
the chain, possibly replacing the chain (as if via m_copy) in the
process.  Used to force entire packet headers into a single mbuf.
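
One way to picture the pulling, as a userland sketch under assumptions:
pullup_sketch and mb_new are hypothetical names, malloc stands in for
the mbuf allocator, and only regular mbufs are handled.  The sketch pulls
exactly `len' bytes and, on failure, frees the chain and returns nil,
which is how I recall the real routine behaving.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MLEN 112	/* assumed, from the "112 bytes" figure */

/* Userland copy of the struct shown earlier, NOT the kernel header. */
struct mbuf {
	struct mbuf *m_next;
	unsigned long m_off;
	short m_len;
	short m_type;
	unsigned char m_dat[MLEN];
	struct mbuf *m_act;
};

#define mtod(m, t)	((t)((char *)(m) + (m)->m_off))

/* mb_new: hypothetical helper; heap mbuf holding `len' bytes of data. */
struct mbuf *mb_new(const void *data, int len, struct mbuf *next)
{
	struct mbuf *m = calloc(1, sizeof *m);

	if (m == NULL)
		return NULL;
	assert(len >= 0 && len <= MLEN);
	m->m_off = offsetof(struct mbuf, m_dat);
	m->m_len = (short)len;
	memcpy(m->m_dat, data, (size_t)len);
	m->m_next = next;
	return m;
}

/* Make the first `len' bytes of chain `n' contiguous in one mbuf. */
struct mbuf *pullup_sketch(struct mbuf *n, int len)
{
	struct mbuf *m, *next;
	int count;

	assert(n != NULL && len > 0);
	if (n->m_len >= len)
		return n;			/* already contiguous */
	m = (len <= MLEN) ? mb_new("", 0, NULL) : NULL;
	if (m != NULL)
		m->m_type = n->m_type;
	while (m != NULL && n != NULL && m->m_len < len) {
		count = len - m->m_len;		/* bytes still wanted */
		if (count > n->m_len)
			count = n->m_len;
		memcpy(mtod(m, char *) + m->m_len, mtod(n, char *),
		    (size_t)count);
		m->m_len += count;
		n->m_off += count;		/* consume from old mbuf */
		n->m_len -= count;
		if (n->m_len == 0) {		/* drained; drop it */
			next = n->m_next;
			free(n);
			n = next;
		}
	}
	if (m != NULL && m->m_len >= len) {
		m->m_next = n;			/* rest of chain follows */
		return m;
	}
	free(m);				/* failure: free everything */
	while (n != NULL) {
		next = n->m_next;
		free(n);
		n = next;
	}
	return NULL;
}
```

Because the head of the chain may be replaced, callers must always use
the returned pointer and never the one they passed in.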

	mtod(m, type)
Gives (as type `type') the address of the first byte of data in m.
Used as, e.g.,
	m = m_pullup(m, sizeof(struct ip));	/* get entire IP header */
	if (m == 0)				/* pullup failed; chain is gone */
		return;
	struct ip *ip_header = mtod(m, struct ip *);

	dtom(pointer)
Turns an arbitrary data pointer into the corresponding mbuf (via trickery).
dtom() might someday go away.
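
The trickery, as far as I recall, is that regular mbufs are a power-of-two
size and aligned on that size, so masking off the low bits of any pointer
into m_dat[] gives back the mbuf.  Below is a userland sketch of that
idea only.  Assumptions: MSIZE is 128, the field order is rearranged so
the struct is exactly MSIZE bytes with 32- or 64-bit pointers, and
mbuf_alloc_aligned is a made-up helper built on C11 aligned_alloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MSIZE 128	/* assumed size of one mbuf, a power of two */

/* Rearranged userland layout, padded to exactly MSIZE bytes. */
struct mbuf {
	struct mbuf *m_next;
	struct mbuf *m_act;
	uint32_t m_off;
	int16_t m_len;
	int16_t m_type;
	unsigned char m_dat[MSIZE - 2 * sizeof(struct mbuf *) - 8];
};
_Static_assert(sizeof(struct mbuf) == MSIZE, "mbuf must be MSIZE bytes");

/* mbuf -> data: add the stored offset to the mbuf's own address. */
#define mtod(m, t)	((t)((char *)(m) + (m)->m_off))

/* data -> mbuf: round the address down to an MSIZE boundary. */
#define dtom(p) \
	((struct mbuf *)((uintptr_t)(p) & ~(uintptr_t)(MSIZE - 1)))

/* Hypothetical allocator: returns an empty, MSIZE-aligned mbuf. */
struct mbuf *mbuf_alloc_aligned(void)
{
	struct mbuf *m = aligned_alloc(MSIZE, MSIZE);

	if (m == NULL)
		return NULL;
	assert(((uintptr_t)m & (MSIZE - 1)) == 0);
	m->m_next = m->m_act = NULL;
	m->m_off = offsetof(struct mbuf, m_dat);
	m->m_len = 0;
	m->m_type = 0;
	return m;
}
```

The mask only works for data stored inside the mbuf itself; a pointer
into a separate cluster page would round down to the wrong place, which
is one reason to be wary of dtom().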

Something important not mentioned above: packets received from an interface
are put on the appropriate protocol's input queue with the first mbuf
containing a pointer to the `ifnet' structure as its first item.  That
is, after receiving an IP packet, an Ethernet driver puts an mbuf chain
onto `ipintrq' that looks like:

	offset 0: *mtod(m, struct ifnet **) points back to Ethernet I/F
	offset sizeof(struct ifnet *): IP header, followed by data

The `IF_DEQUEUEIF' macro handles this little idiosyncrasy.
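
A userland sketch of that layout, to show what the receiver has to undo.
This is not the macro itself; fill_input_mbuf and strip_ifp are
hypothetical names, `struct ifnet' here is a one-field stand-in, and the
queue handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MLEN 112	/* assumed, from the "112 bytes" figure */

struct ifnet {		/* stand-in; the real struct is much larger */
	const char *if_name;
};

/* Userland copy of the struct shown earlier, NOT the kernel header. */
struct mbuf {
	struct mbuf *m_next;
	unsigned long m_off;
	short m_len;
	short m_type;
	unsigned char m_dat[MLEN];
	struct mbuf *m_act;
};

#define mtod(m, t)	((t)((char *)(m) + (m)->m_off))

/* Driver side: store the ifnet pointer, then the packet bytes. */
void fill_input_mbuf(struct mbuf *m, struct ifnet *ifp,
    const void *pkt, int len)
{
	assert(len >= 0 && sizeof ifp + (size_t)len <= MLEN);
	m->m_off = offsetof(struct mbuf, m_dat);
	memcpy(m->m_dat, &ifp, sizeof ifp);
	memcpy(m->m_dat + sizeof ifp, pkt, (size_t)len);
	m->m_len = (short)(sizeof ifp + (size_t)len);
}

/* Protocol side: fetch the pointer, then step past it. */
struct ifnet *strip_ifp(struct mbuf *m)
{
	struct ifnet *ifp;

	assert(m->m_len >= (short)sizeof ifp);
	memcpy(&ifp, mtod(m, char *), sizeof ifp);
	m->m_off += sizeof ifp;		/* data now starts at the header */
	m->m_len -= (short)sizeof ifp;
	return ifp;
}
```

After strip_ifp, mtod(m, struct ip *) would point at the IP header as
usual.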
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris at cs.umd.edu	Path:	uunet!mimsy!chris


