deadlock caused by error in sys1.c, exit()
utzoo!decvax!genrad!grkermit!masscomp!mit-vax!eagle!harpo!floyd!cmcl2!philabs!mcvax!philmds!johan
Thu Apr 21 19:40:16 AEST 1983
We recently found two deadlock situations in the UNIX
kernel while testing our own 68000 system. Both deadlocks
are caused by a flaw in the exit() routine in the kernel.
Let me first describe the deadlocks.
Scenario 1
- process A, actually LPD, with shared text, is 'exec'ed.
xalloc() is called and a new text segment is created with
XWRIT on, because the text has not yet been written to swap.
- process A forks and creates process B running the same
code. x_count and x_ccount are incremented to 2. There
is not enough core for process B, so it is swapped out
decrementing x_ccount to 1.
- At that very moment another process C wants to grow, but
this cannot be done in core, so it swaps itself out using
swbuf1.
- process A decides to exit, calls xfree() (p_textp in the
proc[] entry for process A is cleared and x_count drops
to 1). xfree() calls xccdec() (x_ccount drops to 0), but
because XWRIT is on it starts swapping out the text
segment using swbuf2.
- The scheduler needs memory and decides to swap out
process A, but needs a swap buffer, so sleeps waiting for
swbuf2 after setting B_WANTED.
Well, here we are: the swap transfer for the text of process
A is done and process A is made runnable, but cannot be run
because it is being swapped out. The scheduler, however,
cannot be run either, because it is sleeping on swbuf2 and
will never be woken up because process A will never be
swapped in.
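
The counter bookkeeping in the steps above can be sketched as a small
toy model. This is only an illustration, not the kernel code: the
model_* function names, the swapouts field and the XWRIT value are
assumptions made for the sketch, and the case where the last reference
to the text goes away is left out.

```c
#include <assert.h>

/* Toy model only. The flag value and the model_* names are
   assumptions for this sketch, not the kernel's definitions. */
#define XWRIT 02        /* text has no copy on swap yet (assumed value) */

struct text {
    int x_count;        /* processes sharing this text */
    int x_ccount;       /* how many of those are loaded in core */
    int x_flag;
    int swapouts;       /* hypothetical: text swap-outs started */
};

/* exec of a shared-text program: one user, in core, nothing on swap */
static void model_exec(struct text *xp)
{
    xp->x_count = 1;
    xp->x_ccount = 1;
    xp->x_flag = XWRIT;
    xp->swapouts = 0;
}

/* fork: the child shares the text and starts out in core */
static void model_fork(struct text *xp)
{
    xp->x_count++;
    xp->x_ccount++;
}

/* One in-core user leaves core. The last one out has to write the
   text to swap if XWRIT is still set. */
static void model_xccdec(struct text *xp)
{
    assert(xp->x_ccount > 0);
    if (--xp->x_ccount == 0 && (xp->x_flag & XWRIT)) {
        xp->x_flag &= ~XWRIT;
        xp->swapouts++;     /* the caller would sleep on swap I/O here */
    }
}

/* exit: drop the reference; if other users remain, leave core too.
   Freeing the text on the last reference is not modelled. */
static void model_xfree(struct text *xp)
{
    if (--xp->x_count > 0)
        model_xccdec(xp);
}
```

Running exec, fork, one model_xccdec() for the swap-out of process B
and then model_xfree() for the exit of process A leaves x_count at 1,
x_ccount at 0 and one text swap-out started on behalf of the exiting
process, which is the state in which the scheduler picks process A.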
Another scenario goes as follows:
Scenario 2
- That same process A and its child process B exist:
x_count==2, x_ccount==1 since process B is swapped out.
- Again, process A exits, clearing x_ccount, locking that
text segment with XLOCK and swapping out the 'dirty'
text.
- The scheduler, again, decides to swap out process A, but
succeeds this time. There is no need for xlock(), since
the reference to that text segment is cleared in xfree()
called by exit().
- Process B is swapped in by the scheduler, but now the
scheduler wants to swap in the text segment of process B.
So xlock() is called, finding the text locked, which
causes the scheduler to sleep waiting for XLOCK to be
cleared.
Here we have our second deadlock: the scheduler waiting for
process A to clear XLOCK, process A waiting to be swapped in
by the scheduler.
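
Both deadlocks have the same shape: the scheduler waits for a resource
that only process A can release, and process A waits for the
scheduler. A minimal sketch of that wait-for chain follows. The node
names and the has_cycle() and deadlocks() helpers are hypothetical and
exist only for this illustration.

```c
#include <assert.h>

/* Hypothetical node names for the parties in the two scenarios. */
enum { SCHED, PROC_A, SWBUF2, TEXT_XLOCK, NNODE };
#define NOBODY (-1)

/* waits_for[n] is the one thing n is blocked on, or NOBODY.
   Returns 1 if following the chain from start leads back to start. */
static int has_cycle(const int waits_for[NNODE], int start)
{
    int n = start;
    int steps;

    for (steps = 0; steps < NNODE; steps++) {
        n = waits_for[n];
        if (n == NOBODY)
            return 0;
        if (n == start)
            return 1;
    }
    return 0;
}

/* resource is SWBUF2 (scenario 1) or TEXT_XLOCK (scenario 2).
   a_swapped_out says whether process A was swapped out in exit(). */
static int deadlocks(int resource, int a_swapped_out)
{
    int w[NNODE] = { NOBODY, NOBODY, NOBODY, NOBODY };

    assert(resource == SWBUF2 || resource == TEXT_XLOCK);
    w[SCHED] = resource;        /* scheduler sleeps on the resource */
    w[resource] = PROC_A;       /* only a running A releases it */
    if (a_swapped_out)
        w[PROC_A] = SCHED;      /* A runs only after being swapped in */
    return has_cycle(w, SCHED);
}
```

In this model deadlocks(SWBUF2, 1) and deadlocks(TEXT_XLOCK, 1) both
report a cycle, while the same calls with a_swapped_out set to 0 do
not, because process A can then run and release the resource.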
In my opinion the problem is caused by the code in
exit(). The process should be SLOCK'ed during xfree(). The
code should be:
	p->p_flag |= SLOCK;
	xfree();
	p->p_flag &= ~SLOCK;
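
Why this helps can be shown with a toy model, assuming the scheduler
never selects a process with SLOCK set for swap-out, which is what the
fix relies on. The flag values and the swappable() and model_exit()
helpers are assumptions made for this sketch, not kernel code.

```c
#include <assert.h>

/* Illustrative flag values, assumed for this sketch. */
#define SLOAD 01    /* process is in core */
#define SSYS  02    /* system process, never swapped */
#define SLOCK 04    /* process must not be swapped out */

struct proc {
    int p_flag;
};

/* Hypothetical helper: may the scheduler pick p for swap-out? */
static int swappable(const struct proc *p)
{
    return (p->p_flag & (SSYS | SLOCK | SLOAD)) == SLOAD;
}

/* Models the xfree() call inside exit(). Returns 1 if the process
   could have been chosen for swap-out while xfree() was sleeping
   on swap I/O, 0 if it was protected. */
static int model_exit(struct proc *p, int use_fix)
{
    int exposed;

    assert(p->p_flag & SLOAD);      /* an exiting process is in core */
    if (use_fix)
        p->p_flag |= SLOCK;
    exposed = swappable(p);         /* stands in for the sleep in xfree() */
    if (use_fix)
        p->p_flag &= ~SLOCK;
    return exposed;
}
```

Without the fix the model reports the process as a swap-out candidate
during xfree(); with the fix it does not, and p_flag is back to its
original value afterwards.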
This problem exists in UNIX V7, BSD 2.X and System III,
at least for the PDP-11.
Johan W. Stevenson
Philips, S&I, PMDS
decvax!mcvax!philmds!johan