minor bug in dump, can cause system to hang.
Dave Martindale
dave at onfcanim.UUCP
Tue Oct 1 05:14:23 AEST 1985
In article <1763 at brl-tgr.ARPA> davet at RAND-UNIX.ARPA (Dave Truesdell) writes:
>Index: dump/dumptraverse.c--etc
>
>Description:
> Dump would hang the system while doing incremental dumps (levels > 0),
> during pass II.
>
> When doing dumps from a raw device, dump does not force all reads
> to be done in multiples of DEV_BSIZE byte chunks. In most cases
> drivers seem to handle this correctly, but one obscure case has
> caused one of our systems (a VAX 11/785 running 4.2BSD) to hang.
>
>
>Repeat-By:
> Arrange for a raw read (size != n*512) of a directory to fail.
>
> In our case, an empty directory (containing ".", and "..") occupied
> a block which was forwarded by the HP driver. When dump attempted to
> read the directory entry ( 24 bytes long ), the system hung.
>
>Fix:
>
> Force bread to do all reads in multiples of DEV_BSIZE byte blocks.
> However, for efficiency, I have added a separate version of bread
> called raw_bread, that is used in pass II.
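For reference, the rounding that the proposed fix describes can be sketched as below. This is only an illustration, not the posted raw_bread: the names round_to_bsize and SKETCH_DEV_BSIZE are made up here, and the value 512 is assumed from the "n*512" in the report.

```c
#include <stddef.h>

/* Assumed block size (the report speaks of reads of size n*512).
 * Defined locally so the sketch does not depend on <sys/param.h>. */
#define SKETCH_DEV_BSIZE 512

/* Hypothetical helper: round a byte count up to the next multiple
 * of the device block size, so a raw read never asks for a partial block. */
size_t
round_to_bsize(size_t nbytes)
{
	return ((nbytes + SKETCH_DEV_BSIZE - 1) / SKETCH_DEV_BSIZE)
	    * SKETCH_DEV_BSIZE;
}
```

With this rounding, the 24-byte directory read from the report would be issued as a 512-byte read, and the caller would copy out only the 24 bytes it wanted.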
But why fix dump when the problem is almost certainly in the disk driver?
Looking at the code in hp.c, we see that when a bad block is forwarded,
the replacement block is always read in its entirety, even if you asked
for less data and the Massbus adapter has therefore been set up to map
only the lesser amount. The fix would seem to be easy: just make the
following change to the BSE case of the switch in hpecc():
*** /tmp/hp.c Mon Sep 30 14:57:02 1985
--- hp.c Mon Sep 30 15:05:32 1985
***************
*** 1024,1030
  	sn = bn%st->nspc;
  	tn = sn/st->nsect;
  	sn %= st->nsect;
! 	mbp->mba_bcr = -512;
  	rp->hpof &= ~HPOF_SSEI;
  #ifdef HPBDEBUG
  	if (hpbdebug)
--- 1024,1030 -----
  	sn = bn%st->nspc;
  	tn = sn/st->nsect;
  	sn %= st->nsect;
! 	mbp->mba_bcr = -MIN(512, bp->b_bcount-(int)ptob(npf));
  	rp->hpof &= ~HPOF_SSEI;
  #ifdef HPBDEBUG
  	if (hpbdebug)
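To see what the changed line computes, here is a small stand-alone sketch of the arithmetic. It is not driver code: forward_count, SECTOR_BYTES and PAGE_BYTES are names invented for illustration, and it assumes that npf counts the 512-byte pages already transferred before the bad sector and that ptob() multiplies by the 512-byte VAX page size.

```c
#define SECTOR_BYTES 512	/* sector size used in the patch */
#define PAGE_BYTES   512	/* assumed VAX page size, so ptob(n) == n * 512 */

/* Hypothetical helper: bytes to transfer from the replacement sector.
 * This is whatever remains of the request, capped at one sector.
 * The driver would store the negative of this value in mba_bcr. */
int
forward_count(int b_bcount, int npf)
{
	int remaining = b_bcount - npf * PAGE_BYTES;

	return remaining < SECTOR_BYTES ? remaining : SECTOR_BYTES;
}
```

Under those assumptions, the 24-byte directory read with npf equal to 0 gives a count of 24, so the byte count register would be loaded with -24 where the old code always loaded -512.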
Now, I don't have a filesystem that has a directory in a bad block (as far
as I know), so I can't test this under the same conditions. But the old code
is clearly wrong. Why not fix the hp driver and then try the original version
of dump and see if everything works as it should?
By the way, exactly the same bug appears in the "up" Unibus disk driver too.