RA81 unreliability
pep at down.FUN
Wed Jan 30 07:49:00 AEST 1985
I've had a LOT of trouble with RA81 discs. I run a facility with eight RA81s
(along with RM80, RA80, RA60, & RL02 discs) distributed among six VAX 11/750s.
Of the eight 81s, I have had to replace four HDAs in the past year, and I may
have another dead one on my hands.
The Symptoms: trouble usually starts with soft errors, with status/event
codes of 053, 0353, and sometimes 0213 and 0350. Most of the time, these are
followed by hard errors. The errors usually become more severe (more
frequent; proportion of hard errors increases) if the disc is left
in service. The problem has occurred under three versions of UNIX.
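The trend described above (errors becoming more frequent, with a growing share of hard errors) can be tracked with a simple tally. What follows is only a minimal sketch: it assumes the error log has already been reduced by hand to (day, kind) pairs, and the record format, the function name, and the sample data are all hypothetical, not anything the driver or DEC's tools produce.

```python
from collections import defaultdict


def hard_error_fraction(events, window_days=7):
    """Summarise errors per time window.

    events: iterable of (day, kind) pairs, where day is an integer day
    number and kind is "soft" or "hard". This record format is an
    assumption made for illustration only.

    Returns {window_index: (total_errors, fraction_that_were_hard)}.
    """
    totals = defaultdict(int)
    hard = defaultdict(int)
    for day, kind in events:
        window = day // window_days
        totals[window] += 1
        if kind == "hard":
            hard[window] += 1
    return {w: (totals[w], hard[w] / totals[w]) for w in sorted(totals)}


# Made-up sample data, not taken from any real error log.
sample = [(0, "soft"), (3, "soft"), (8, "soft"),
          (9, "hard"), (10, "soft"), (12, "hard")]
trend = hard_error_fraction(sample)
```

A rising error count together with a rising hard fraction from one window to the next is the pattern that, in my experience, means the disc should come out of service.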
The Cause: unknown. I observe that the errors appear on a disc that has
suddenly seen a lot of write activity, after performing reliably for
months (e.g., convert to a new version of UNIX and restore data).
The Diagnosis: DEC diagnostics (EVRLA) sometimes detect a problem (hard
error), sometimes not.
The Remedy: attempt to reformat the disc. This has succeeded (and cured
the problem) four times. To date, reformatting has failed four
times; these HDAs have been replaced. Reformatting has failed in
various ways: usual complaints are failure to format LBN area,
failure to format DBN or XBN area. I have also seen a complaint
that more than 12.5% of a track is bad.
One Field Service Hypothesis: UNIX trashes DEC's area of the disc
when it encounters a bad block, clobbering the tables needed to reformat.
Only twelve (hard) errors were reported on the latest RA81 before we
attempted to reformat - reformatting still failed.
What I Believe: I've heard that RA81s have been developing bad spots in the
field. (This is consistent with the war stories I've been trading
with friends.) UNIX doesn't forward the bad blocks, so the most
attractive cure (?) is to reformat the disc. Apparently the formatter
used at the factory is more powerful than the one available in the
field; if DEC's area of the disc is bad, the field formatter can't
recover. The HDA must be replaced; the old HDA is returned to the
factory for reformatting, or marked usable for a VMS site (VMS does
dynamic bad block forwarding).
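For readers unfamiliar with the term, the idea behind bad block forwarding can be shown with a toy model. This is only a sketch of the general technique under simple assumptions: the class, its fields, and the spare-block layout are hypothetical and do not describe how VMS or the disc controller actually implements it.

```python
class ForwardingDisk:
    """Toy model of a disc that forwards bad blocks to spares."""

    def __init__(self, n_blocks, n_spares):
        self.n_blocks = n_blocks
        self.blocks = {}   # physical block number -> data
        self.bad = set()   # physical blocks known to fail
        self.remap = {}    # logical block number -> spare physical block
        # Assumption: spares live just past the user-visible area.
        self.spares = list(range(n_blocks, n_blocks + n_spares))

    def _physical(self, lbn):
        # Follow the forwarding entry if one exists.
        return self.remap.get(lbn, lbn)

    def write(self, lbn, data):
        if not 0 <= lbn < self.n_blocks:
            raise IndexError("logical block out of range")
        phys = self._physical(lbn)
        # On a bad target, allocate a spare and record the forwarding.
        while phys in self.bad:
            if not self.spares:
                raise IOError("no spare blocks left")
            phys = self.spares.pop(0)
            self.remap[lbn] = phys
        self.blocks[phys] = data

    def read(self, lbn):
        return self.blocks[self._physical(lbn)]


disk = ForwardingDisk(n_blocks=100, n_spares=2)
disk.bad.add(5)              # pretend block 5 has developed a bad spot
disk.write(5, b"payload")    # transparently lands on spare block 100
```

Without a table like `remap`, every access to block 5 keeps hitting the bad spot, which is the situation a system that does no forwarding is left in.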
Pat Parseghian
Princeton Univ. EECS Dept.