8800 crashing way too often
Alan's Home for Wayward Notes File.
alan at shodha.enet.dec.com
Thu Nov 1 02:41:12 AEST 1990
In article <STERGIOS.90Oct29193129 at kt22.Stanford.EDU>, stergios at portia.Stanford.EDU (Stergios) writes:
>
> [ Customer has a VAX 8800 crashing very frequently. ]
I have a VAX 8800 that crashed 96 days ago. That's the
last time it was down. The time before that was 80 days.
The I/O configuration is three VAXBIs with two KDB50s
(2 RA90s each) and CIBCA with HSC70 and a bunch of disks.
There's a DEBNI and DMB32 in there somewhere. This kind
of uptime seems to be typical for my system. Use it for
comparison purposes.
>
> Quite a number of dec people have and still are looking into the
> problem. Every board has been replaced, even a new bi bus installed.
> dec software engineering is leaning towards a problem in the mscp
> code.
Is it the same error each time, or a different one? Which
kind: panic, machine check or "it stops"? What version of
ULTRIX are you running? If V4.0, have you installed and
booted the mandatory upgrade? Any non-DEC devices on
the VAXBI or KDB50s? Is there a UNIBUS on the system?
Does it have anything important on it? Could it be
replaced by a native VAXBI device?
>
> We've installed and run 8 different dec supplied debuggers inside the
> kernel. Each one never tells what the problem is, only what the
> problem is not. Progress, I suppose.
>
> It originally took a couple months to escalate the problem to the
> point where we got attention. Now we have attention to the point of
> twice weekly meetings with dec sales staff regarding our 8800
> crashing. lots-o-fun, but we still have a poorly performing machine.
A couple of months feels too long to me, but it depends on
the situation.
>
> There is talk of replacing our kdb50's with HSE's in the hope that the
> problem will disappear. This seems reasonable, I guess, but sounds
> like a desperation move at this point.
Find the problem first. There is one out there somewhere
and it is findable.
>
> Now we are starting to talk replacement systems (this is another story
> altogether, probably worse, and I won't air that kind of laundry in
> public) and dec is pushing a 5500 at us. I don't think the 5500's
> q-bus is going to take the beating our 8800 does. we are currently
> running a 5400 as an optional machine to the 8800, and the poor little
> thing is choking. I refuse to install ada and a number of other
> packages on it because of its performance so far under our
> environment. This does not make our clients any happier: a machine
> not running the necessary software is not any better than a crashed
> machine, and we have plenty of both.
Actually most of the interesting I/O on a DECsystem 5500 will
stay off the Q-bus unless you insist upon using KDA50s for
most of the disks. A couple of gigabyte SCSI disks and DSSI
disks should be very impressive. A VAX 8800 is good for
moving bits between disk and memory, but a well configured
DECsystem 5500 should be able to do better. You'll need more
memory to make up for the VAX to RISC switch.
>
> Are there any other buses or solutions available on the 5500? I'm
> asking here cause I've already been told "there is this neat way to
> hook up a ra92 as a swap disk avoiding the qbus that gives an extra
> M/s" by the sales types. An extra M/s over the qbus is not going to
> cut it for us.
There are three places to connect disks to a DECsystem 5500;
one or more KDA50s on the Q-bus, the DSSI adapter and the
SCSI adapter. The only place >>>I<<< know of to connect an
RA{anything} is the KDA50. Find out what your sales critter
is talking about. If you go to a DECsystem 5500 you'll almost
certainly want to switch from the RA{anything} to RFs or RZs
or at least move some of the I/O load off the RAs.
>
> What good is a maintenance contract? Are we being too lenient with DEC
> by letting them drag this out as far as they have?
I'll put it this way. You've been very patient. I wouldn't
have been that patient.
Of course it also depends on what level of support you have.
An 8 hour a day, 5 days a week Basic support contract is a
very different beast from 24x7 DECsupport. Each contract
has time limits for how long things are allowed to "drag out".
I don't think any of them are months though.
>
> Any and all suggestions welcome.
>
Tell us more about the errors in the hopes that we might
recognize the problem from previous experience.
> sm
> stergios at jessica.stanford.edu
--
Alan Rollow alan at nabeth.enet.dec.com