Kernel core dumps (was Re: out of swap space??)

Sun May 5 01:15:38 AEST 1991

chap at art-sy.detroit.mi.us (j chapman flack) writes:
>This reminded me of questions I've been meaning to ask.  I never knew where
>the kernel core dump goes in a panic (and so far I've had no opportunity to
>find out....).

	This is the heart of the matter.  IMHO, SCO has done a pretty good job
of hammering their product into a state where it just runs and runs and runs
with little ado.  If you had a choice between a system that had a very nice and
powerful crash dumping and analysis system, and one that simply didn't crash in
the first place, which would you pick?

>  At what point does the kernel begin using the swap area on the next boot??
>  How am I able to use `crash' to examine the core dump before the evidence
>  is overwritten?

	The rest of these comments are from my ESIX system.

	As you boot up, the script /etc/dumpsave is called and copies the
crash dump to another place.  It is invoked by /etc/bcheckrc if fsstat on the
root device indicates that the root filesystem needs cleaning (which suggests
some sort of crash in the first place).  I usually see its prompt after a
power failure, so it's offering to save a dump that doesn't even exist.
Oh, well.
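
	For what it's worth, the boot-time check described above boils down to
something like the sketch below.  This is NOT the actual ESIX /etc/bcheckrc.
The function names and the FAKE_FSSTAT_STATUS variable are made up for
illustration, and I'm assuming that a zero exit status from the fsstat step
means "clean" and anything else means "needs checking".

```sh
#!/bin/sh
# Sketch only: not the real /etc/bcheckrc. All names below are hypothetical.

# Stand-in for running fsstat on the root device.
# Assumed convention: exit status 0 = clean, nonzero = needs checking.
root_fs_status() {
    return "${FAKE_FSSTAT_STATUS:-0}"
}

# Stand-in for /etc/dumpsave, which would offer to copy the dump
# out to floppy or tape.
offer_dump_save() {
    echo "offering to save crash dump"
}

# Offer to save a dump only when the root filesystem looks dirty.
check_root_and_maybe_save() {
    if root_fs_status; then
        echo "root filesystem clean, skipping dump save"
    else
        offer_dump_save
    fi
}
```

	Calling check_root_and_maybe_save with FAKE_FSSTAT_STATUS=1 takes the
"offer" branch.  Note that a dirty root filesystem is the only thing tested,
which is why a power failure gets you the same offer as a real panic.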

	/etc/dumpsave is kind of a crock.  It's hard-coded to dump to
some sort of floppy/tape device.  I guess they didn't want to deal with
getting the other filesystems mounted first.  There'd be a definite danger
in doing that, as fsck could well scribble on the swap area where the dump
lives.

	Finally, they give you /etc/ldsysdump to copy these same floppies
back into the filesystem.  You run this after you get your system back up and
have clean, mounted filesystems to put the crash dump in.

>It would have been handy to be able to run something as root that
>forces a panic, then reboot and analyze the dump while the system is still
>reasonably reliable.

	Another strategy would be to run /etc/crash in one window, then
switch to another and run your programs.  When things get bad, switch back
and look around in your running kernel.  With /etc/crash already running,
shortages of inodes and the like shouldn't keep you from looking.  Just an
idea.  (In case you hadn't tried this, running crash without arguments
makes it run on /unix and /dev/mem, which means you're looking at the state of
your running system.)

						Regards,
						Andy Valencia
						vandys at sequent.com

Disclaimer: these are just my opinions, one and all.