Mysterious Sun-4 bug
Hugh LaMaster
lamaster at pioneer.arc.nasa.gov
Fri Jun 28 02:26:31 AEST 1991
I previously wrote:
>The bug has appeared in 4.1, 4.1 + various patches (almost 4.1.1), 4.1.1,
>with and without DBE installed, with and without FDDI (i.e., with NFS
>traffic over ethernet). The same symptom has appeared in all cases:
>a process which is usually doing NFS I/O will hang in "D" state. The
>offending process cannot be killed, and eventually other processes
>start hanging as well. During this period, Sybase activity
>will have been very heavy. The Sybase dataserver process itself, however,
>never hangs (note: Sybase is set up so that its I/O is local, *and*
>Sybase is using its own raw partitions). Even though Sybase itself
>never hangs, *if Sybase asynch. I/O is turned OFF,
>the problem rarely if ever appears.*
1) We are not running with /tmp in swap (tmpfs). However, I understand
that tmpfs can cause a similar-sounding problem, which may be related. It
could be a bug somewhere in the allocation of swap space.
2) I should have made it clear that the Sybase raw partitions are local
to the machine running Sybase, and that Sybase is not doing any NFS I/O
on the database files. Only user-type files are mounted from the
fileserver over NFS. Also, lockd and statd are not running. I believe
there is no need for them to be running, since Sybase is not reading or
writing over NFS and is not complaining about failed lock requests.
3) We had another hang yesterday afternoon. The processes which hung
this time looked like the following:
       F  UID  PID PPID CP PRI NI     SZ RSS WCHAN    STAT TT TIME COMMAND
20008000 1002 9562 9542  0  -1  0 149376   0 kernelma DW   pa 0:00 model
20008001 1002 9529 4227  0  -1  0 149376  72 kernelma D    pb 0:00 model
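
For anyone who wants to watch for this, picking the hung processes out of
saved ps output can be sketched roughly as below. This is only an
illustration in Python: the function name hung_processes is made up, it
assumes the COMMAND field is a single word, and it reads fields from the
right-hand end of each line because the leading columns of raw ps output
can run together.

```python
def hung_processes(ps_lines):
    """Return (stat, wchan, command) for each process in "D" state.

    Assumes `ps -l` style lines that end in: WCHAN STAT TT TIME COMMAND,
    with a single-word COMMAND (an assumption of this sketch).
    """
    hung = []
    for line in ps_lines:
        fields = line.split()
        if len(fields) < 5 or fields[0] == "F":
            continue  # skip short/blank lines and the header row
        # Index from the right so fused leading columns do not matter.
        wchan, stat, _tt, _time, command = fields[-5:]
        if stat.startswith("D"):
            hung.append((stat, wchan, command))
    return hung


# Sample input copied from the raw listing in this message.
sample = [
    "F UID PID PPID CP PRI NI SZ RSS WCHAN STAT TT TIME COMMAND",
    "200080001002 9562 9542 0 -1 0149376 0 kernelma DW pa 0:00 model",
    "200080011002 9529 4227 0 -1 0149376 72 kernelma D pb 0:00 model",
]
for stat, wchan, command in hung_processes(sample):
    print(command, stat, wchan)
```

Both sample lines are reported, since both are in "D" state with the same
wait channel.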
A pstat -Ts showed the following:
[149] pstat -Ts
>pstat: number of files is preposterous (14019)
>1470/1470 inodes
>454/4090 processes
>460952/781032 swap
>
We have a lot of swap space allocated, to run some of these big jobs.
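
To put the pstat figures in proportion, the used/total pairs can be turned
into ratios with something like the following. This is a rough sketch in
Python, not a tool we actually run; the function name usage_ratios is
invented here, and it assumes each count appears as "used/total name" on
its own line, as in the output above.

```python
import re


def usage_ratios(pstat_text):
    """Map each 'used/total name' entry to the fraction used / total."""
    ratios = {}
    for match in re.finditer(r"(\d+)/(\d+)\s+(\w+)", pstat_text):
        used = int(match.group(1))
        total = int(match.group(2))
        name = match.group(3)
        if total > 0:  # avoid dividing by zero on an empty table
            ratios[name] = used / total
    return ratios


# Sample input copied from the pstat -Ts output in this message.
sample = """\
pstat: number of files is preposterous (14019)
1470/1470 inodes
454/4090 processes
460952/781032 swap
"""
for name, ratio in usage_ratios(sample).items():
    print(f"{name}: {ratio:.1%}")
```

On the figures above this works out to roughly 59% of swap and 11% of the
process table in use, with every inode table entry allocated.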
--
Hugh LaMaster, M/S 233-9, UUCP: ames!lamaster
NASA Ames Research Center Internet: lamaster at ames.arc.nasa.gov
Moffett Field, CA 94035 With Good Mailer: lamaster at george.arc.nasa.gov
Phone: 415/604-1056 #include <std.disclaimer>