SCO OpenDesktop Crashing With Weird Disk Problems

Eric Johnson erc at pai.UUCP
Fri Sep 7 01:31:45 AEST 1990


Help!

I've been having some terrible problems with SCO's OpenDesktop 1.0.
I'm not sure if these are hardware, software or both. And, I'd
appreciate any help from the net. (Please note that I really don't
blame anyone but myself and that any and all help is requested.
Thanks.)

My system:

SCO ODT 1.0, X11, Motif, DOS, TCP/IP, Software Dev.
Avex 386 mainboard 25 MHz
Adaptec 2322-16 ESDI disk controller
Paradise VGA Plus 800x600x16
Western Digital 8003EBT Ethernet (thin, and the only system on my own net)
Imprimis Wren 6 320 MB disk
8 MB RAM
Phoenix BIOS
Logitech serial Mouse (latest rev)
Relaxed security defaults


I normally run the X Window system and use the box for developing
programs and writing for my next book.  My default config is two 
large xterms and one xclock, under the Motif window manager, mwm.

1) I can't seem to run the system under "heavy" use for more than
four hours. (I'm developing Motif programs.)  During a major make
session, running the C compiler (stock cc), I'll see a message like
"Killed."
or 
"Signal received"

(I'm not typing ANYTHING at all during this time.)

Then, the X server usually freezes and the only thing I can do is
Alt-SysReq to trash the X server (and my compile processes).  When I
get back to the console (I sure wish xterm -C worked, so I could
see console messages under X!), the screen is filled with hard disk
errors.  These errors keep getting worse, and generally I have to hit 
the hard reset button.  Now, this is a brand new system, but I 
never rule out hardware (e.g., disk) problems. These disk errors
are continuous and all the system seems to be doing is printing
these errors to the screen.

When I reboot, though, fsck seems to fix all the disk problems. So, 
the hard disk bad track errors don't seem to me to really be bad
tracks, unless fsck isn't really fixing the situation. fsck has always
been voodoo to me, but it has always seemed to do the job on the
many versions of UNIX I've used.

I'm using an Adaptec ESDI controller and an Imprimis Wren 320 MB disk.
Any ideas as to what is causing this? Is it probably hardware, or
could it be in the software, too?
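
On the "Killed" message in #1: a shell normally prints that when a
foreground process is terminated by SIGKILL, so it would help to know
exactly which signal is ending the compiler. Below is a minimal sketch
of a wrapper that runs a command and reports how it ended. It is
written in Python purely for illustration and assumes a POSIX system
with Python 3, which is not something the ODT box above is known to
have; the name run_and_report is hypothetical.

```python
import signal
import subprocess


def run_and_report(argv):
    """Run argv and describe how it ended.

    Returns "exit N" for a normal exit, or the signal name (for
    example "SIGKILL") if the child was terminated by a signal.
    """
    result = subprocess.run(argv)
    if result.returncode < 0:
        # subprocess reports death-by-signal as the negated signal number.
        return signal.Signals(-result.returncode).name
    return "exit %d" % result.returncode
```

Something like run_and_report(["make", "all"]) would then say whether
the make itself died from a signal. It only sees the top-level
process, so a compiler pass killed underneath make would still show
up as a plain non-zero exit from make.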

2) (Related to #1, above): A Motif program I wrote, which normally works
fine (it's just a test of the Scale widget, and it runs on a number
of UNIX workstations), was suddenly killed, like above. The
disk then went berserk, so I did the infamous Alt-SysReq to trash the X 
server. At the console, I again saw streams of disk errors.  When the
system rebooted, and I tried to run my test program, it didn't run.
Instead, it looked like it ran dfspace (a df variant that SCO uses).
Now, whenever I start an xterm, OpenDesktop (ODT) seems to run dfspace
in the new window, so I suspect this is in the system-wide .login
or some file like that. Anyway, my executable did something it had
never done before.  Has anyone ever seen anything like this?
I deleted the file, since I didn't like what happened.

3) One of the times the #1 stuff happened, the ttys database
(part of system security) got trashed, so only the superuser (root)
could log in.  An SCO Tech support person led me through the process
of pulling in a ttys file from the distribution floppy (the ODT manual
has a great section on fixing this problem, but it assumes that at
least one ttys* file exists, which I didn't have).  Note: the SCO
Tech Support folks are great (once you actually get to talk to them).
I've called them a number of times and they've always helped out with
very good advice.  The main problem here wasn't the lost file, but:
   a) A trouble-shooting section in the manual that dealt
   with the problem but made too many assumptions to be actually
   workable.
   b) The implications of ever using a product that has so many
   weird ("weird" as in not in other versions of UNIX I've seen)
   files which are required to use the system.  This has bad 
   implications for my employer adopting this product (see below).
 

4) One of the times the above (#1) stuff happened, one of the C
compiler executables got deleted (/lib/386/p2_386).  So, to recover,
I ran the custom program to pull that file in from the distribution
floppy diskette. I had two main problems with this:
   a) Every time I try to install one file, custom brings over the
   file just fine (I think), but then custom always dies with an
   "Internal Error: 10#".  In errno.h, error 10 is related to
   calling wait on a child that doesn't exist, I think. What exactly
   is custom doing that causes it to die so ungracefully?  Can
   anyone bring over single files from the ODT distribution
   disks using custom?

   b) Once I had the infamous /lib/386/p2_386 file, I still could
   not compile anything. Why? Because the /lib/386/p2_386 program
   wasn't "serialized" (a part of SCO's copy protection scheme).
   Now, how can I "serialize" one single file?  Remember that
   custom dies for me every time I try to install single files,
   so I never get to the serialization phase from custom (like I
   did when I first installed this stuff).  I tried RTFM-ing,
   but I didn't find any mention of how to serialize one file.
   Anyone know how?  Even if my main disk problem is hardware-related,
   this is a serious issue.  I don't really mind SCO's copy-protection
   scheme (which is also very much like Interactive's), but a copy
   protection scheme should be aimed at stopping unauthorized
   users, not AUTHORIZED users!  When copy protection schemes get in
   my way, I tend to drop the products.

   So SCO (and Interactive, too, since you have a CP scheme as well),
   listen up:  All this (above) is for my own private system,
   but during the day I work in R&D at Boulware Technologies
   (see signature below) and BTI provides industrial automation
   systems.  We expect things like files to get trashed out in 
   the field. We also demand the ability to recover from things like
   this.   This last week, I was asked to evaluate 386 UNIXes
   for BTI.  (A 386 running UNIX is generally cheaper than 
   a full-blown UNIX workstation, especially since BTI puts together
   their own 386 clones.) I had to state that I did not think
   that ANY 386 UNIX has evolved to an acceptable level yet.  That
   is, installation is too hard and fraught with problems (it only
   took me 11 full tries to get SCO ODT installed; I've given up 
   for now on ISC 2.2), system administration is also too hard and,
   especially for SCO, fraught with all sorts of security-related
   issues, and I generally don't have confidence that these versions
   of UNIX will run under demanding conditions in the field (with
   users who aren't very UNIX-literate).  In other words, I feel that
   Hewlett-Packard and Sun (for example) have a much stronger
   software product than either SCO or Interactive and that I
   do not have the necessary confidence in SCO or Interactive to
   recommend their products yet.   I do not mean for this to be
   a bitch session, so please don't take it as such. And yes,
   I do understand that 386 UNIXes must support a vast array of
   not-so-compatible hardware options, so there are more problems
   to face on a 386.  I want you SCO and Interactive folks to 
   take this constructively. I'd love it if your products improved
   (and yes, I have seen them improve thus far).  I'd love to have
   the confidence in your products, because that would mean a
   substantial cost savings for my employer.  But, I just don't
   feel the products are there yet.

   I finally had to re-install the ODT basic software development
   package to be able to return to a state where I could compile
   C files.  yech-o.
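
On the errno guess in 4a: errno 10 is ECHILD ("No child processes")
on System V derived systems and on Linux, and it is what a wait call
returns when there is no child to wait for. Whether custom's
"Internal Error: 10#" is really an errno value is only an assumption.
The failure itself is easy to reproduce, though. A minimal sketch,
in Python for illustration only (the same call exists in C as
waitpid); the function name wait_with_no_child is hypothetical.

```python
import errno
import os


def wait_with_no_child():
    """Wait on a pid that cannot be our child and return the errno."""
    try:
        # A process is never its own child, so this waitpid() must fail.
        os.waitpid(os.getpid(), os.WNOHANG)
    except ChildProcessError as exc:
        return exc.errno
    return None
```

Comparing the returned value against errno.ECHILD, and passing it to
os.strerror(), shows the number and message for the local system.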

5) How does one change to single-user mode without changing the
system forever?  I always try to run custom in single-user mode,
so instead of bringing the whole system down and then re-booting
(due to time, as I was on the phone to SCO Tech Support at the time),
I tried:
   shutdown -iS -g0 -y

That is, shutdown to run-state S (single user), right now (-g0)
and yes (-y) I want to do it.

After doing this, the system console changed from /dev/tty01 to
/dev/syscon (which meant I had to change my X start-up scripts
in .login), and the root user is always asked:

TERM = (ansi)

This never happened before I ran that one shutdown.  What has really
happened to my system and why did it change forever from one
shutdown?  I've always been used to the idea that shutting down
to single-user mode should just do that and not irretrievably
change your system when you reboot back up to multi-user run-
state. That is, if you boot to single-user run-state, this should
be the same as shutting down to single-user run-state. Single-user
run-state should be single-user run-state.
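
One way to tell whether /dev/tty01 and /dev/syscon are two different
consoles, or just two names for one device, is to compare their
device numbers. A minimal sketch, assuming Python 3 on a POSIX
system; the helper name same_char_device is hypothetical, and the two
paths are simply the ones named above.

```python
import os
import stat


def same_char_device(path_a, path_b):
    """Return True if both paths are character devices with the same
    device number, i.e. two names for one underlying device."""
    st_a = os.stat(path_a)
    st_b = os.stat(path_b)
    if not (stat.S_ISCHR(st_a.st_mode) and stat.S_ISCHR(st_b.st_mode)):
        # At least one path is not a character device at all.
        return False
    return st_a.st_rdev == st_b.st_rdev
```

Calling same_char_device("/dev/tty01", "/dev/syscon") before and after
the shutdown would show whether the console name changed while the
underlying device stayed the same.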

6) Just about every other time I start the X Window server, I
get screen jitter mode. That is, the whole screen jitters vertically
very fast (basically every pixel moves up and down by about 1/4 of
the height of the screen).  This, obviously, makes X totally
unusable.  Usually, I need to stop X, then log out and then
restart the X server.  Normally, everything works fine then.
Mostly, it goes bad every other time, although sometimes more
often and sometimes less often.  Any ideas?  I'd love to have
it work right every time, of course.


If anyone has any information on any of these topics, I'd appreciate
email (or a post if you're so inclined). I'll summarize the email
responses I get for the net. If you suggest RTFM, please point out
which manual and which section. I'd love to get this box where I
can spend a whole day working on my book and not wasting hours
trouble-shooting my system.



Thanks,
-Eric

erc at pai.mn.org

  
   

-- 
Eric F. Johnson               phone: +1 612 894 0313    BTI: Industrial
Boulware Technologies, Inc.   fax:   +1 612 894 0316    automation systems
415 W. Travelers Trail        email: erc at pai.mn.org     and services
Burnsville, MN 55337 USA



More information about the Comp.unix.sysv386 mailing list