Monitoring processes and machines (program itself and central)
Brian J. Hafner
hafner at mysost.cs.wisc.edu
Sun Mar 31 11:06:52 AEST 1991
In article <3545 at inews.intel.com> khougland at sedona.intel.com writes:
>
>I'm interested in being able to keep tabs on our whole domain. That way, when
>people log off for the day, it's usable CPU time! The unfortunate problem is
>that sometimes the programs crash and burn by themselves and sometimes ye old
>operator does a kill -9 on them.
You may be interested in "condor" from the Univ. of Wisconsin.
A portion of the condor_intro man page:
Condor is a facility for executing UNIX jobs on a pool of
cooperating workstations. Jobs are queued and executed
remotely on workstations at times when those workstations
would otherwise be idle. A transparent checkpointing
mechanism is provided, and jobs migrate from workstation to
workstation without user intervention. When the jobs
complete, users are notified by mail.
Condor may be obtained via anonymous FTP from shorty.cs.wisc.edu.
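On the narrower question of telling a program that crashed on its own from one that got a kill -9, the parent can inspect the child's wait status. Below is a minimal sketch of that idea in Python on a POSIX system. It is not part of Condor, and the helper names describe_exit and run_and_classify are made up for illustration.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a short description.

    On POSIX, subprocess reports a child killed by signal N as -N.
    """
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            # Signal number with no symbolic name on this platform.
            name = str(-returncode)
        return "killed by signal " + name
    return "exited with status %d" % returncode


def run_and_classify(argv, kill_with=None):
    """Start argv, optionally send it a signal, and classify how it ended."""
    child = subprocess.Popen(argv)
    if kill_with is not None:
        child.send_signal(kill_with)
    return describe_exit(child.wait())


if __name__ == "__main__":
    sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_and_classify(sleeper, kill_with=signal.SIGKILL))
```

A monitor that is the parent of each job could log this description per job. It only says which signal ended the process, not who sent it, so a SIGKILL from the operator and one from any other source look the same.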
Brian J. Hafner
Computer Sciences Department
University of Wisconsin - Madison
hafner at cs.wisc.edu