First cut at Batch PAR
Jim Isaak
isaak at decvax.dec.com
Fri Jan 5 10:35:30 AEST 1990
From: isaak at decvax.dec.com (Jim Isaak)
[ Here is a preliminary draft Project Authorization Request (PAR)
for a Batch Processing subcommittee of IEEE 1003. It is accompanied
by a preliminary paragraph supplied later by Jim Isaak. I will be
posting similar draft and actual PARs and related procedural material
as they reach me. -mod ]
I would suggest that you add some intro information for the
uninitiated. Both "fact" and "status" -- for example, the Batch PAR is
the first proposal we have seen, and it is likely to change before
we agree to "sponsor" work in that area (so suggestions for change are
now very appropriate!) --- I would expect a more mature version just
before the SEC approval meeting (very little time for comment, but still
not "approved") -- then after SEC approval it is time to let the world
know we are soliciting participation and input in that area (no longer
time for comment on PAR contents in general)
jim
PAR Proposal for Batch Processing TCOS SEC N117
Karen Sheaffer Jan. 4, 1990
Overview
Supercomputing applications, by definition, have massive resource
requirements. It is not unusual for applications to require all
available memory, gigabytes of disk space, and still take many hours,
days, or weeks to complete. A batch processing system that can allocate
and manage system resources among dozens of jobs to allow the efficient
execution of such jobs is essential.
The preparation of supercomputing jobs for submission is often a
complicated task carried out on network nodes other than the supercomputer,
e.g. workstations, front end processors, and minicomputers. A batch
processing system must permit supercomputer job submission from
these network nodes and the spooling of output to the network.
UNIX systems have primitive batch capabilities (at, cron), but these are
not adequate for production supercomputing environments. These facilities
may suffice in a simple environment, but they make no provision for
overall management of a workload running under UNIX. It is easy to create
a situation in which a number of processes compete for limited resources,
substantially increasing system overhead.
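
As an illustration of the kind of workload management that at and cron
lack, here is a minimal sketch of a queue that starts a job only when
enough memory is free. It is not NQS and not part of the proposal; the
function name schedule, the (name, memory, runtime) job tuples, and the
strict first-in first-out policy are all assumptions made for the example.

```python
import heapq
from collections import deque


def schedule(jobs, total_memory):
    """Simulate FIFO admission of (name, memory, runtime) jobs.

    A job starts only when enough memory is free; otherwise it waits
    at the head of the queue. Returns a dict of {name: start_time}.
    Hypothetical model for illustration only.
    """
    queue = deque(jobs)
    running = []          # min-heap of (finish_time, memory_held)
    free = total_memory
    now = 0
    starts = {}
    while queue:
        name, mem, runtime = queue[0]
        if mem > total_memory:
            raise ValueError(f"{name} can never fit on this machine")
        if mem <= free:
            # Admit the job at the head of the queue.
            queue.popleft()
            free -= mem
            starts[name] = now
            heapq.heappush(running, (now + runtime, mem))
        else:
            # Wait for the next running job to finish, then reclaim
            # its memory before trying the head of the queue again.
            finish, released = heapq.heappop(running)
            now = finish
            free += released
    return starts
```

With 10 units of memory and jobs a (6 units, 10 time steps), b (6, 5) and
c (4, 3), job a starts at time 0 while b and c both wait until time 10,
so memory is never oversubscribed. Submitting the same three jobs through
at would simply start all of them at their requested times.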
The IEEE 1003.10 Supercomputing Working Group has been developing
a proposed standard for a batch processing system based on NQS, the Network
Queuing System originally developed at NASA Ames.
Scope
The standard will define the system interfaces, utilities, system
administration interfaces, and an application level protocol required by a
network batch processing system in a POSIX environment. This standard will
provide portability for applications, users, and system administrators.
Purpose
The purpose of this standard is to extend POSIX to provide a network batch
processing system. These extensions include the following:
   system interfaces
      checkpoint/recovery -
         the capability of a user session or process to
         automatically checkpoint itself periodically and
         to restart at the latest checkpoint following a
         machine crash or shutdown. The objective of
         checkpoint/recovery is to avoid the expense of
         rerunning work requests that may have been executing
         several hours or days prior to a machine crash.
      resource control -
         the ability to control the allotment of the resources
         of the machine (such as cpu time, memory, disk space,
         tapes, etc.) to a process/session.
   utilities for the submission and management of the requests
   system administration interface for the creation and authorization
      of the network batch processing system
   network application level protocol
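
The checkpoint/recovery item above can be sketched at application level
as follows. This is only an illustration under stated assumptions: the
proposal concerns system interfaces that would checkpoint a process or
session, whereas this example saves a small state dictionary to a file.
The names save_checkpoint, load_checkpoint, and run_job, the JSON file
format, and the crash_at test hook are all hypothetical.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically so a crash never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # rename over the old checkpoint


def load_checkpoint(path):
    """Return the last saved state, or a fresh state if none exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"next": 0, "total": 0}


def run_job(path, n_steps, every=10, crash_at=None):
    """Sum 0..n_steps-1, checkpointing every `every` steps.

    crash_at is a hypothetical hook that simulates a machine crash
    by raising RuntimeError just before that step runs.
    """
    state = load_checkpoint(path)
    for i in range(state["next"], n_steps):
        if crash_at is not None and i == crash_at:
            raise RuntimeError("simulated machine crash")
        state["total"] += i
        state["next"] = i + 1
        if state["next"] % every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]
```

If a run of 100 steps crashes at step 57, the latest checkpoint on disk
covers steps 0 through 49, so a second call to run_job repeats only steps
50 through 56 before carrying on, rather than rerunning the whole job.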
Name of Group which will write the Standard:
POSIX 1003.10 Supercomputing Working Group
TCOS-SEC Checklist for New PAR Activity Proposals
I. Administration
Karen Sheaffer    Sandia National Laboratories    Chair
Stuart McKaig     Convex                          Vice-Chair
Jim Tanner        Boeing Computer Services        Technical Editor
John Caywood      Unisys                          Secretary
(Note: with the exception of Stuart McKaig, all of the above hold the same
positions in the 1003.10 Working Group.)
II. Working Group
# of active (have attended 3/4 of meetings) participants: 15
# of correspondent members identified: 50
Breakdown of active participants:    Producer: 5
                                     User:     10
                                     Other:
# of companies/interests represented: 14
What international participation has been identified?
III. Deliverable Document
Standard
Expected Size: 200 pages
Projected time frame:
   First Draft: July 1989
   Start Balloting: Fall 1990
What candidates exist for a "base document"?
   The 1003.10 Supercomputing Working Group Draft Batch Document
   Network Queuing System (NQS) public domain software and
      documentation
IV. Scope
See above
V. Overlap/Dependencies on other work
Which TCOS standards assumed: 1003.1 and 1003.2
What functions are required by other groups: Protocol Independent
Network Service for Portable Applications
What other groups are doing work here:
Volume-Number: Volume 18, Number 20