VAXclusters and UN*X
Paul Jensen
jensen at decwrl.UUCP
Wed Jun 5 08:28:58 AEST 1985
Following is a very brief tutorial on VAXclusters, and how they relate
to unix*:
A cluster is defined by a set of proprietary protocols for implementing
a loosely-coupled multi-processing system. Two of the key protocols
are System Communication Services (SCS), software which defines and coordinates
members of the cluster; and the Distributed Lock Manager, which allows
locks to be shared between processors.
These protocols are entirely software-based: there are no hardware
dependencies in them except at the lowest levels. Also, the
protocols are such that control is distributed dynamically between
members of the cluster; in fact, there is no such thing as a
"cluster controller" (the HSC50 is logically a peer of the VAX
processors).
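
To make the idea of locks shared between processors a bit more concrete,
here is a minimal sketch of lock-table semantics in Python. Everything in
it is an assumption for illustration: the class name ToyLockManager, the
two-mode shared/exclusive scheme, and the FIFO queueing are mine, not the
actual Distributed Lock Manager interface. It also runs in a single
process, so it shows only how requests from several nodes are granted or
queued, not how control is distributed between cluster members.

```python
from collections import defaultdict, deque

# Hypothetical two-mode scheme, chosen only to keep the sketch small.
SHARED, EXCLUSIVE = "shared", "exclusive"


class ToyLockManager:
    """Toy lock table keyed by resource name; holders are node names."""

    def __init__(self):
        self.holders = defaultdict(dict)   # resource -> {node: mode}
        self.waiters = defaultdict(deque)  # resource -> queue of (node, mode)

    def _compatible(self, resource, mode):
        held = self.holders[resource]
        if not held:
            return True
        # Shared locks coexist; an exclusive lock coexists with nothing.
        return mode == SHARED and all(m == SHARED for m in held.values())

    def request(self, node, resource, mode):
        """Grant the lock if compatible, otherwise queue the request."""
        # Grant immediately only if nobody is already waiting (FIFO order).
        if not self.waiters[resource] and self._compatible(resource, mode):
            self.holders[resource][node] = mode
            return True
        self.waiters[resource].append((node, mode))
        return False

    def release(self, node, resource):
        """Drop a lock, then grant as many queued requests as now fit."""
        self.holders[resource].pop(node, None)
        granted = []
        queue = self.waiters[resource]
        while queue and self._compatible(resource, queue[0][1]):
            waiter, mode = queue.popleft()
            self.holders[resource][waiter] = mode
            granted.append(waiter)
        return granted
```

With two nodes holding a shared lock on the same resource, a third node's
exclusive request is queued and is granted only once both shared holders
have released.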
The HSC50 is a high-speed IO server. It services requests for
logical disk blocks. It does not know anything about file structure:
this is imposed by the VAX processors via the MSCP protocol.
The HSC50 performs various sorts of optimizations (similar
to those done by the FFS) and has a peak transfer rate of nearly
4MB/sec.
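
The split between a block server and host-imposed file structure can be
sketched as follows. This is a toy model under stated assumptions, not
MSCP: the names ToyBlockServer and ToyHostFileSystem, the 512-byte block
size, and the naive allocator are all hypothetical. The point is only
that the server sees block numbers while the host alone knows which
blocks make up a file.

```python
BLOCK_SIZE = 512  # assumed block size for this sketch


class ToyBlockServer:
    """Serves logical blocks by number; knows nothing about files."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def read(self, lbn):
        return self.blocks[lbn]

    def write(self, lbn, data):
        if len(data) > BLOCK_SIZE:
            raise ValueError("data exceeds one block")
        # Pad short writes with zero bytes to a full block.
        self.blocks[lbn] = data.ljust(BLOCK_SIZE, b"\0")


class ToyHostFileSystem:
    """Host-side file structure layered on top of a block server."""

    def __init__(self, server):
        self.server = server
        self.directory = {}  # file name -> (list of block numbers, length)
        self.next_free = 0   # naive bump allocator; never reclaims blocks

    def write_file(self, name, data):
        lbns = []
        for offset in range(0, len(data), BLOCK_SIZE):
            lbn = self.next_free
            self.next_free += 1
            self.server.write(lbn, data[offset:offset + BLOCK_SIZE])
            lbns.append(lbn)
        self.directory[name] = (lbns, len(data))

    def read_file(self, name):
        lbns, length = self.directory[name]
        raw = b"".join(self.server.read(lbn) for lbn in lbns)
        return raw[:length]  # trim the zero padding of the last block
```

Note that the directory lives entirely on the host side; the server
could be swapped for any other device that answers block reads and
writes.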
The RA-series disks are not dynamically dual-ported. Dual-porting
was implemented in RA disks for the purpose of allowing the disk
to be accessed by a secondary controller in the event the primary
fails. In a cluster, a typical configuration would be a disk
dual-ported between either 2 HSC50s or an HSC50 and a UDA50.
Only one path is active at a time: if the HSC50 on the active path
fails, the disk fails over dynamically to the alternate path.
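
The one-active-path behavior described above can be sketched roughly
like this. The classes ToyController and DualPortedDisk, the healthy
flag, and the retry-once policy are assumptions made for illustration;
the sketch says nothing about how the real hardware or drivers detect a
failed controller.

```python
class PathFailed(Exception):
    """Raised by a controller path that can no longer service requests."""


class ToyController:
    """Stand-in for a disk controller such as an HSC50 or UDA50."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def read(self, lbn):
        if not self.healthy:
            raise PathFailed(self.name)
        return f"block {lbn} via {self.name}"


class DualPortedDisk:
    """Exactly one path is active; the other is a standby."""

    def __init__(self, primary, secondary):
        self.active = primary
        self.standby = secondary

    def read(self, lbn):
        try:
            return self.active.read(lbn)
        except PathFailed:
            # Fail over: swap the paths and retry once on the alternate.
            self.active, self.standby = self.standby, self.active
            return self.active.read(lbn)
```

If both controllers have failed, the retry raises PathFailed to the
caller, which seems like the honest outcome for a disk with no working
path.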
DECnet is totally unrelated to clusters. It is possible to run
DECnet over a CI bus (using SCS), but a cluster can
run fine without a byte of DECnet code (it IS extremely useful for
system management, however).
Allowing a unix (or any other) system to participate in a cluster would
require implementing at a minimum SCS, the connection manager (software
which decides when to form, change, and dissolve clusters), the
distributed lock manager, and MSCP. This is a large amount of
code, much of it embedded in VMS (and therefore subject to VMS
licensing restrictions), and porting it would be a major
undertaking. A major re-write of the file system would be necessary,
and adopting some sort of standard for file locking would be
highly recommended.
All the above work would just give you a distributed file system.
If you wanted distributed job and device queues, you would
have to implement the Distributed Job Controller as well. Given
the VMS-ish flavor of this protocol, this task might be distasteful,
not to mention non-standard.
In conclusion, the bottom line shakes out as follows:
    o "cluster" of homogeneous UNIX systems with distributed
      file system only: technically feasible but a lot of
      work (>> 1 man-year).
    o the above with distributed queues: more work, problems
      with maintaining a standard version of unix.
Regards,
--- Paul Jensen
Digital Equipment Corporation
------------------------------------------------------------------------
Disclaimer: All information in this response is drawn from public
sources. All opinions expressed are solely my own.
In particular, I haven't the faintest idea of the
future or current plans of either Ultrix or VMS
engineering.
*unix is a trademark of AT&T.