load sharing
M Gordon
mfg at castle.ed.ac.uk
Thu Feb 7 20:20:19 AEST 1991
fwp1 at CC.MsState.Edu (Frank Peters) writes:
>: On 6 Feb 91 16:22:07 GMT, pjw at usna.navy.mil, , jw at math30, (Peter J. Welcher (math FACULTY)) said:
>pjw> The question is, is there any easy way to perform load-sharing, other than by
>pjw> randomly assigning sections or students to hosts ?
>I once toyed with an idea to do something like this using DNS but
>never implemented it.
>Basically the idea was to define a new record type in my local DNS
>tables called PROG that would run the given program and return the
>result in an A record to the calling program.
...
>I think this idea has the following advantages:
>1. I'd be willing to bet that the necessary modifications to bind
> would be relatively trivial.
>2. Since all that ever gets returned is an A record no modifications
> are required to the world wide DNS system or to individual
> resolver clients. And no front end host beyond the nameserver
> would need to be involved...none of this 'telnet to machine A and
> let it decide where you should go' stuff.
>3. The actual load program can be upgraded/replaced/modified with no
> changes to the bind code. I can make leastload return a random
> host as a first pass, then the least number of users later, then
> the least loaded cpu and so on for finer levels of balance. The
> two tasks (picking a destination and returning it to the user) are
> isolated. I always did like modularity.
>Any comments on this idea? Any reason why it would be especially
>difficult/impractical?
>Anyone who has actually done this?? :-)
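For illustration only, the modularity argued for in point 3 of the quoted text might look like the sketch below. It is Python rather than anything that would sit inside bind, and the host table, its field names and the addresses are all made up. The only name taken from the quoted text is leastload.

```python
import random

# Hypothetical per-host statistics; names, addresses and fields are assumptions.
HOSTS = [
    {"name": "hostA", "addr": "192.0.2.1", "users": 12, "load": 0.8},
    {"name": "hostB", "addr": "192.0.2.2", "users": 3, "load": 2.5},
    {"name": "hostC", "addr": "192.0.2.3", "users": 7, "load": 0.2},
]

def pick_random(hosts, rng=random):
    """First pass: any host will do."""
    return rng.choice(hosts)

def pick_fewest_users(hosts):
    """Second pass: the host with the fewest logged-in users."""
    return min(hosts, key=lambda h: h["users"])

def pick_lowest_load(hosts):
    """Third pass: the host with the lowest load average."""
    return min(hosts, key=lambda h: h["load"])

def leastload(hosts, policy=pick_lowest_load):
    """Return the address that would go back in the A record.

    Only the policy changes between versions; the caller does not.
    """
    return policy(hosts)["addr"]
```

Swapping the policy argument is the whole upgrade path: the code that returns the answer never needs to know how the host was chosen.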
I implemented a similar idea for our network of suns. Named has been
altered to recognise "sun3" and "sun4" as special cases and to use RPC to get
a hostname from a server. There were several reasons for doing it this way,
rather than having named do the polling itself:
If a machine is down, named would hang until the poll of the
dead machine timed out, stopping it from responding to other requests.
As well as the terminal servers that use DNS for name lookup, we have
some Bridge terminal servers which use their own name server
machines. The primary name server for these is set to a Bridge box,
the secondary to the address of our server. The primary
will not recognise the name "sun3", so the request is passed to our
server, which replies with an address, or with "name unknown" if it is
not a request for "sun3" or "sun4". The same server can respond to
both RPC requests from named and requests from Bridge boxes.
We still have some people with serial lines into Vaxes. These
lines are running a modified getty. Instead of /bin/login the
modified getty runs a small program which makes an RPC request
to the server and execs an rlogin to the machine returned. This
part of the system will gradually disappear as the Vaxes are
retired and we move people onto the terminal servers.
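A minimal sketch of the lookup that all three kinds of client rely on, written in Python for illustration and not the actual implementation. The table layout, host names and addresses are assumptions, and the RPC transport is left out: resolve() stands for what the server answers, and login_command() for what the getty replacement would exec.

```python
# Hypothetical snapshot of what the server knows:
# special name -> {host: (address, load average)}.
# A load of None marks a host that did not answer the last poll.
CLASSES = {
    "sun3": {"s3a": ("192.0.2.11", 1.4), "s3b": ("192.0.2.12", 0.3)},
    "sun4": {"s4a": ("192.0.2.21", None), "s4b": ("192.0.2.22", 2.0)},
}

def resolve(name, classes=CLASSES):
    """Return an address for a special name, or None for "name unknown"."""
    hosts = classes.get(name)
    if hosts is None:
        return None  # not "sun3" or "sun4"
    live = [(load, addr) for addr, load in hosts.values() if load is not None]
    if not live:
        return None  # every host in the class is down
    return min(live)[1]  # address of the least loaded live host

def login_command(name, classes=CLASSES):
    """Argument vector a getty-like front end might exec instead of /bin/login."""
    addr = resolve(name, classes)
    return None if addr is None else ["rlogin", addr]
```

Here a request for "sun4" skips the dead s4a and gets the address of s4b, while any other name gets the "name unknown" answer.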
The server is actually two programs, one which does the polling
and puts the results into a shared memory segment and the other
which responds to RPC requests. This means that the response to
a request is immediate, even if the polling program is waiting
on a dead machine. It also makes it possible to use the
information gathered for other purposes, e.g. a screen in our
machine room shows the load average of all our suns, and a machine's
name starts flashing if it dies, letting us monitor the state of
machines all over the building.
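The split between poller and responder can be sketched roughly as follows. This is Python for illustration, not the actual implementation: a lock-protected dictionary stands in for the shared memory segment, and the host names and fake_probe() are hypothetical.

```python
import threading

class LoadTable:
    """Stand-in for the shared memory segment: last known load per host."""

    def __init__(self):
        self._lock = threading.Lock()
        self._loads = {}  # host -> load average, or None if the host is dead

    def store(self, host, load):
        with self._lock:
            self._loads[host] = load

    def snapshot(self):
        with self._lock:
            return dict(self._loads)

def poll_once(table, hosts, probe):
    """Poller side: probe each host in turn; a timeout marks it dead."""
    for host in hosts:
        try:
            table.store(host, probe(host))
        except TimeoutError:
            table.store(host, None)

def least_loaded(table):
    """Responder side: answer from the table without touching the network."""
    live = {h: l for h, l in table.snapshot().items() if l is not None}
    return min(live, key=live.get) if live else None

def fake_probe(host):
    # Hypothetical probe: "sun-b" is down, the others report a fixed load.
    loads = {"sun-a": 1.5, "sun-c": 0.4}
    if host not in loads:
        raise TimeoutError(host)
    return loads[host]
```

With poll_once() looping in its own thread or process, least_loaded() never waits on a dead machine; it only reads the last stored values. The same snapshot() could feed a status display, with a None entry meaning the host is down.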
Michael
--
_ _ _ _ _
Michael Gordon - mfg at castle.ed.ac.uk OR ee.ed.ac.uk | |_| |_| |__| |_| |
| . . . . . . |
I spilt spot remover on my dog and now he's gone! |_________|~~|_____|