Backups Query
Shawn F. Mckay
shawn at mit-eddie.MIT.EDU
Sat Mar 7 04:49:17 AEST 1987
Howdy. I have a few ideas that I wonder if people would be willing to ponder,
and perhaps lend a hand with. I'm very open to totally new ideas as well.
The basic idea is to make backups at our site easy, simple, and reliable.
Thanks for any help.
Ideas for system backups;
The problem --
We have 1 main system, (a vax w/ra's), and two clusters of little
systems. (Different types, but they look like workstations). Within these
clusters, there are critical machines, and non-critical machines.
The critical machines need to be backed up on a nightly basis, and the
non-critical machines need to be backed up on a weekly basis, or on
user request. (Being development machines, it would be nice to have them
backed up when people do some amount of work, rather than all the time.)
Resources --
We have two operations people, who have to make sure other things keep
working as well, which limits the time spent on backups to something
less than 50%. (It would be nice to chop that number down below 10%.)
We have as many mag tapes as it takes, but it would be nice to cut this number
down as well, so backups take less time.
---- The solutions so far ----
Backup type (a)
Procedure --
o Do a full dump of the main system once a week, with incrementals
done each day.
(Difference being mainly physical/method)
o Do a full backup of all remotes each week, some each day, and
incrementals each night.
Pros --
o Given the systems listed below, I can't really see any.
Cons --
o Great in number and size. I think they are obvious, the main ones being
number of tapes, man-hours, and obvious cpu usage.
Backup type (b)
Procedure --
o The main system has two sets of file systems, (primary/all),
the primary set is backed up weekly, and the whole system
is backed up monthly, (i.e. 'all'). Incrementals are done on
all file systems at level 9 daily. If they expand to more than
1 tape, a full dump of all file systems becomes needed.
o The remote systems each have their own cron script to initiate
backups to their individual tape drives, and do so on a regular
basis as is needed by that particular system, reporting errors
to a human, but otherwise being quiet. (This is for incrementals;
we still need to keep full dumps for more than a night.)
o Critical remotes may optionally send some data to our main system,
or perhaps shadow something in compressed format to another remote.
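The main-system schedule in type (b) can be sketched as a small decision function. This is only a rough sketch: the proposal says "weekly" and "monthly" without naming days, so running the monthly 'all' dump on the 1st and the weekly 'primary' dump on Sundays is my assumption, as is the function name main_system_dump.

```python
import datetime


def main_system_dump(day, incremental_tapes=1):
    """Return (file system set, dump level) for tonight's main-system backup.

    Assumptions, not part of the proposal: the monthly 'all' dump runs on
    the 1st of the month and the weekly 'primary' dump runs on Sundays.
    """
    # Incrementals that no longer fit on 1 tape force a full dump of everything.
    if incremental_tapes > 1:
        return ("all", 0)
    if day.day == 1:
        return ("all", 0)        # monthly full dump of the whole system
    if day.weekday() == 6:       # datetime counts Monday as 0, Sunday as 6
        return ("primary", 0)    # weekly full dump of the primary set
    return ("all", 9)            # daily level 9 incremental on all file systems
```

A nightly cron job could call this with datetime.date.today() and hand the result to whatever actually runs the dump.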
Pros --
o We cut out a great deal of human intervention
o We gain reliable, tested, backups.
o Done at night, so minimal cpu loss
o Done with a tape for remotes, so minimal tape use,
except for full dumps, which still have to get their own tape.
Cons --
o The tape drive(*s*) must work
o Each machine MUST have its own drive.
o The need for high quality tapes comes up fast, since
they will be left sitting in the drive all day in most cases.
o The potential exists to write over a good tape which someone didn't
remember to swap out of the drive.
o The potential exists for someone to forget to put a tape in the drive
on a critical system.
(I'm sure if I want to nitpick, I could add more lines to this).
Backup type (c)
Procedure --
o Procedure is complex, I'll explain by players -
Main system will be called master, and remotes slaves. (Original eh?)
Master will query slaves each night to ask them to give it an idea
of how much data has been modified that day, and what total bytage
it needs to have saved.
When the slave replies with a number, the master then decides if this
is a full or incremental save time, based on knowing how much data
is reasonable to save with an incremental. I have always felt that if
you have to save more than a third of the disk with an incremental, it's
time for a full dump. (This makes it easier to restore.)
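The one-third rule can be written down directly. A minimal sketch, assuming the slave reports the number of modified bytes and the master knows the disk size; the function name and the "full"/"incremental" return values are made up for illustration.

```python
def choose_save_type(changed_bytes, disk_bytes):
    """Decide between a full and an incremental save for one slave.

    Rule of thumb from the text: more than a third of the disk changed
    means it is time for a full dump.
    """
    if disk_bytes <= 0:
        raise ValueError("disk size must be positive")
    # Integer arithmetic: changed/disk > 1/3  <=>  3*changed > disk.
    if 3 * changed_bytes > disk_bytes:
        return "full"
    return "incremental"
```

Exactly a third still counts as an incremental here, since the rule says "more than a third".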
The master then has several options, based on how each system was
backed up last.
a) Save the incremental data to its drive somewhere,
or to a designated host on the cluster to store such
information. I'll call this type of host a 'buddy',
since it would be saving information for its buddy.
Every system on the cluster is a buddy for at most 2
systems, but it could be any 2 systems based on how
much space that buddy has left.
b) (B was in a, wasn't it? Oh well).
c) The master could also decide to save data to its own
tape drive, which works well as an option, but would
probably be an 'overflow' option more than a regular
option.
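Picking a buddy under option (a) might look like the sketch below. The limit of 2 systems per buddy is from the text; picking the candidate with the most free space, and the two dictionaries the master is assumed to keep (free space per host, and which slaves each host already stores data for), are my assumptions.

```python
MAX_SYSTEMS_PER_BUDDY = 2  # "a buddy for at most 2 systems"


def pick_buddy(slave, dump_bytes, free_bytes, clients):
    """Return the name of a host that can hold the slave's dump, or None.

    free_bytes: dict mapping host name -> free disk space in bytes.
    clients:    dict mapping host name -> set of slaves it already holds data for.
    Both tables are hypothetical; the master would have to maintain them.
    """
    candidates = []
    for host, free in free_bytes.items():
        if host == slave:
            continue  # a system cannot be its own buddy
        served = clients.get(host, set())
        if slave not in served and len(served) >= MAX_SYSTEMS_PER_BUDDY:
            continue  # this host is already a buddy for 2 other systems
        if free < dump_bytes:
            continue  # not enough space left
        candidates.append((free, host))
    if not candidates:
        return None   # caller falls back to the overflow option (c)
    return max(candidates)[1]  # assumed policy: most free space wins
```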
A lot of what will make this system better than most is the master/
slave relationship. For example, if master tells slave 'save level 9
to /dev/mt0' (for its tape drive), and slave says 'cant-offline',
then the master can reissue the level 9 save the next way, by
saying something like 'save level 9 remote host', where host is the
name of a buddy that master has checked to see has the space for it,
and then the slave sends its level 9 dump image.
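The retry exchange above can be sketched as a loop over fallbacks. The command strings and the 'cant-offline' reply are from the example; the 'ok' reply for success and the send() transport function are assumptions, since the protocol has not been designed yet.

```python
def back_up_slave(send, level, tape, buddies):
    """Try the slave's tape drive first, then each buddy in turn.

    send:    hypothetical function that delivers one command string to the
             slave and returns its reply string.
    buddies: hosts the master has already checked for free space.
    Returns (command that worked or None, log of every exchange).
    """
    commands = ["save level %d to %s" % (level, tape)]
    commands += ["save level %d remote %s" % (level, host) for host in buddies]
    log = []
    for command in commands:
        reply = send(command)
        log.append((command, reply))  # strong/clear event logging
        if reply == "ok":             # assumed success reply
            return command, log
    return None, log                  # nothing worked; a human needs to look
```

The log is returned whether or not anything worked, so the operators can see where each dump ended up or why it did not.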
Pros --
o As stated above, it's got a stronger will to work.
o Less human intervention, although to make sure people
know where everything is, it must have strong/clear
event logging.
o Less tapes, since it only uses tapes as part of a cycle of
data preservation, making it very hard to lose a great deal
of anything, since if the tape dies, it might be on a buddy,
or the master may have a copy.
o Automation, just ask the master to get you the latest copy of
file 'x' from system 'y', and let it deal with where it
put the file.
o Room for expansion, if you add a new remote, or new type of remote,
all you have to do is define it to the master, in what should be
a simple text database, and write the interface for the remote.
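To give an idea of how simple the text database could be, here is a sketch of a reader for it. The format is entirely hypothetical (nothing has been decided): one remote per line with four whitespace-separated fields (name, cluster, 'critical' or 'non-critical', and the name of the interface to use), with '#' starting a comment.

```python
def parse_remotes(text):
    """Parse the hypothetical remotes database into a dict keyed by host name."""
    remotes = {}
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) != 4:
            raise ValueError("line %d: expected 4 fields" % lineno)
        name, cluster, priority, interface = fields
        if priority not in ("critical", "non-critical"):
            raise ValueError("line %d: bad priority %r" % (lineno, priority))
        remotes[name] = {
            "cluster": cluster,
            "critical": priority == "critical",  # nightly vs weekly backups
            "interface": interface,              # which remote interface to use
        }
    return remotes
```

Adding a new remote would then mean adding one line to the file, plus writing the interface if it is a new type of machine.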
Cons --
o A system this nice has bugs, and takes a while to write, if you
want to do a good job.
o Once up, it would require people to read the manual.
o If the master is down, all hell breaks loose, right?
Wrong. As I didn't remember to mention (and at 300 baud, will
mention right here), if the master goes down, 1 node from each
cluster should have a copy of the main system's database, and
a program to allow it to become an emergency master (but not
a long term master, because this would lead to chaos).
o I'm sure there are more, but that's why I'm asking for comments.
Backup type (d)
------------------------------------------------------------
** This space left blank for your very welcomed ideas **
------------------------------------------------------------
Final comments;
I would also like to use data compression in some step before data gets
written out to a real tape. It's unclear what the tradeoffs are. I
would expect that losing a bit on a high compression tape would be
a problem, while using a low compression method would be useless, so comments
here are welcome also.
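One possible way to limit the damage from a lost bit, offered only as a sketch and not something the plan above commits to: compress the dump in independent chunks, each with its own checksum, so a bad spot on the tape costs one chunk rather than everything after it. The chunk size, the use of zlib, and the function names are all assumptions.

```python
import zlib


def pack_chunks(data, chunk_size=64 * 1024, level=6):
    """Split data into chunks and compress each one on its own.

    Returns a list of (crc32 of the original chunk, compressed chunk).
    """
    chunks = []
    for start in range(0, len(data), chunk_size):
        piece = data[start:start + chunk_size]
        chunks.append((zlib.crc32(piece), zlib.compress(piece, level)))
    return chunks


def unpack_chunks(chunks):
    """Decompress what can be recovered.

    Returns (list of (index, data) for good chunks, list of bad indexes).
    """
    good, bad = [], []
    for index, (crc, packed) in enumerate(chunks):
        try:
            piece = zlib.decompress(packed)
        except zlib.error:
            bad.append(index)   # damaged chunk; the others are unaffected
            continue
        if zlib.crc32(piece) != crc:
            bad.append(index)   # decompressed, but the contents do not match
            continue
        good.append((index, piece))
    return good, bad
```

Smaller chunks lose less data per error but usually compress somewhat worse, so the chunk size is where the tradeoff would get tuned.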
Thanks in advance,
-- Shawn
Reply paths;
----------------------------------------
Usenet: mit-eddie!shawn, think!ima!haddock!shawnm
Arpanet: Shawn at Mit-Mc, Shawn at Mit-Ai
Internet: shawn at eddie.mit.edu, shawn at borax.lcs.mit.edu
Chaosnet: Shawn at Mit-eecs, Shawn at Mit-eddie
More information about the Comp.unix.wizards
mailing list