Non-Destructive Version of rm
Jonathan I. Kamens
jik at athena.mit.edu
Mon May 6 14:15:12 AEST 1991
John Navarra suggests a non-destructive version of 'rm' that either
moves the deleted file into a directory such as
/var/preserve/username, which is periodically reaped by the system,
and from which the user can retrieve accidentally deleted files, or
uses a directory $HOME/tmp and does a similar thing.
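The scheme he describes might look something like the following sketch (this is not Navarra's actual script; the paths and function name are illustrative, with a scratch directory standing in for /var/preserve/username):

```shell
# Illustrative sketch of the trash-directory scheme: "rm" becomes a move
# into a per-user preserve directory, which a nightly job would reap.
# A scratch directory stands in for /var/preserve/$USER here.
PRESERVE="${TMPDIR:-/tmp}/preserve-demo"
rm -rf "$PRESERVE" && mkdir -p "$PRESERVE"

trashrm() {
    # Move each argument into the preserve directory instead of unlinking.
    # Note the flattening: only the basename survives, so the original
    # path is lost and name collisions clobber earlier deletions.
    for f in "$@"; do
        mv "$f" "$PRESERVE/$(basename "$f")"
    done
}

cd "$(mktemp -d)"
echo data > victim
trashrm victim      # victim is gone from here, recoverable from $PRESERVE
```

The flattening noted in the comment is the root of the undeletion problems discussed below.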
He points out two drawbacks to the approach of putting the deleted
file in the same directory it was in before it was deleted. First of
all, this requires that the entire directory tree be searched in order
to reap deleted files, which is slower than searching just one
directory. Second, the files show up when the "-a" or "-A" flag to ls
is used to list the files in a directory.
A design similar to his was considered when we set about designing
the non-destructive rm currently in use (as "delete") at Project
Athena and available in the comp.sources.misc archives. There were
several reasons why we chose the approach of leaving files in the same
directory, rather than Navarra's approach. They include:
1. In a distributed computing environment, it is not practical to
assume that a world-writeable directory such as /var/preserve will
exist on all workstations, and be accessible identically from all
workstations (i.e. if I delete a file on one workstation, I must be
able to undelete it on any other workstation; one of the tenets of
Project Athena's services is that, as much as possible, they must
not differ when a user moves from one workstation to another).
Furthermore, the "delete" program cannot run setuid in order to
have access to the directory, both because setuid programs are a
bad idea in general, and because setuid has problems in remote
filesystem environments (such as Athena's). Using $HOME/tmp
alleviates this problem, but there are others....
2. (This is a big one.) We wanted to ensure that the interface for
delete would be as close as possible to that of rm, including
recursive deletion and other stuff like that. Furthermore, we
wanted to ensure that undelete's interface would be close to
delete's and as functional. If I do "delete -r" on a directory
tree, then "undelete -r" on that same filename should restore it,
as it was, in its original location.
Navarra's scheme cannot do that -- his script stores no information
about where files lived originally, so users must undelete files by
hand. If he were to attempt to modify it to store such
information, he would have to either (a) copy entire directory
trees to other locations in order to store their directory tree
state, or (b) munge the filenames in the deleted file directory in
order to indicate their original location, and search for
appropriate patterns in filenames when undeleting, or (c) keep a
record file in the deleted file directory of where all the files
came from.
Each of these approaches has problems. (a) is slow, and can be
unreliable. (b) might break in the case of funny filenames that
confuse the parser in undelete, and undelete is slow because it has
to do pattern matching on every filename when doing recursive
undeletes, rather than just opening and reading directories. (c)
introduces all kinds of locking problems -- what if two processes
try to delete files at the same time?
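By contrast, a minimal sketch of the in-place approach shows why recursive undeletion falls out for free: deletion is just a rename() within the same directory, so the original location is implicit in the deleted file's name. (The ".#" prefix here is an assumption drawn from the crontab convention mentioned in point 5 below; the real Athena "delete" may differ in details.)

```shell
# Minimal sketch of in-place deletion: rename within the same directory,
# prefixing the name so reapers and "ls" without -a/-A skip it.
delete() {
    for f in "$@"; do
        mv "$f" "$(dirname "$f")/.#$(basename "$f")"
    done
}

# Undeletion is the inverse rename; no record file or pattern matching
# against stored paths is needed, because the path never changed.
undelete() {
    for f in "$@"; do
        mv "$(dirname "$f")/.#$(basename "$f")" "$f"
    done
}

demo=$(mktemp -d)
cd "$demo"
mkdir -p proj/src
echo code > proj/src/main.c
delete proj/src/main.c        # becomes proj/src/.#main.c, in place
undelete proj/src/main.c      # restored exactly where it was
```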
3. If all of the deleted files are kept in one directory, the
directory gets very large. This makes searching it slower, and
wastes space (since the directory will not shrink when the files
are reaped from it or undeleted).
4. My home directory is mounted automatically under /mit/jik, but
someone else may choose to mount it on /mnt, or I may choose to do
so. The undeletion process must be independent of mount point, and
therefore storing original paths of filenames when deleting them
will fail if a different mount point is later used. Using the
filesystem hierarchy itself is the only way to ensure mount-point
independent operation of the system.
5. It is not expensive to scan the entire tree for deleted files to
reap, since most systems already run such scans every night,
looking for core files, *~ files, etc. In fact, many Unix systems
come bundled with a crontab that searches for # and .# files every
night by default.
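A reap pass of the kind described is a one-line find job, shown here against a scratch tree. A real crontab entry would start at / (or the filesystems of interest) and add an age test such as -mtime +7 so recently deleted files survive a grace period; both are simplified for the demonstration.

```shell
# Build a scratch tree with two "deleted" files and one live file.
root="${TMPDIR:-/tmp}/reap-demo"
rm -rf "$root" && mkdir -p "$root/sub"
touch "$root/.#stale" "$root/sub/.#stale" "$root/keep"

# Reap every file whose name marks it as deleted (".#" prefix), however
# deep it sits in the tree, without touching anything else.
find "$root" -name '.#*' -exec rm -f {} +
```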
6. If I delete a file in our source tree, why should the deleted
version take up space in my home directory, rather than in the
source tree? Furthermore, if the source tree is on a different
filesystem, the file can't simply be rename()d to put it into my
deleted file directory, it has to be copied. That's slow. Again,
using the filesystem hierarchy avoids these problems, since
rename() within a directory always works (although I believe
renaming a non-empty directory might fail on some systems; such
systems deserve to have their vendors shot :-).
7. Similarly, if I delete a file in a project source tree that many
people work on, then other people should be able to undelete the
file if necessary. If it's been put into my home directory, in a
temporary location which presumably is not world-readable, they
can't. They probably don't even know who deleted it.
Jonathan Kamens                          USnail:
MIT Project Athena                       11 Ashford Terrace
jik at Athena.MIT.EDU                    Allston, MA 02134
Office: 617-253-8085                     Home: 617-782-0710