Non Destructive Version of rm

The Grand Master asg at sage.cc.purdue.edu
Tue May 7 02:58:26 AEST 1991


In article <1991May6.072447.21943 at casbah.acns.nwu.edu> navarra at casbah.acns.nwu.edu (John 'tms' Navarra) writes:
}In article <JIK.91May6001507 at pit-manager.mit.edu> jik at athena.mit.edu (Jonathan I. Kamens) writes:
}>
[First a brief history]
}>  John Navarra suggests a non-destructive version of 'rm' that either
}>moves the deleted file into a directory such as
}>/var/preserve/username, which is periodically reaped by the system, or
}>uses a directory $HOME/tmp and does a similar thing.
}>
}>  He points out two drawbacks with the approach of putting the deleted
}>file in the same directory as before it was deleted.  First of all,
}>this requires that the entire directory tree be searched in order to
}>reap deleted files, and this is slower than just having to search one
}>directory.  Second, the files show up when the "-a" or "A" flag to ls
}>is used to list the files in a directory.
}>
}>  A design similar to his was considered when we set about designing
}>the non-destructive rm currently in use (as "delete") at Project Athena:
}>1. In a distributed computing environment, it is not practical to
}>   assume that a world-writeable directory such as /var/preserve will
}>   exist on all workstations, and be accessible identically from all
}>   workstations (i.e. if I delete a file on one workstation, I must be
}>   able to undelete it on any other workstation; one of the tenets of
}>   Project Athena's services is that, as much as possible, they must
}>   not differ when a user moves from one workstation to another).

Explain something to me, Jon - first you say that /var/preserve will not
exist on all workstations, then you say you want a non-differing
environment on all workstations. If so, /var/preserve SHOULD
exist on all workstations if it exists on any. Maybe you should make
sure it does.
     
}>   Furthermore, the "delete" program cannot run setuid in order to
}>   have access to the directory, both because setuid programs are a
}>   bad idea in general, and because setuid has problems in remote
}>   filesystem environments (such as Athena's).  Using $HOME/tmp
}>   alleviates this problem, but there are others....

Doesn't need to run suid. Try this:
$ ls -ld /var/preserve
drwxrwxrwt  preserve preserve    /var/preserve
$ ls -l /var/preserve
drwx------  navarra  navarra     /var/preserve/navarra
drwx------  jik      jik         /var/preserve/jik

hmm, doesn't look like you need anything suid for that!
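Setting that up takes three commands and no suid bit anywhere. A sketch (using /tmp/preserve.demo as a stand-in for /var/preserve so it runs without root; the chown to a "preserve" user is omitted for the same reason):

```shell
# Sketch: a sticky, world-writable parent (exactly like /tmp) plus
# private per-user subdirectories.  /tmp/preserve.demo stands in for
# /var/preserve so this runs without root; the chown to a "preserve"
# user is likewise omitted.
PRESERVE=${PRESERVE:-/tmp/preserve.demo}
me=${USER:-$(whoami)}

mkdir -p "$PRESERVE"
chmod 1777 "$PRESERVE"      # world-writable + sticky: anyone may create
                            # entries, only the owner may remove his own

# Each user's delete wrapper creates (and therefore owns) a private subdir:
mkdir -p "$PRESERVE/$me"
chmod 700 "$PRESERVE/$me"
```

The sticky bit is what makes the suid binary unnecessary: users can create their own subdirectories but cannot remove each other's.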
}
}     	The fact that among Athena's 'tenets' is that of similarity from
} workstation to workstation is both good and bad in my opinion. True, it
} is reasonable to expect that Unix will behave the same on similar workstations
} but one of the fundamental benefits of Unix is that the user gets to create
} his own environment. Thus, we can argue the advantages and disadvantages of
} using an undelete utility but you seem to be of the opinion that non-
} standard changes are not beneficial and I argue that most users don't use
} a large number of different workstations and that we shouldn't reject a 
} better method just because it isn't standard.

 
It is bad in no way at all. It is reasonable for me to expect that my
personal environment, and the shared system environment, will be the
same on different workstations. And many users at a university site
use several different workstations (I do). I like to know that I can
do things the same way no matter where I am when I log in.

}>2. (This is a big one.) We wanted to insure that the interface for
}>   delete would be as close as possible to that of rm, including
}>   recursive deletion and other stuff like that.  Furthermore, we
}>   wanted to insure that undelete's interface would be close to
}>   delete's and as functional.  If I do "delete -r" on a directory
}>   tree, then "undelete -r" on that same filename should restore it,
}>   as it was, in its original location.

There is no large problem with this either. Info could be added to
the file, or a small record book could be kept. And /userf/jik could
be converted to $HOME in the process to avoid problems with different
mount points.
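The "small record book" could be one log line per deletion, with $HOME stored as a literal token so the entry survives a change of mount point. A sketch (the preserve() function and the .record file name are invented for illustration):

```shell
# Sketch of the record-book idea; preserve() and .record are made-up names.
PRESERVE=${PRESERVE:-$HOME/tmp}
LOG=$PRESERVE/.record

preserve() {
    # absolute path of the victim
    abs=$(cd "$(dirname "$1")" && pwd)/$(basename "$1")
    # store $HOME as a literal token, so a different mount point later
    # (/mit/jik vs /mnt) makes no difference to undelete
    case "$abs" in
        "$HOME"/*) rec='$HOME'${abs#"$HOME"} ;;
        *)         rec=$abs ;;
    esac
    uniq=$$.$(date +%s)       # good enough for a sketch, not for real life
    mkdir -p "$PRESERVE"
    mv "$abs" "$PRESERVE/$uniq" && printf '%s\t%s\n' "$uniq" "$rec" >> "$LOG"
}
```

An undelete would then read the log, expand the $HOME token against the current $HOME, and mv the file back - mount-point independent either way.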
}>
}>   Navarra's scheme cannot do that -- his script stores no information
}>   about where files lived originally, so users must undelete files by
}>   hand.  If he were to attempt to modify it to store such
}>   information, he would have to either (a) copy entire directory
}>   trees to other locations in order to store their directory tree
What about $HOME/tmp???? - Then you would have only to mv it.
}>   state, or (b) munge the filenames in the deleted file directory in
}>   order to indicate their original location, and search for
}>   appropriate patterns in filenames when undeleting, or (c) keep a
}>   record file in the deleted file directory of where all the files
}>   came from.
Again - these last two are no problem at all.
} 
}    Ahh, we can improve that. I can write a program called undelete that
}    will look at the filename argument and by default undelete it to $HOME
}    but can also include a second argument -- a directory -- to move the
}    undeleted material. I am pretty sure I (or some better programmer
}    than I) could get it to move more than one file at a time or even be
}    able to do something like: undelete *.c $HOME/src and move all files
}    in /var/preserve/username with .c extensions to your src dir.
}    And if you don't have an src dir -- it will make one for you. Now this
}    if done right, shouldn't take much longer than removing a directory 
}    structure. So rm *.c on a dir should be only a tiny bit faster than 
}    undelete *.c $HOME/src. I think the wait is worth it though -- esp
}    if you consider the consequences of looking through a tape backup or gee
}    a total loss of your files!  
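What Navarra describes could be roughed out in a few lines of sh (the undelete function and the preserve path are hypothetical; note the pattern must be quoted so the shell doesn't expand it in the current directory):

```shell
# Hypothetical sketch of Navarra's undelete.
# Usage: undelete 'pattern' [destdir]
undelete() {
    pat=$1
    dest=${2:-$HOME}
    mkdir -p "$dest"   # "if you don't have an src dir -- it will make one"
    for f in "${PRESERVE:-/var/preserve/$USER}"/$pat; do
        [ -e "$f" ] && mv "$f" "$dest/"
    done
}
```

So `undelete '*.c' $HOME/src` moves every preserved .c file into $HOME/src, creating the directory if needed.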

This is not what Jon wants though. He does not want the user to have to
remember where in the directory tree the file was deleted from.
	However, what Jon fails to point out is that one must remember
where they deleted a file from with his method too. Say for example I do
the following.
$ cd $HOME/src/zsh2.00/man
$ delete zsh.1
 Now later, when I want to retrieve zsh.1 - I MUST CHANGE DIRECTORIES
to $HOME/src/zsh2.00/man. I STILL HAVE TO REMEMBER WHAT DIRECTORY I 
DELETED THE FILE FROM!!!! So you gain NOTHING by keeping the file in 
the directory it was deleted from. Or does your undelete program also
search the entire damn directory structure of the system?

}	As far as rm -r and undelete -r go, perhaps the best way to handle
}    this is when the -r option is called, the whole dir in which you are 
}    removing files is just moved to /preserve. And then an undelete -r dir
}    dir2, where dir2 is a destination dir, would restore all those files.
}    However, you would run into problems if /preserve is not mounted on the
}    same tree as the dir you wanted

Again, that is why you should use $HOME/tmp.

}>3. If all of the deleted files are kept in one directory, the
}>   directory gets very large.  This makes searching it slower, and
}>   wastes space (since the directory will not shrink when the files
}>   are reaped from it or undeleted).

This is much better than letting EVERY DAMN DIRECTORY ON THE SYSTEM
GET LARGER THAN IT NEEDS TO BE!!

Say I do this
$ ls -las
14055 -rw-------   1 wines    14334432 May  6 11:31 file12.dat
21433 -rw-------   1 wines    21860172 May  6 09:09 file14.dat
$ rm file*.dat
$ cp ~/new_data/file*.dat .
[ note at this point, my directory will probably grow to a bigger
size since there is now a full 70 Meg in one directory as opposed
to the 35 Meg that should be there using John Navarra's method]
[work deleted]
$ rm file*.dat
(hmm, I want that older file12 back - BUT I CANNOT GET IT!)
}
}   You get a two day grace period -- then they are GONE! This is still faster
} than searching through the current directory (in many cases) looking for .# files
} to undelete. 

You are correct sir.
}>
}>4. My home directory is mounted automatically under /mit/jik.  But
}>   someone else may choose to mount it on /mnt, or I may choose to do
}>   so.  The undeletion process must be independent of mount point, and
}>   therefore storing original paths of filenames when deleting them
}>   will fail if a different mount point is later used.  Using the
}>   filesystem hierarchy itself is the only way to insure mount-point
}>   independent operation of the system.
Well most of us try not to go mounting filesystems all over the place.
Who would be mounting your home dir on /mnt?? AND WHY???
}>
}     if that is the case -- fine -- you got me there. Do it from crontab
} and remove them every few days. I just think it is a waste to infest many
} directories with *~ and # and .# files when 99% of the time when someone
} does rm filename -- THEY WANT IT REMOVED AND NEVER WANT TO SEE IT AGAIN!
} SO now when I do an ls -las -- guess what! There they are again! Well

John, how about trying (you use bash, right?) ;-)
bash$ ls() {
> command ls "$@" | grep -v '\.#'
> }

} you tell me "John, don't do an ls -las"   -- well how bout having
} to wait longer on various ls's because my directory size is bigger now. 
This point is still valid however, because there will be overhead
associated with piping billions of files starting with .# through grep -v
(as well as the billions of files NOT starting with .# that must be piped
through).
}>
}>6. If I delete a file in our source tree, why should the deleted
}>   version take up space in my home directory, rather than in the
}>   source tree?  Furthermore, if the source tree is on a different
}>   filesystem, the file can't simply be rename()d to put it into my
}>   deleted file directory, it has to be copied.  That's slow.  Again,
}>   using the filesystem hierarchy avoids these problems, since
}>   rename() within a directory always works (although I believe
}>   renaming a non-empty directory might fail on some systems, they
}>   deserve to have their vendors shot :-).

Is this system source code? If so, I really don't think you should be 
deleting it with your own account. But if that is what you wish, how about
a test for whether you are in your own directory. If yes, it moves the
deleted file to $HOME/tmp; if not, it moves it to ./tmp (or ./delete, or
./wastebasket or whatever)
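That test is a few lines of sh. A sketch (safe_rm and both wastebasket names are invented):

```shell
# Sketch: choose the wastebasket by asking "am I under my own home tree?"
safe_rm() {
    case "$PWD" in
        "$HOME"|"$HOME"/*) dest=$HOME/tmp ;;   # my own tree -> $HOME/tmp
        *)                 dest=./.delete ;;   # shared tree -> local dir
    esac
    mkdir -p "$dest" && mv "$@" "$dest/"
}
```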
}>
}>7. Similarly, if I delete a file in a project source tree that many
}>   people work on, then other people should be able to undelete the
}>   file if necessary.  If it's been put into my home directory, in a
}>   temporary location which presumably is not world-readable, they
}>   can't.  They probably don't even know who deleted it.

Shouldn't need to be world readable (that is assuming that to have
permission to delete source you have to be in a special group - or
can just anyone on your system delete source?)
}
}    I admit you have pointed out some flaws. Some of which can be corrected,
} others you just have to live with. I have made a few suggestions to improve
} the program. In the end though, I think the one /preserve directory is
} much better. But here is another suggestion which you might like:
}
}>Jonathan Kamens			              USnail:


Well Jon, I have a better solution for you - ready?
rm:
# Safe rm script
echo "rm $*" | at now + 2 days

That seems to be what you want. 

Look - there is no perfect method for doing this. But the best way seems
to me to be the following
1) move files in the $HOME tree to $HOME/tmp
2) Totally delete files in /tmp
3) copy personally owned files from anywhere other than $HOME or /tmp
   to $HOME/tmp (with a -r if necessary). Do this in the background.
   Then remove them of course  (cp -r $dir $HOME/tmp ; rm -r $dir) &
4) If a non-personally owned file is deleted, place it in ./.delete,
   and place a notification in the file as to who deleted it and when. Then
   spawn an at job to delete the file in 2 days, and the notification in
   whatever number of days you wish.
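Read literally, points 1-4 might come out something like this sketch (nsrm and every path in it are invented for illustration, and the two-day at jobs are left out):

```shell
# Sketch of rules 1-4 above; "nsrm" and all paths are made up, and the
# two-day "at" reaper jobs from rule 4 are omitted.
nsrm() {
    for f in "$@"; do
        # absolute path of the victim
        abs=$(cd "$(dirname "$f")" && pwd)/$(basename "$f")
        case "$abs" in
            "$HOME"/*)                  # rule 1: my tree -> $HOME/tmp
                mkdir -p "$HOME/tmp" && mv "$abs" "$HOME/tmp/" ;;
            /tmp/*)                     # rule 2: files in /tmp really die
                rm -rf "$abs" ;;
            *)
                owner=$(ls -ld "$abs" | awk '{print $3}')
                if [ "$owner" = "$(whoami)" ]; then
                    # rule 3: mine, but on another tree -- copy in background
                    mkdir -p "$HOME/tmp"
                    ( cp -r "$abs" "$HOME/tmp/" && rm -rf "$abs" ) &
                else
                    # rule 4: not mine -- tombstone it in ./.delete
                    d=$(dirname "$abs")
                    mkdir -p "$d/.delete" && mv "$abs" "$d/.delete/"
                    printf 'File: %s\nDeleted at: %s\nDeleted by: %s\n' \
                        "$(basename "$abs")" "$(date)" "$(whoami)" > "$abs"
                fi ;;
        esac
    done
}
```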
 an example of 4:
jik> ls -las
drwxrwxr-x    source   source    1024  .
-rwxrwxr-x    source   source    5935 fun.c
jik> rm fun.c
jik> ls -las
drwxrwxr-x    source   source    1024  .
drwxrwxr-x    source   source    1024  .delete
-rwxrwxr-x    source   source      69  fun.c
jik> cat fun.c
File: fun.c
Deleted at: Mon May  6 12:41:31 EDT 1991
Deleted by: jik

Another possibility for 4:
I assume that the source tree is all one filesystem no? If so then
have files removed in the source tree moved to /src/.delete. Have a
notification then placed in fun.c and spawn an at job to delete it, or
place the notification in fun.c_delete and have the src tree searched
for *_delete files (or whatever you wanna call them).
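That search could be the usual cron-driven find one-liner. A sketch (the reap_deleted name, the /src default, and the two-day age are all invented):

```shell
# Hypothetical nightly reaper for the *_delete notification scheme;
# the /src default and the two-day age are made-up values.
reap_deleted() {
    tree=${1:-/src}
    [ -d "$tree" ] || return 0
    find "$tree" -name '*_delete' -mtime +2 -exec rm -f {} \;
}
```

Run from crontab each night, it removes only notifications older than two days and leaves everything else in the tree alone.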

}From the Lab of the MaD ScIenTiST:
}      
}navarra at casbah.acns.nwu.edu

Have fun. 
Oh and by the way - I think doing this with a shell script is a complete
waste of resources. You could easily make mods to the actual code to
rm to do this, or use the PUCC entombing library and not even have to
change the code to rm (just have to link to the aforementioned PUCC entombing
library when compiling rm).
culater
			Bruce
			  Varney

---------
                                   ###             ##
Courtesy of Bruce Varney           ###               #
aka -> The Grand Master                               #
asg at sage.cc.purdue.edu             ###    #####       #
PUCC                               ###                #
;-)                                 #                #
;'>                                #               ##


