directory clusters
Johan Vromans
jv at squirrel.mh.nl
Tue Mar 6 00:18:20 AEST 1990
I usually maintain several releases of programs on-line, and find the
following program very useful.
It inspects the contents of two directories, and links files which are
identical. This cuts down on the disk usage without losing
information.
For example: suppose you have two directories called NEW and OLD, with the
following files:
NEW/one OLD/one same name, but different
NEW/two OLD/two same contents
NEW/three OLD/four unique
Then running lndir.pl will remove NEW/two and replace it by a link to
OLD/two. The other files will be untouched.
Multiple directories can be clustered, e.g.:
lndir perl3.0.4 perl3.0.0
lndir perl3.0.6 perl3.0.4
lndir perl3.0.8 perl3.0.6
lndir perl3.0.12 perl3.0.8
etc. Specify the new directory first.
Have fun!
Comments are welcome - send them to jv at mh.nl .
------ start of lndir.pl -- ascii -- complete ------
#!/usr/bin/perl
#
# @(#)lndir.pl 1.4
#
# Compare two directories, and link all files which are identical.
# For a cluster of directories, specify the NEW directory first
#
# It is not recursive (yet).
#
# This program requires Perl version 3.0, patchlevel 8 or later.
# Check args && fetch names.
if ( $#ARGV != $[ + 1
|| (!( -d ($newdir = shift(@ARGV)) && -d ($refdir = shift(@ARGV))))) {
print STDERR "Usage: $0 new-dir ref-dir\n";
exit 1;
}
# Get all files in the new dir. Use readdir since it is faster than
# globbing, but also because we need to include .-files.
opendir (newdir, $newdir) || die "$!: $newdir\nStopped";
@all = readdir (newdir);
closedir (newdir);
$|=1;
for $file ( @all ) {
$ok = 1;
# Get stat info for both files
@stat1 = stat("$newdir/$file");
if ($#stat1 < $[) { #];
printf "$newdir/$file ... $! - skipped\n";
$ok = 0;
}
@stat2 = stat("$refdir/$file");
if ($#stat2 < $[) { #];
printf "$refdir/$file ... $! - skipped\n";
$ok = 0;
}
next unless $ok;
printf "$newdir/$file ... ";
if ( ! -f "$newdir/$file" ) {
printf "not a plain file - skipped\n";
next;
}
if ( ! -f "$refdir/$file" ) {
printf "not in $refdir\n";
next;
}
# Quick check on size, if equal: use cmp
if (($stat1[7] != $stat2[7])
|| system ("cmp -s $newdir/$file $refdir/$file")) {
printf "differ\n";
next;
}
# Already linked? Compare dev/inode numbers
if (($stat1[0] == $stat2[0]) && ($stat1[1] == $stat2[1])) {
printf "identical to $refdir/$file\n";
next;
}
# Okay, let's link
if (unlink ("$newdir/$file") && link ("$refdir/$file", "$newdir/$file")) {
printf "linked to $refdir/$file\n";
next;
}
printf "$!\n";
}
------ end of lndir.pl -- ascii -- complete ------
Johan Vromans jv at mh.nl via internet backbones
Multihouse Automatisering bv uucp: ..!{uunet,hp4nl}!mh.nl!jv
Doesburgweg 7, 2803 PL Gouda, The Netherlands phone/fax: +31 1820 62944/62500
------------------------ "Arms are made for hugging" -------------------------
More information about the Alt.sources
mailing list