Perl version of from (Was: Re: from.sed (v1.2))
Johan Vromans
jv at mh.nl
Fri Dec 22 16:11:43 AEST 1989
In article <1989Dec20.222732.5633 at trigraph.uucp> john at trigraph.uucp (John Chew) writes:
Here's a new version of from.sed, my sed script that does the job
of from(1) better and faster. It now truncates long subjects,
correctly handles messages without subjects and From lines with %
or @foo: routing.
Yes, I tried writing this in Perl. I'm not an expert Perl programmer,
but I couldn't get it to run faster than about 70% slower than sed.
I've been using a perl version of 'from' for a long time, so I trow it
in. Features:
- shortens the date, so there's more room for subject
- shortens long subjects
- uses "From: " headers if possible
- provide "<none>" subject
- automatic determination of system mailbox
- maybe more
- output sample:
Nov 29 00:14 "jv " Re: your mail through the list got here
Nov 28 21:21 "David Dyck " your mail through the list got here
Nov 29 08:28 "Mark H. Colbu" Re: output compatibility
It runs about as fast as the sed version. Typical times for a large
mailbox (46585 lines) real/user/sys 50/16/8 for sed, 50/22/7 for perl.
------ begin of from -- ascii -- complete ------
#!/usr/bin/perl
# This program requires perl version 3.0, patchlevel 4 or higher
# Usage "from MAILBOX..."
# Don't forget: perl is a Practical Extract and Report Language!
format =
@<<<<<<<<<<< "@<<<<<<<<<<<<" ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<..
$date, $from, $subj
.
if ( $#ARGV < 0 ) {
if ( ! ($user = getlogin)) {
@a = getpwuid($<);
$user = $a[0];
}
if ( -r "/usr/mail/$user" ) {
@ARGV = ("/usr/mail/$user");
}
elsif ( -r "/usr/spool/mail" ) {
@ARGV = ("/usr/spool/mail/$user");
}
else {
printf STDERR "No mail for $user.\n";
exit 1;
}
}
# read through input file(s)
while ( $line = <> ) {
chop ($line);
# scan until "From_" header found
next unless $line =~ /^From\s+(\S+)\s+.*(\w{3}\s+\d+\s+\d+:\d+)/;
$from = $1;
$date = $2;
if ( $date eq "" || $from eq "" ) {
print STDERR "Possible garbage: $line\n";
next;
}
# get user name from uucp path
$from = $1 if $from =~ /.*!(.+)/;
# now, scan for Subject or empty line
$subj = "";
while ( $line = <> ) {
chop ($line);
if ( $line =~ /^$/ || $line =~ /^From / ) {
# force fall-though
$subj = "<none>" unless $subj;
}
else {
$subj = $1 if $line =~ /^Subject\s*:\s*(.*)/i;
if ( $line =~ /^From\s*:\s*/ ) {
$line = $';
if ( $line =~ /\((.+)\)/i ) { $from = $1; }
elsif ( $line =~ /^\s*(.+)\s*<.+>/i ) { $from = $1; }
elsif ( $line =~ /^<.+>\s*(.+)/i ) { $from = $1; }
}
}
# do we have enough info?
if ( $from && $subj ) {
write;
last;
}
}
}
------ end of from -- ascii -- complete ------
Have fun,
Johan
--
Johan Vromans jv at mh.nl via internet backbones
Multihouse Automatisering bv uucp: ..!{uunet,hp4nl}!mh.nl!jv
Doesburgweg 7, 2803 PL Gouda, The Netherlands phone/fax: +31 1820 62944/62500
------------------------ "Arms are made for hugging" -------------------------
More information about the Alt.sources
mailing list