Eliminating Duplicate Mail Headers
Tom Christiansen
tchrist at convex.COM
Thu May 2 09:47:39 AEST 1991
From the keyboard of lyndon at cs.athabascau.ca (Lyndon Nerenberg):
:[ Tried mailing this but oss670.uucp was unknown to us ]
right, me too.
:In comp.mail.headers you write:
:
:>I'm not able to fix the mailer myself, but can pass its output
:>through standard filters--awk, sed, etc.--before it goes
:>out the door. My first thought was to pass things through 'uniq',
:>but this would also delete consecutive identical lines in the body (the
:>mailer doesn't distinguish between header and body). The probability
:>of consecutive, identical lines in the body of mail messages seems
:>low, but not low enough to chance this.
:
:You almost answered your own question :-)
:
:Use sed to split the headers and body into separate files. Run the header
:file through sort|uniq, then append the body file. Note that you will
:have to deal with header continuation lines somehow. A short piece of
:C code should handle folding the headers, and unfolding them when you're
:done.
That's a lot of work!!
:Perhaps the easiest way to deal with this would be to write the entire
:filter in C. All you need to do is maintain a linked list of headers
:you have seen. During the scanning phase, if you encounter a header that's
:already on the linked list, ignore it (and any possible continuation
:lines). If it's a new header, start up a second linked list of lines
:containing the header contents. If there are continuation lines in the
:header, simply append them to the linked list for that header. This
:eliminates the need to fold/spindle/mutilate the header continuation
:lines.
:Once you've fallen out of the headers, just copy the message body
:through and you're done!
That's a HELLUVA lotta work!
Here's an awk solution:
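#!/bin/awk -f
# The first blank line ends the headers; everything after it is body.
/^$/ { body = 1 }
{
    if (!body) {
        # Still in the headers: drop a line identical to the previous one.
        if (lastline == $0) next
        lastline = $0
    }
    print
}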
And here's a perl solution:
perl -ne 'print if (/^$/ .. eof) || $lastline ne $_; $lastline = $_'
If you want solutions for non-consecutive or especially multi-line
headers, ask, but I can lay odds they'll be in perl. :-)
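For the non-consecutive, multi-line case, here is one rough sketch, in awk rather than perl and wrapped in a shell function. It is not from the original thread and rests on a few assumptions: the function name dedup_headers is made up for illustration; a "duplicate" is taken to mean a header whose full text, continuation lines included, has already appeared verbatim (so repeated headers with different contents, such as Received:, all survive); and a continuation line is any header line that starts with a space or tab.

```shell
# dedup_headers: hypothetical name, illustration only.
# Reads a message from the named files or stdin and drops repeated headers,
# consecutive or not, along with their continuation lines.
# The body is copied through untouched.
dedup_headers() {
    awk '
    # Print the pending header unless the same text was already printed.
    function emit() {
        if (have && !(hdr in seen)) {
            seen[hdr] = 1
            print hdr
        }
        have = 0
    }
    body             { print; next }                   # in the body: copy as-is
    /^$/             { emit(); body = 1; print; next } # blank line ends headers
    /^[ \t]/ && have { hdr = hdr "\n" $0; next }       # continuation line
                     { emit(); hdr = $0; have = 1 }    # start of a new header
    END              { emit() }                        # message with no body
    ' "$@"
}

# Usage (the file names are placeholders):
#   dedup_headers < message.in > message.out
```

Keying on the whole header text rather than just the field name is a deliberate choice here: it only removes exact repeats, which is what the broken mailer produces, and leaves alone fields that may legitimately occur more than once.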
--tom