sed script to combine blank lines?

Mon Oct 17 05:23:59 AEST 1988

In article <136 at nascom.UUCP> rar at nascom.UUCP (Alan Ramacher) writes:
|In article <192 at vlsi.ll.mit.edu>, young at vlsi.ll.mit.edu (George Young) writes:
|> Is there a 'sed' wizard out there?  I often want to take a big ascii file
|> (like a .c file after cc -E) and collapse each group of 'blank' lines
|> into exactly one blank line.  'Blank' here is any combination of blanks,
|> tabs and maybe ^L's.  It looks from the documentation that sed should do this
|> quite neatly, using the multiple line pattern space commands with imbedded
|> newlines, but I sure can't figure out how.  I'd prefer the resulting blank
|> line to be just a newline.
|
|sed is not powerful enough for the job, but a simple awk script will
|work. If you have difficulties writing it, let me know and I will
|supply one. Good luck.

I already mailed George a solution, but I couldn't leave this one alone...
Sed is most certainly powerful enough, as I'll show in a minute; in fact,
I think sed is the tool to prefer for such a typical text-processing job,
and its speed is no small part of the reason.

And here's my sed solution. Note that tab and formfeed are written here
as ^I and ^L so your pager isn't fooled; in a real script you should of
course type the actual control characters.

(using /bin/sh as the command interpreter:)

sed -n -e '
/^[ ^I^L]*$/{
    s/^.*$//p
    : again
    n
    s/^[ ^I^L]*$//
    t again
}
p' your_file

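Explanation: -n suppresses sed's automatic printing, so only the explicit
p commands produce output. Whenever you read a line containing only blank
characters (i.e. one satisfying the first pattern), print just one empty
line. Then discard any blank lines that follow (the 'again' loop: n fetches
the next line, and t branches back as long as the s command succeeds).
When you're through with the 'first pattern subroutine 8-)', print the
non-blank line that's now in the pattern space. Simple enough, huh?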
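
For anyone who wants to try this without typing control characters by
hand, here is a minimal sketch of the same script with the tab and
formfeed generated by printf. The function name squeeze_blank and the
variables TAB and FF are my own inventions for illustration, and the sample
input is made up. The script is double-quoted so the shell can substitute
the variables, which is why each regex $ is written as \$.

```shell
#!/bin/sh
# Sketch only: squeeze_blank, TAB and FF are hypothetical names.

TAB=$(printf '\t')    # a real tab character
FF=$(printf '\f')     # a real formfeed character

squeeze_blank() {
    # Double quotes let the shell expand $TAB and $FF inside the script;
    # \$ keeps the end-of-line anchors literal for sed.
    sed -n -e "
/^[ $TAB$FF]*\$/{
    s/^.*\$//p
    :again
    n
    s/^[ $TAB$FF]*\$//
    t again
}
p" "$@"
}

# Made-up sample: 'a', three blank-ish lines, 'b', two empty lines, 'c'.
printf 'a\n\n \t\n\f\nb\n\n\nc\n' | squeeze_blank
```

With a sed that follows the POSIX wording for t (branch if a substitution
has been made since the most recent input line was read), the n command
clears the flag set by the first s, and the sample should come out as a,
one empty line, b, one empty line, c.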

                                            Leo.

P.S. I don't doubt it can be done with awk (it could even be programmed
in the shell). I do doubt, however, that it will be nearly as fast as the
sed solution.
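
For comparison, a rough sketch of what the shell-only version might look
like. This is an illustration under my own assumptions, not something
from the thread: the function name squeeze_blank_sh and the variables are
hypothetical, it reads standard input only, and it relies on the POSIX
"read -r" option. It makes no claim about speed.

```shell
#!/bin/sh
# Sketch only: squeeze_blank_sh and its variables are hypothetical names.

squeeze_blank_sh() {
    TAB=$(printf '\t')
    FF=$(printf '\f')
    blank=" $TAB$FF"      # the characters that count as 'blank'
    in_blank_run=no       # yes while we are inside a group of blank lines

    # IFS= and -r keep each line exactly as read; the || test also
    # catches a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            *[!$blank]*)
                # At least one non-blank character: print the line as is.
                printf '%s\n' "$line"
                in_blank_run=no
                ;;
            *)
                # Empty or all-blank: emit one empty line per group.
                if [ "$in_blank_run" = no ]; then
                    echo
                    in_blank_run=yes
                fi
                ;;
        esac
    done
}

# Same made-up sample as one might feed to the sed script.
printf 'a\n\n \t\n\f\nb\n\n\nc\n' | squeeze_blank_sh
```

The loop handles one line per iteration in the shell itself, which is the
kind of overhead the speed remark above is about.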