Stripping "hard returns" from UNIX mail files
Larry Wall
lwall at jpl-devvax.JPL.NASA.GOV
Tue Oct 30 08:40:31 AEST 1990
In article <657182670.22483 at ontmoh.UUCP> peter at ontmoh.UUCP (Peter Renzland) writes:
: patrick at casbs.Stanford.EDU (Patrick Goebel) asks for a UNIX utility
: to remove "hard returns" from mail messages for subsequent processing
: by MS-DOS wordprocessors.
:
: Unix considers it natural for text to be made up of lines, and all
: programs that do useful things with text assume that such lines are
: within some reasonable limit.
Painting with a broad brush here, aren't you? Both Gnu emacs and Perl
agree that the only "reasonable limit" on line length is the amount of
swap space available on your machine.
: This corresponds to things that naturally
: contain lines (text in books or on your display, or on typewriter, or
: a line printer), and those things, naturally, have limits on the line
: length.
:
: The RETURN key, and its code, is an implementation of the typewriter's
: "carriage" return.
Fair enough. But someday we have to escape the typewriter/punchcard metaphor.
Word processors are just beginning to get us out of this straitjacket.
: Text which is thus made up of lines can easily be formatted in all sorts
: of ways. But, if we format it so that we have (limitless) multiline
: paragraphs and no longer any line separators, some of our programs that
: are so handy with lines of text may break in the face of possibly huge
: paragraphs.
So rewrite the programs so they aren't busted.
: Having said that, you could try something like this little program:
:
: awk '
: NF==0 { if(LINE) { print LINE ; LINE="" } ; print ; next}
: { if(LINE) LINE=LINE " " $0 ; else LINE=LINE $0 }
: END { print LINE }
: ' $*
I think gawk will now handle "infinite" lines, but older awks will blow
up on longer paragraphs. It would be a tad nicer if it threw in an
extra space after lines that end a sentence.
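
A minimal sketch of that tweak, not a definitive fix. The function name join_paragraphs and the variables para and sep are made up for illustration, and treating ".", "!" or "?" (optionally followed by a closing quote or parenthesis) at the end of a line as the end of a sentence is an assumption that will misfire on abbreviations. It also skips the stray blank line that the END rule above prints when the input ends on a blank line, and it is still only as good as your awk's line-length limit.

```shell
# Join the lines of each paragraph into one long line.
# Blank lines separate paragraphs and are passed through unchanged.
join_paragraphs() {
    awk '
    NF == 0 {                          # blank line: paragraph boundary
        if (para != "") { print para; para = "" }
        print
        next
    }
    {
        if (para == "")
            para = $0                  # first line of a new paragraph
        else
            para = para sep $0         # sep was set by the previous line
        # Two spaces after a line that ends a sentence, one otherwise.
        sep = ($0 ~ /[.!?][")]*$/) ? "  " : " "
    }
    END { if (para != "") print para }  # flush the last paragraph
    ' "$@"
}
```

With the function defined, something like "join_paragraphs message.txt > message.out" (both file names hypothetical) should leave one line per paragraph for the word processor to rewrap.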
: I would prefer to use the PC wordprocessor's text import facilities to
: take standard line-oriented text and convert it to its own paragraph
: format.
Some of us don't have such a clever importer. Yeah, I know, rewrite the
programs...
Sigh.
Larry
More information about the Comp.unix.questions mailing list