Stupid awk question
Mitchell Wyle
wyle at inf.ethz.ch
Thu Oct 12 03:00:11 AEST 1989
In article <DMAUSTIN.89Oct10145918 at vivid.sun.com>
dmaustin at vivid.sun.com (Darren Austin) writes:
>I am trying to split up a large output file into several
>smaller files. The smaller files are named after the value in
>the second field. I tried using the following simple awk script,
>
>(current == $2) {print > current".summary"}
>(current != $2) {close(current".summary");
> current=$2;print > current".summary";}
>
>but it fails with
>
>awk: too many output files 10
Even though everyone will soon have new awk and all these old awk problems
will go away, I think this question deserves to be in the "Frequently
asked questions and answers" periodic postings. Who moderates it? How
should one post to it?
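For readers who already have an awk that supports close(), here is a rough sketch of how the original script might be written. It is not taken from the quoted articles. It assumes the input is first sorted on the second field so that each output file is opened only once; the sample data and the names red and blue are invented for the demo.

```shell
#!/bin/sh
# Work in a scratch directory so the demo files do not clutter anything else.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Invented sample input: the second field names the output file.
printf '%s\n' 'x red 1' 'y blue 2' 'z red 3' |
sort -k2,2 |
awk '
$2 != current {
    # Release the previous file before moving on to the next key.
    if (current != "") close(current ".summary")
    current = $2
}
{ print > (current ".summary") }'

cat red.summary
```

Only one output file is open at any moment, so the limit on simultaneously open files is never reached. The sort matters: with unsorted input, reopening a file with > after close() would truncate what was written earlier.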
* * *
To answer the question, I shall quote verbatim an old article.
>>I am trying to use AWK to split one file into many formatted, smaller files.
>>The problem I am having is that I cannot output to more than 10 files...
>
> Well, it won't help you right now, but the long-term fix is to complain
> to your software supplier and ask them to get rid of the silly limit.
> It's not that hard.
The limits are based on the number of file descriptors that can be open
at one time (usually small). One way that I often get around this is
by writing something like this, which splits up the input on the field
$1:
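sort +0 |
awk '
{
    if (last != $1) {
        # End the previous here-document (none is open before record 1).
        if (NR > 1) print "!XYZZY";
        # Start a here-document that writes to the file named by $1.
        print "cat > " $1 "<<!XYZZY";
        last = $1;
    }
    print;
}
END { if (NR > 0) print "!XYZZY"; }' | /bin/sh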
Tony O'Hagan tonyo at qitfit.qitcs.oz
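To make the intermediate step visible, the sketch below runs the same idea on three made-up input lines, prints the shell script that awk generates, and then feeds it to sh. The sample data and the file names alpha and beta are assumptions for the demo. It uses `sort -k1,1` in place of the older `sort +0` syntax, and the terminator check is NR > 1 so that nothing is emitted before the first here-document.

```shell
#!/bin/sh
# Work in a scratch directory so the demo files do not clutter anything else.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Made-up sample input: the first field names the output file.
script=$(
    printf '%s\n' 'beta 2' 'alpha 1' 'alpha 3' |
    sort -k1,1 |
    awk '
    {
        if (last != $1) {
            if (NR > 1) print "!XYZZY"       # terminate the previous here-document
            print "cat > " $1 " <<!XYZZY"    # start one for the new key
            last = $1
        }
        print
    }
    END { if (NR > 0) print "!XYZZY" }'
)

printf '%s\n' "$script"          # show the generated shell script
printf '%s\n' "$script" | sh     # run it: creates ./alpha and ./beta
```

Only the shell holds an output file open, one at a time, so awk's limit never applies. Because the here-document delimiter is unquoted, the shell will expand $ and backquotes found in the data; writing the delimiter as \!XYZZY on the cat line suppresses that.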
* * *
I use Tony's solution all the time. I have seen it used by at least
two other people (David Goodenough and Amos Shapiro) in shell scripts
posted to the net.
It is very important to put that trailing End_of_Here_Document string
in the END clause of your awk program! Depending on the complexity of
your parse, you might need other cleanup code there as well.
Happy hacking,
-Mitchell F. Wyle
Institut fuer Informationssysteme wyle at inf.ethz.ch
ETH Zentrum / 8092 Zurich, Switzerland +41 1 256 5237
--
If this appears in _IN_MODERATION_ or ClariNet, please let me know.
I am forbidden to tell you that you can reach me at:
...!uunet!mcvax!ethz!wyle or wyle at rascal.ics.utexas.edu