An AWK script to check "junk" for newsgroups
Josef Wolf
Sepp at ppcger.ppc.sub.org
Wed Feb 20 06:41:56 AEST 1991
dglo at ADS.COM (Dave Glowacki) writes:
] Since, as a rule, EVERY C or shell program posted must be followed up
] by a PERL script, here's my version of NEWJUNK.
Well. Fine fine. But what about using standard-tools? Which *IX is
_delivered_ with Perl?
Now here is my version of NEWJUNK. It could have been better, but older
versions of gawk have these ugly memory-leak, so you have to sort out
the 'Newsgroups:' lines and pipe it into gawk :-(
The awk-version will most likely be slower than the C-version and the
Perl-Version, but it schould run on most *IX with little modifications.
This version uses 3 config files:
/usr/lib/news/newjunk.active this are the newsgroups, I am interested
/usr/lib/news/newjunk.trash to throw away the entire Newsgroups:-line
/usr/lib/news/newjunk.junk Newsgroups, I don't want
In the config-files you can use regular expressions. Here is my
newjunk.active, for example:
---- snipp ----
# newjunk.active
#
# these are the newsgroups I want to have complete, if they will be
# found in junk
^comp\.sys\..*
^comp\.os\..*
^comp\.mail.*
^dnet\..*
^eunet\..*
^mnet\..*
^ppc\..*
^sub\..*
# I want all.sources.all
.*\.sources.*
# and all.os9.all
.*\.os9\..*
---- snipp ----
Here goes newjunk.awk. Just pipe all '^Newsgroups:' into 'awk -f newjunk.awk'
---- snipp---
BEGIN {
# read in active
FS = ":";
# ^^^ my news-system needs this one
while (getline <"/usr/lib/news/active" > 0)
if (length ($1))
active [activecount++] = $1;
# read in config files
while (getline tmp <"/usr/lib/news/newjunk.active" > 0)
if (length (tmp) && !match (tmp, "^#"))
nactive [nactivecount++] = tmp;
while (getline tmp <"/usr/lib/news/newjunk.trash" > 0)
if (length (tmp) && !match (tmp, "^#"))
trash [trashcount++] = tmp;
while (getline tmp <"/usr/lib/news/newjunk.junk" > 0)
if (length (tmp) && !match (tmp, "^#"))
junk [junkcount++] = tmp;
FS = ",";
# newsgroups are separated with kommas
}
function insert_newsgroup(ng) {
# if newsgroup is alraedy inserted, we can save some time
for (k = 0; k < newcount; k++)
if (ng == newgroups [k])
return;
# skip newsgroup if it is already active
for (k = 0; k < activecount; k++)
if (ng == active [k])
return;
# insert newsgroup
newgroups [newcount++] = ng;
}
// {
# check every newsgroup given in input line
for (j = 1; j <= NF; j++) {
# do we want this newsgroup?
for (i = 0; i < nactivecount; i++) {
if (match ($j, nactive [i])) {
insert_newsgroup($j);
# break;
# don't know why I get some bus-error at this break -- sigh!
# but the script runs without this too (grinn :-)
}
}
# is there any trash-newsgroup?
for (i = 0; i < trashcount; i++)
if (match ($j, trash [i]))
next;
# no trash-groups -> sort out the junk-newsgroups
to_insert_count = 0;
for (i = 0; i < junkcount; i++)
if (!match ($j, junk [i]))
to_insert [to_insert_count++] = $j;
}
# insert them now
for (i = 0; i < to_insert_count; i++)
insert_newsgroup(to_insert [i]);
}
END {
for (i = 0; i < newcount; i++) {
# insert the command for YOUR inews here
cmd = "inews -ad=local '-c=newgroup:" newgroups[i] "' </nil";
system (cmd);
print newgroups [i];
}
}
---- snipp ----
Greetings
Sepp
Disclaimer: I had no time to make much tests of this version of newjunk.awk
If there are bugs, please let me know :-)
| Josef Wolf, Germersheim, Germany | +49 7274 8047 -24 Hours- (call me :-)
| ...!ira.uka.de!smurf!ppcger!sepp | +49 7274 8048 -24 Hours-
| sepp at ppcger.ppc.sub.org | +49 7274 8967 18:00-8:00, Sa + Su 24h
| "is there anybody out there?" | all lines 300/1200/2400 bps 8n1
More information about the Alt.sources
mailing list