Pattern matching with awk
Lou Kates
louk at tslwat.UUCP
Wed Mar 6 12:55:10 AEST 1991
In article <1991Mar04.051048.5864 at convex.com> tchrist at convex.COM (Tom Christiansen) writes:
>From the keyboard of lin at CS.WMICH.EDU (Lite Lin):
>: I'm trying to identify all the email addresses in email messages, i.e.,
>:patterns with the format user at node. Now I can use grep/sed/awk to find
>:those lines containing user at node, but I can't figure out from the manual
>:how or whether I can have access to the matching pattern (it can be
>:anywhere in the line, and it doesn't have to be surrounded by spaces,
>:i.e., it's not necessarily a separate "field" in awk). If there is no
>:way to do that in awk, I guess I'll do it with lex (yytext holds the
>:matching pattern).
>
>Well, I wouldn't try to do it in awk, but that doesn't mean we have to
>jump all the way to a C program!
>
> perl -ne 's/([-.\w]+@[-.\w]+)/print "$1\n"/ge;'
The following awk program looks for expressions of the form
word@word, where a word contains only letters, digits and dots. The
field separator is any run of characters other than letters, digits,
dots and @, so each candidate address becomes a field of its own. You
can change the regular expressions to vary the effect:
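BEGIN {
    # Anything but letters, digits, dots and @ separates fields.
    FS   = "[^.a-zA-Z0-9@]+"
    word = "[.a-zA-Z0-9]+"
    addr = "^" word "@" word "$"
}
# Print every field that is exactly word@word.
{ for (i = 1; i <= NF; i++) if ($i ~ addr) print $i }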
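
The original question also asked whether awk gives access to the
matched text itself. Here is a minimal sketch of one way to do that,
assuming an awk that provides match() together with RSTART and RLENGTH
(nawk, gawk, or any POSIX awk). The shell function name extract_addrs
is made up for illustration. Unlike the field-splitting program above,
it pulls out every substring that looks like word@word, so a@b@c
yields a@b rather than nothing:

```shell
# extract_addrs: hypothetical helper name, not part of any tool.
# Reads text on stdin and prints each word@word substring on its own line.
extract_addrs() {
    awk '
    {
        line = $0
        # match() sets RSTART (start position) and RLENGTH (length)
        # of the leftmost longest match, or returns 0 if there is none.
        while (match(line, /[.a-zA-Z0-9]+@[.a-zA-Z0-9]+/)) {
            print substr(line, RSTART, RLENGTH)
            # Continue scanning after the text just printed.
            line = substr(line, RSTART + RLENGTH)
        }
    }'
}
```

Running extract_addrs with a saved message on standard input should
print one candidate address per line.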
Lou Kates, Teleride Sage Ltd., louk%tslwat at watmath.waterloo.edu