Puzzled by A Regexp...
Kris Stephens [Hail Eris!]
krs at uts.amdahl.com
Thu Mar 7 08:58:30 AEST 1991
In article <10469 at ncar.ucar.edu> tres at virga.rap.ucar.edu (Tres Hofmeister) writes:
>
> I've run across a regular expression that I don't quite understand.
>Not that this hasn't happened before, but this seems like it should be
>fairly straightforward...
>
> I'm trying to match entries in /etc/group which have one or more
>members. The following works just fine, matching each of the colon
>delimited fields individually followed by one or more characters:
>
> grep '^.*:.*:.*:..*' /etc/group
This one matches any line that has at least three colons and at least
one character of any kind (another colon included) after the third of
them. The re means:
From start of line
zero or more of any characters
a colon
zero or more of any characters
a colon
zero or more of any characters
a colon
any single character
zero or more of any characters
It'll match good group entries, but also lines like these:
:::::
:::.: ::: --:
::::
a:::a
:::a
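A quick way to check that list is to feed the sample lines to grep on
standard input and count the hits. This is only a sketch: the variable
name "matched" and the extra line 'a:b:c' (two colons only, so it should
be rejected) are my own additions.

```shell
# Count how many sample lines the first pattern accepts.
# 'a:b:c' has only two colons and is there as a line that should fail.
matched=$(printf '%s\n' ':::::' ':::.: ::: --:' '::::' 'a:::a' ':::a' 'a:b:c' |
    grep -c '^.*:.*:.*:..*')
echo "$matched"
```

All five junk lines are counted and only 'a:b:c' is left out.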
> What I don't understand is why the following doesn't work the same
>way:
>
> grep '^.*:..*' /etc/group
This one will find any record that includes a : before the last char in
the line. The re means
From start of line,
zero or more of any characters
a colon
any single character
zero or more of any characters
It matches the following
::
:::a
:b:::
:b
gigo:1123
> It grabs entries with one or more members, true, but also grabs
>entries with no members, e.g. "news:*:6:". I figured that this regexp
>would match the longest possible string at the beginning of a line,
>terminated by a colon, which in the group file should include the first
>two colons, followed by at least one character. It seems to be doing
>something else, given that it will also match a line with no members.
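The only lines it *won't* match are those with no colons or where the
only colon in the line is the last character. What it's looking for
is a line with a colon followed by a character. The leading .* isn't
obliged to run up to any particular colon: it gives back characters
until the rest of the re can match, so in "news:*:6:" it stops at
"news:*", the second colon matches the colon in the re, and "6:"
satisfies the ..* part.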
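To see where the pieces of the re land on a member-less entry, sed can
print what each part consumed. A minimal sketch: the \( \) grouping and
the variable name "split" are mine, added only to expose the two halves
of the match.

```shell
# \1 is what the leading .* consumed, \2 is what ..* consumed after the
# colon that the pattern actually matched.
split=$(echo 'news:*:6:' | sed 's/^\(.*\):\(..*\)$/[\1] [\2]/')
echo "$split"
```

The .* cannot extend to the final colon because nothing would be left
for ..* to match, so it backs off by one field and the "6:" is taken as
the "one or more characters".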
> Any ideas?
Use [^:]* instead of .* for the fields in the first (field-matching)
version, so that no field can swallow a colon:
grep '^[^:]*:[^:]*:[^:]*:..*' /etc/group
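Here is a small check of that version against two made-up entries (the
group names and members are hypothetical, as is the variable "hits"):

```shell
# Hypothetical entries: 'news' has no members, 'staff' has two.
hits=$(printf '%s\n' 'news:*:6:' 'staff:*:10:kris,tres' |
    grep '^[^:]*:[^:]*:[^:]*:..*')
echo "$hits"
```

Only the staff line comes back, because each [^:]* has to stop at the
next colon and nothing follows the third colon in the news line.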
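Even better for the second example is to anchor at the END instead of the
BEGINNING of the data lines. The + operator belongs to extended regular
expressions, so this one needs egrep (grep -E); in plain grep the + is
just a literal plus sign:
grep -E ':[^:]+$' /etc/group
will match any line with at least one non-colon character following the
last colon in the line. Equivalent alternatives that work with plain grep: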
:[^:]\{1,\}$
:[^:][^:]*$
^.*:[^:][^:]*$
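A rough way to convince yourself these forms agree is to run all of them
over the same input and compare. The sample entries and variable names
below are hypothetical; the + form is run with grep -E because + is an
extended-syntax operator.

```shell
# Hypothetical sample data; only 'staff' and 'wheel' have members.
sample='news:*:6:
staff:*:10:kris,tres
wheel:*:0:root
daemon:*:1:'

ere=$(printf '%s\n' "$sample" | grep -E ':[^:]+$')       # extended syntax
bre1=$(printf '%s\n' "$sample" | grep ':[^:]\{1,\}$')     # interval form
bre2=$(printf '%s\n' "$sample" | grep ':[^:][^:]*$')      # one, then zero or more
bre3=$(printf '%s\n' "$sample" | grep '^.*:[^:][^:]*$')   # anchored at both ends
echo "$ere"
```

All four variables should hold the same two lines (staff and wheel).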
Finally, any line not matching the following (egrep syntax again, because
of the +) is either a group with no members or a badly-formed line in the
file
^[^:]+:[^:]*:[0-9]+:[^:]+$
which matches
From start of line
at least one non-colon
a colon
any number of non-colons
a colon
a decimal number
a colon
at least one non-colon
end of line
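Turned around with -v, the same re lists the suspect lines directly. A
sketch only: the sample entries and the variable name "suspects" are
invented for illustration.

```shell
# Hypothetical sample data: one good entry, one empty group, two malformed lines.
sample='staff:*:10:kris,tres
news:*:6:
broken-line-without-colons
bad:*:notanumber:kris'

# -v prints the lines that do NOT match: empty groups and malformed entries.
suspects=$(printf '%s\n' "$sample" | grep -E -v '^[^:]+:[^:]*:[0-9]+:[^:]+$')
echo "$suspects"
```

Against a real system you would drop the sample data and point the grep
at /etc/group instead.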
Note that it won't see other anomalies like a group with too big a gid
(system dependent and we can't check to see if it's 65536, for instance,
if 65535 is the biggest) or usernames that are too long or weird stuff
in the userids field (we could exclude spaces, for instance, by testing
in each case [^: ] instead of [^:] ), but any line *not* found
by the above is either a group with no members or a badly formed line.
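The space-excluding variant mentioned above might look like this. The
two entries are hypothetical, and "strict" and "ok" are just names I
picked for the sketch.

```shell
# Same check with [^: ] in place of [^:], so a space anywhere in a field
# makes the line fail.
strict='^[^: ]+:[^: ]*:[0-9]+:[^: ]+$'
ok=$(printf '%s\n' 'staff:*:10:kris,tres' 'staff:*:10:kris, tres' |
    grep -E -c "$strict")
echo "$ok"
```

Only the first entry is counted; the stray space after the comma in the
second one knocks it out.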
...Kris
--
Kristopher Stephens, | (408-746-6047) | krs at uts.amdahl.com | KC6DFS
Amdahl Corporation | | |
[The opinions expressed above are mine, solely, and do not ]
[necessarily reflect the opinions or policies of Amdahl Corp. ]