what should egrep '|root' /etc/passwd print?
David Canzi
dmcanzi at watdcsu.waterloo.edu
Mon Sep 19 13:55:58 AEST 1988
In article <8209 at alice.UUCP> andrew at alice.UUCP (Andrew Hume) writes:
>it sounds appealing to allow a missing RE to mean the empty string
>but i am unconvinced as to its utility.
If x, y, and z are regular expressions, then xyz matches those strings
which can be formed by concatenating any three strings X, Y, and Z
where x matches X, y matches Y, and z matches Z. The expression 'x|y'
matches any string that is matched by x or y.
So, suppose y=''. Let x='aa' and z='bb'. Then xyz='aabb'. 'aa' is
the only string x matches, 'bb' is the only string z matches, and
'aabb' is the only string xyz matches. The only thing left for y to
match is the null string between 'aa' and 'bb'. Therefore, the null
expression matches the null string.
Let x='' and y='root', so that x|y = '|root'. Then x|y matches the null
string (because x matches it) and the string 'root' (because y matches
it). So the egrep command in the subject line should print out all of
/etc/passwd, since every line contains the null string.
This is intuitively obvious to me, but I tried to prove it because I'm
not sure other people's intuitions are similar to mine.
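The argument above can be checked against a regex engine that accepts an
empty alternative. The following is only a sketch using Python's re module,
which is not egrep and says nothing about how any particular egrep behaves;
the grep() helper and the sample lines are hypothetical stand-ins for egrep
and /etc/passwd.

```python
import re

# Hypothetical stand-ins for lines of /etc/passwd (not real data).
lines = [
    "root:x:0:0:root:/root:/bin/sh",
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin",
    "",
]

def grep(pattern, lines):
    """Return the lines that contain at least one match of pattern."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]

# The empty alternative matches the null string, which every line contains,
# so every line is selected, including the empty one.
empty_or_root = grep(r"|root", lines)

# Without the empty alternative, only lines containing 'root' are selected.
only_root = grep(r"root", lines)

print(len(empty_or_root), len(only_root))
```

Under this reading, '|root' selects every line and 'root' alone selects only
the lines that mention root.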
As for utility, consider the case, which I have actually run into,
where I wanted an expression like 'aa(|bb)cc' to match the strings
'aacc' and 'aabbcc'. In this case, it's clear I want the expression
in parentheses to match the null string. The program I was using
wouldn't let me do this, and I had to use something like 'a(a|abb)cc'
to get what I wanted. If I had had a program generate that expression,
I would have had to add code to detect this special case and rewrite
the regular expression. Yecch.
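To illustrate the point about generated expressions, here is a minimal
sketch, again in Python's re module rather than the program I was using.
The generate_optional() function is a hypothetical generator; it needs no
special case because the engine accepts the empty alternative.

```python
import re

def generate_optional(prefix, middle, suffix):
    """Hypothetical generator: build prefix(|middle)suffix with no special-casing."""
    return f"{re.escape(prefix)}(|{re.escape(middle)}){re.escape(suffix)}"

pattern = generate_optional("aa", "bb", "cc")  # yields 'aa(|bb)cc'
workaround = "a(a|abb)cc"                      # the hand-rewritten form

candidates = ["aacc", "aabbcc", "aabcc", "aabbbbcc", "acc"]

# fullmatch requires the whole string to match, so this tests the
# language of each expression rather than substring containment.
matched = [s for s in candidates if re.fullmatch(pattern, s)]
matched_workaround = [s for s in candidates if re.fullmatch(workaround, s)]

print(matched)
print(matched_workaround)
```

Both expressions accept exactly 'aacc' and 'aabbcc' from the candidates, but
only the first falls out of the generator without rewriting.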
--
David Canzi