Regular expression question.
Tom Poage
poage at sunny.UUCP
Fri Feb 24 04:06:14 AEST 1989
Is there a reason why I can't find regular expression
routines that support both alternation and explicit
repetition counts? Here's what I mean ...
In some public-domain regexp routines I can use
(string1|string2)
In other routines I can use
(something){3,4}
However, I have never seen routines with the ability to use
these two constructs together, such as
(x|y|(z){4,5})
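For comparison, here is a minimal sketch of the combined construct in an engine that accepts both features. Python's re module is used purely as an illustration (an assumption; the routines discussed in this post are C libraries), and the helper name matches() is hypothetical.

```python
import re

# Alternation combined with a bounded repeat, as in the example above.
# Python's re module is an assumption here; the post concerns C routines.
PATTERN = re.compile(r"(x|y|(z){4,5})")


def matches(text):
    """Return True when the whole string matches the pattern."""
    return PATTERN.fullmatch(text) is not None
```

With this pattern, "x", "y", "zzzz" and "zzzzz" match as whole strings, while "zzz" and "zzzzzz" do not.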
For example, I want to find strings of 9 digits occurring
in a certain pattern, similar to:
875000000-876000000,786992210,>789922119
The current (GNU) regexp routine I have requires the following to
match the above line. The actual expression is a single line; it
has been split here for readability.
^((([<>](=)?)?[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9])|
([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-
[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]))
(,((([<>](=)?)?[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9])|
([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-
[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9])))*$
The first problem is that this regexp overflows the grep/egrep of
SunOS 3.5 (GNU's e?grep, however, handles it just fine). The
second is that it is unwieldy. The third is that I don't
necessarily want to parse the line into fragments and match
each piece separately.
Why can't I do something like this (still split)?
^((([<>](=)?)?[0-9]{9})|([0-9]{9}-[0-9]{9}))(,((([<>](=)?)?[0-9]{9})|
([0-9]{9}-[0-9]{9})))*$
Don't you agree this is easier ":-):-):-)" to read?
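As a rough sketch of the compact form in practice, the pattern can be assembled from named pieces and checked against the sample line. Python's re module is again an assumption for illustration only, and the names NINE, SINGLE, RANGE, ITEM, is_valid() and expand_counts() are all hypothetical. The sketch groups the two alternatives after the comma so that the comma applies to either form. The expand_counts() helper shows one way the long-hand expression could be generated mechanically for a routine that lacks {n} support; it handles only bracket expressions with an exact count.

```python
import re

# Building blocks for one comma-separated item.
NINE = r"[0-9]{9}"                    # exactly nine digits
SINGLE = r"([<>]=?)?" + NINE          # optional <, >, <= or >= prefix
RANGE = NINE + "-" + NINE             # nine digits, hyphen, nine digits
ITEM = "(" + RANGE + "|" + SINGLE + ")"

# One item, then zero or more ",item" repetitions, anchored at both ends.
LINE = re.compile("^" + ITEM + "(," + ITEM + ")*$")


def is_valid(line):
    """Return True when the line is a well-formed list of items."""
    return LINE.match(line) is not None


def expand_counts(pattern):
    """Rewrite each [class]{n} as n copies of [class].

    Limited on purpose: only bracket expressions with an exact count.
    """
    return re.sub(
        r"(\[[^\]]+\])\{([0-9]+)\}",
        lambda m: m.group(1) * int(m.group(2)),
        pattern,
    )


SAMPLE = "875000000-876000000,786992210,>789922119"
LONG_FORM = expand_counts(LINE.pattern)  # no {n} left in this string
```

Here is_valid(SAMPLE) is true, and compiling LONG_FORM gives an expression that accepts the same sample line without relying on interval support.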
Is this only a difference between System V and BSD variants?
Is there a public-domain version of regexp(3) with these
features merged? I await with bated breath. Tom.
--
Tom Poage, UCDMC Clinical Engineering, Sacto., CA
poage at sunny.ucdavis.edu
...!ucbvax!ucdavis!sunny!poage