LEX
Tom Stockfisch
ix426 at sdcc6.ucsd.EDU
Thu Feb 4 14:58:23 AEST 1988
In article <260 at nyit.UUCP> michael at nyit.UUCP (Michael Gwilliam) writes:
>
>.... When I
>was writing the tokenizer using LEX and I got intrigued by a little
>problem. Is it possible to write a regular expression that will
>transform a /* comment */ into nothing? ....
>So my question is, to all you experienced lex
>users and compiler writers, can this be done? Or do I need to
>use input() and other lex functions?
[Sorry for not emailing; I can't seem to get mail through to Michael.]
I can't believe how hard this task is in regular expressions, when it is
trivial to code by hand. I have found a solution which I think is correct,
but it took several tries (see end of this posting).
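For comparison, here is a rough sketch of what "by hand" could look like in C. The function name strip_comments and the string-to-buffer interface are assumptions made for illustration, not part of the original question. It treats the first "*/" after the opener as the end of the comment and copies an unterminated comment through untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src to dst, leaving out every complete C comment.
 * dst must have room for strlen(src) + 1 bytes.
 * A comment opener with no terminator is copied through as is. */
void strip_comments(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);

    while (*src != '\0') {
        if (src[0] == '/' && src[1] == '*') {
            /* Search for the terminator starting just past the opener,
             * so the opener's own star cannot be reused as part of it. */
            const char *end = strstr(src + 2, "*/");
            if (end != NULL) {
                src = end + 2;      /* skip the whole comment */
                continue;
            }
        }
        *dst++ = *src++;            /* ordinary character: pass it on */
    }
    *dst = '\0';
}
```

A filter would read its input into a buffer, call strip_comments(), and write the result to stdout.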
To convince yourself that a pattern is correct, I think you have to show
two things:
  1. The body between the "/*" and "*/" cannot possibly contain a "*/".
  2. The body can contain any other sequence of characters.
If you come up with your own solution, be sure it works properly on the
following input.
1. /*****//hello world */
2. /* hello /* /* world */
3. /* */ hello /* */
4. /**// /* this input should produce "/ \n" for output */
5. /* */ hello */
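To make these checks mechanical, the inputs can be paired with the output each should produce. The table below is a sketch in C under some assumptions: the leading "1." to "5." are taken as list labels rather than input, every line ends in a newline, and the names comment_cases and output_is_expected are made up for this example. Only the expected output of case 4 is stated in this posting; the others are worked out from the rule that a comment ends at the first "*/" after its opener.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct comment_case {
    const char *input;      /* one line of test input, newline included */
    const char *expected;   /* what should reach stdout */
};

static const struct comment_case comment_cases[] = {
    { "/*****//hello world */\n",  "/hello world */\n" },
    { "/* hello /* /* world */\n", "\n" },
    { "/* */ hello /* */\n",       " hello \n" },
    { "/**// /* this input should produce \"/ \\n\" for output */\n",
                                   "/ \n" },
    { "/* */ hello */\n",          " hello */\n" },
};

#define N_COMMENT_CASES (sizeof comment_cases / sizeof comment_cases[0])

/* Return 1 if `actual` is the expected output for case `index` (0-based). */
int output_is_expected(size_t index, const char *actual)
{
    assert(index < N_COMMENT_CASES && actual != NULL);
    return strcmp(comment_cases[index].expected, actual) == 0;
}
```

Feed each input to the scanner under test, capture what it prints, and pass that to output_is_expected().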
The following lex source should "elide" all legal comments and pass
everything else through to stdout (lex's default rule echoes any input that
no pattern matches). As requested, it does not use input().
--cut----
okslash ([^*/]"/"+)
%%
"/*""/"*([^/]|{okslash})*"*/" ;
--cut----
Compile using:
    lex comment.l; cc lex.yy.c -ll
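One way to check the pattern against the two conditions is to read it as a two-state machine. A "/" is accepted in the body after the opener, after another "/", or after any character other than "*". A "/" that directly follows a "*" in the body can only be the end of the comment. The C sketch below hand-codes that reading; the name match_comment and its return convention are assumptions for illustration, and it is meant as an aid to reasoning rather than a replacement for the lex rule.

```c
#include <assert.h>
#include <stddef.h>

/* Return the length of the comment that starts at s[0],
 * or 0 if s does not begin with a complete comment. */
size_t match_comment(const char *s)
{
    /* SLASH_OK:   a slash here is ordinary body text
     *             (the {okslash} and leading "/"* parts of the pattern).
     * AFTER_STAR: the previous body character was a star,
     *             so a slash here closes the comment. */
    enum { SLASH_OK, AFTER_STAR } state = SLASH_OK;
    size_t i;

    assert(s != NULL);
    if (s[0] != '/' || s[1] != '*')
        return 0;                   /* no opener */

    /* The opener's own star does not count, so start in SLASH_OK. */
    for (i = 2; s[i] != '\0'; i++) {
        if (state == AFTER_STAR && s[i] == '/')
            return i + 1;           /* terminator found */
        state = (s[i] == '*') ? AFTER_STAR : SLASH_OK;
    }
    return 0;                       /* ran out of input: not a comment */
}
```

Since the function returns at the first star-slash pair it sees, the matched body can never contain one (condition 1), and no other character sequence makes it give up before the end of input (condition 2).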
--
|| Tom Stockfisch, UCSD Chemistry tps at chem.ucsd.edu