two (or more) lex's/yacc's in one executable
Martin Weitzel
martin at mwtech.UUCP
Tue Dec 11 02:49:46 AEST 1990
In article <14674 at smoke.brl.mil> gwyn at smoke.brl.mil (Doug Gwyn) writes:
>In article <1990Dec6.200944.13037 at cs.columbia.edu>, leland at cs writes:
>- I've tried this kludge: create a header file that re-#define's all the
>- names 'yyfoo' in lex/yacc set #1 to be named, say, set1yyfoo, and all
>- those in set #2 to be named set2yyfoo. This has worked for me in the
>- past, but won't in this particular instance because the generated
>- code includes calls to yyless() and yywrap(), which are in the LEX
>- library (-ll), the contents of which I cannot rename. So that doesn't
>- work.
>
>But it almost does -- Since "lex" produces C source, you can #define
>set1yyless yyless, etc. before the lex output to be compiled, thereby
>turning these selected reference back into calls to the shared library
>functions. (I assume the lex library does not maintain internal state.)
Unfortunately things are more complicated. Here is an excerpt from
`nm /usr/lib/libl.a' (UNIX Sys V):
----------------------------------------------------------------------
Symbols from /usr/lib/libl.a[reject.o]:
Name Value Class Type Size Line Section
reject.c | | file | | | |
yyreject | 0|extern| int( )| 270| |.text
yyracc | 272|extern| int( )| 154| |.text
yyinput | 0|extern| | | |
yyleng | 0|extern| | | |
yytext | 0|extern| | | |
yylsp | 0|extern| | | |
yyolsp | 0|extern| | | |
yyfnd | 0|extern| | | |
yyunput | 0|extern| | | |
yylstate | 0|extern| | | |
yyprevious | 0|extern| | | |
yyoutput | 0|extern| | | |
yyextra | 0|extern| | | |
yyback | 0|extern| | | |
Symbols from /usr/lib/libl.a[yyless.o]:
Name Value Class Type Size Line Section
yyless.c | | file | | | |
yyless | 0|extern| int( )| 107| |.text
yyleng | 0|extern| | | |
yytext | 0|extern| | | |
yyunput | 0|extern| | | |
yyprevious | 0|extern| | | |
Symbols from /usr/lib/libl.a[yywrap.o]:
Name Value Class Type Size Line Section
yywrap.c | | file | | | |
yywrap | 0|extern| int( )| 16| |.text
----------------------------------------------------------------------
The problem is not some internal state of these functions, but that they
expect a number of external `yyfoo'-symbols, and there is no way to make
them access the `right' ones without rewriting the functions.
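To make the failure concrete, here is a minimal sketch of the renaming
kludge, squeezed into one file so that it compiles on its own. The two
"generated" scanners are hand-written stand-ins, not real lex output, and
the prefixes set1/set2 are simply the ones from the original question. In
real use each part would be a separate `lex.yy.c', compiled with its own
header full of #define's.

```c
/* Stand-in for lex output #1, compiled with the set1 renames in force. */
#define yytext set1yytext
#define yyleng set1yyleng
#define yylex  set1yylex

char yytext[16] = "alpha";              /* really set1yytext */
int  yyleng = 5;                        /* really set1yyleng */
int  yylex(void) { return yytext[0]; }  /* really set1yylex() */

#undef yytext
#undef yyleng
#undef yylex

/* Stand-in for lex output #2, compiled with the set2 renames in force. */
#define yytext set2yytext
#define yyleng set2yyleng
#define yylex  set2yylex

char yytext[16] = "beta";               /* really set2yytext */
int  yyleng = 4;                        /* really set2yyleng */
int  yylex(void) { return yytext[0]; }  /* really set2yylex() */

#undef yytext
#undef yyleng
#undef yylex
```

After preprocessing, the linker only ever sees set1yytext, set2yytext and
so on. The objects in libl.a were compiled without any of these #define's,
so they still ask for plain `yytext', `yyleng', `yyunput' and `yyprevious',
exactly as the nm listing shows.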
So, how hard would it be to rewrite them?
The trivial case is `yywrap'. I hope AT&T doesn't sue me because of reverse
engineering :-), but this function is a one-liner.
    int yywrap() { return 1; }
The two other functions (`yyless' and `yyreject') may have complicated
interactions with a lot of globals, so the best solution is to avoid
them and do manually what is required. This is simple in the case of
`yyless', since it is usually used to push parts of `yytext' back onto
the input stream. This can also be done with the `unput()' macro in a
loop. (The library version of `yyless' does this via the `yyunput()'
function, but that function simply calls `unput()', which may have been
redefined - have a look into `lex.yy.c' to understand how things work
together.) In addition, the "original" `yyless' adjusts `yytext' and
`yyleng' accordingly. The part that still worries me is the reference to
`yyprevious' within `yyless'. To be sure, you should probably disassemble
the library version of `yyless' - it's not that large.
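A minimal sketch of such a hand-made replacement might look like this.
It is not the library code. The name my_yyless, the pushback array and
the simplified unput() are stand-ins invented here so that the fragment
compiles alone; in a real scanner you would use the (renamed) `yytext'
and `yyleng' and the `unput()' macro that `lex.yy.c' already provides.

```c
#include <assert.h>

/* Stand-ins so the sketch compiles alone; a real scanner gets these
 * from lex.yy.c (possibly under renamed identifiers). */
static char yytext[64];
static int  yyleng;
static char pushback[64];   /* hypothetical pushback stack */
static int  npushed;
#define unput(c) (pushback[npushed++] = (c))

/* Hypothetical replacement for yyless(n): keep the first n characters
 * of yytext and push the remaining ones back onto the input. */
static void my_yyless(int n)
{
    assert(n >= 0 && n <= yyleng);

    /* Push back the last character first, so that the characters are
     * later re-read in their original order. */
    while (yyleng > n)
        unput(yytext[--yyleng]);

    yytext[yyleng] = '\0';  /* yytext now holds only the kept prefix */

    /* yyprevious is deliberately left alone here; what the library
     * version does with it is exactly the open question above. */
}
```

With `yytext' holding "foobar", calling my_yyless(3) leaves "foo" in
`yytext', sets `yyleng' to 3 and pushes back 'r', 'a', 'b' in that order.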
`yyreject' is best avoided completely, because it plays with a lot
of external symbols (the poster of the original question is lucky here, but
others may take this as a hint to use REJECT - which in turn calls
yyreject() - only as a last resort).
BTW: Another option is to have a common lexer for both sets of input
symbols and use start conditions in lex to select the appropriate ones.
It's a pity that start conditions are insufficiently explained in the
common documentation of lex (if they are mentioned at all).
--
Martin Weitzel, email: martin at mwtech.UUCP, voice: 49-(0)6151-6 56 83