LEX behaviour when given "large" automata.
Chris Torek
chris at mimsy.UUCP
Sun Mar 20 15:30:17 AEST 1988
Several people suggested replacing lex description such as
if return (IF);
then return (THEN);
else return (ELSE);
{id} { symenter(yytext); return (ID); }
with
{id} { /* first try it as a keyword */
k = keyword_index(yytext); if (k >= 0) return (k);
/* not a keyword, must be an identifier */
symenter(yytext); return (ID); }
This may (probably does) help in lex, but it is clearly the wrong way
around in almost all cases. The lexer has to traverse a DFA to decide
that something is an ID. While it is at it it could easily tell
whether it is a keyword instead. Obviously this makes the DFA table
larger, but the result should run faster.
It turns out (as Van Jacobson discovered) that lex uses a long
subroutine---on the order of 200 lines---to do what is essentially
described by the loop
/* follow the DFA */
for (state = begin; state[c].v == c; c = *in++)
state += state[c].n;
(best case) or
/*
* Pick the right start state depending on whether we are at
* the beginning of a line.
*/
state = newline ? hat_begin : begin;
while (state[c].v == c) {
/* follow the DFA */
state += state[c].n;
c = *in++;
/* remember action state (for right context) */
if (state->v) {
a_s = state;
a_c = in;
}
}
in = a_c;
if (a_s->n) {
in -= a_s->n; /* back up over right context */
c = in[-1];
}
newline = in[-2]=='\n'; /* remember whether we are now at ^ */
in[-1] = '\0'; /* kill residual */
(worst case). This pretty much handles everything except YYREJECT.
I do not know whether the LBL `flex' (Fast LEX) handles YYREJECT now.
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris at mimsy.umd.edu Path: uunet!mimsy!chris
--
Send compilers articles to ima!compilers or, in a pinch, to Levine at YALE.EDU
Plausible paths are { ihnp4 | decvax | cbosgd | harvard | yale | bbn}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers. Meta-mail to ima!compilers-request
More information about the Comp.unix.questions
mailing list