lex/yacc questions from a novice...
Jeffrey W Percival
jwp at larry.sal.wisc.edu
Wed Aug 23 02:41:14 AEST 1989
I am trying to use lex and yacc to help me read a long, dense,
machine-produced listing written in some crappy "special purpose"
computer language. I have a list of the rules (the grammar?) governing
the format of each line in the listing.
I believe lex and yacc are the right tools, because the set of rules
I have seems to match the spirit of the examples I read in the lex and yacc
papers by Lesk and Schmidt (Lex) and Johnson (yacc). For example:
digit: [0-9]
integer: {digit}+
and so on to the more complicated
command definition: {command introducer} {statement}+ {command terminator}
My first question is how one divides the work between lex and yacc.
Should lex do more than just return characters? My language has all
sorts of keywords that a lexical analyzer could recognize itself,
returning a token for each one.
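As a rough sketch of what "lex recognizes the keywords and returns tokens" could look like, the fragment below matches two of the keywords and an unsigned integer and returns a code for each. The numeric token codes, the catch-all rule, and the small main() driver are assumptions made for illustration; with yacc, the codes would normally come from the y.tab.h header that "yacc -d" writes.

```lex
%{
#include <stdio.h>

/* Hypothetical token codes, defined here only so the sketch stands
   alone; with yacc they would come from y.tab.h. */
#define SMSHDR 257
#define ENDSMS 258
#define U_INT  259
#define EOL    260
%}
DIGIT   [0-9]
%%
"SMSHDR"    { return SMSHDR; }      /* keyword -> its own token */
"ENDSMS"    { return ENDSMS; }
{DIGIT}+    { return U_INT; }       /* unsigned integer */
[ \t]+      ;                       /* skip blanks and tabs */
\n          { return EOL; }
.           { return yytext[0]; }   /* anything else: pass the character through */
%%
int yywrap(void) { return 1; }      /* no further input files */

/* Stand-in for the yacc parser: print each token code and its text. */
int main(void)
{
    int tok;
    while ((tok = yylex()) != 0)
        printf("token %d: %s\n", tok, yytext);
    return 0;
}
```

Running lex on this file and compiling the resulting lex.yy.c should give a filter that reads standard input. In this arrangement yacc would only ever see token codes, never individual characters.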
Along these lines, one problem I am having is that lex gives me the
message "too many definitions" when all I have are a few keywords and
ancillary definitions (the lex file is included below for illustration).
Is lex truly this limited in the number of definitions? Can I increase
this limit? Or am I using lex for too much, and not using yacc for
enough?
SMSHDR "SMSHDR"
ENDSMS "ENDSMS"
CP224 "CP224"
GROUP "GROUP"
PRT "PRT"
RTS "RTS"
SAFING "SAFING"
BEGINDATA "BEGINDATA"
ENDDATA "ENDDATA"
_IF "_IF"
_ELSE "_ELSE"
_ENDIF "_ENDIF"
_MESSAGE "_MESSAGE"
_SET "_SET"
_DELETE "_DELETE"
INCLUDE "INCLUDE"
LETTER [A-Za-z]
DIGIT [0-9]
HEX_DIGIT [0-9A-F]
OCT_DIGIT [0-7]
BIN_DIGIT [0-1]
SPECIAL [_%#@]
STRING ({DIGIT}|{LETTER}|{SPECIAL})+
WORD {LETTER}({DIGIT}|{LETTER}|{SPECIAL})*
OCT_MNEMONIC ("_"{STRING})|({WORD})
LABEL {STRING}":"
LABEL_REF "'"{STRING}"'"
TEXT_STRING "'"[ -~]"'"
HEX_INT '{HEX_DIGIT}+'X
OCT_INT '{OCT_DIGIT}+'O
BIN_INT '{BIN_DIGIT}+'B
U_INT {DIGIT}+
S_INT [+-]?{U_INT}
U_REAL {U_INT}"."{U_INT}
S_REAL [+-]?{U_REAL}
FLOAT ({S_REAL}|{S_INT})([ED]{S_INT})?
YY {U_INT}"Y"
DD {U_INT}"D"
HH {U_INT}"H"
MM {U_INT}"M"
SS ({U_INT}|{U_REAL})"S"
REL_TIME [+-]?(({HH})?({MM})?({SS}))|(({HH})?({MM})({SS})?)|(({HH})({MM})?({SS})?)
UTC_TIME {YY}?{DD}{REL_TIME}
DEL_TIME ({U_INT}C)|({REL_TIME})
ORB_REL_TIME "ORB,"{U_INT}","{WORD}(","[+-]?{REL_TIME})?
ORB_TIME "("{ORB_REL_TIME}")"
MFS_TIME "("({UTC_TIME}|{ORB_REL_TIME})",MFSYNC"(","[+-]?{REL_TIME})?")"
SOI_OFFSET [+-](({HEX_DIGIT}+"%X")|({U_INT})|({OCT_DIGIT}+"%O"))
SOI "'"{WORD}({SOI_OFFSET})?"'"[ND]
EOL "\n"
%%
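One possible way to cut down the number of definitions, offered only as a sketch: match every word with a single rule and look the keywords up in a C table, so that a keyword costs a table row rather than a lex definition. The token codes, the lookup() helper, and the main() driver are hypothetical, and only letter-initial keywords are shown, since names like _IF do not match the WORD pattern and would need their own rule.

```lex
%{
#include <stdio.h>
#include <string.h>

/* Hypothetical token codes; with yacc these would come from y.tab.h. */
#define WORD   257
#define SMSHDR 258
#define ENDSMS 259
#define GROUP  260

/* Keyword table: a new keyword is a new row here,
   not another lex definition. */
static struct { const char *name; int token; } keywords[] = {
    { "SMSHDR", SMSHDR },
    { "ENDSMS", ENDSMS },
    { "GROUP",  GROUP  },
};

/* Return the keyword's token code, or WORD if s is not a keyword. */
static int lookup(const char *s)
{
    size_t i;
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(s, keywords[i].name) == 0)
            return keywords[i].token;
    return WORD;
}
%}
LETTER  [A-Za-z]
DIGIT   [0-9]
SPECIAL [_%#@]
%%
{LETTER}({DIGIT}|{LETTER}|{SPECIAL})*   { return lookup(yytext); }
[ \t\n]+                                ;   /* skip white space */
.                                       { return yytext[0]; }
%%
int yywrap(void) { return 1; }

/* Stand-in for the yacc parser: print each token code and its text. */
int main(void)
{
    int tok;
    while ((tok = yylex()) != 0)
        printf("token %d: %s\n", tok, yytext);
    return 0;
}
```

Whether this actually gets under the limit in a given lex is untested here; it only shows that the sixteen keyword definitions do not have to live in the definitions section.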
--
Jeff Percival (jwp at larry.sal.wisc.edu)