?? tply.tex
字號:
expr : expr '+' expr
| expr '*' expr
| '(' expr ')'
| NUM
;
\end{verbatim}\end{quote}
Without the precedence definitions, this is an ambigious grammar causing
a number of shift/reduce conflicts. The precedence definitions are used
to correctly resolve these conflicts (conflicts resolved using precedence
will not be reported by TP Yacc).
Each precedence definition introduces a new precedence level (lowest
precedence first) and specifies whether the corresponding operators
should be left-, right- or nonassociative (nonassociative operators
cannot be combined at all; example: relational operators in Pascal).
TP Yacc uses precedence information to resolve shift/reduce conflicts as
follows. Precedences are associated with each terminal occuring in a
precedence definition. Furthermore, each grammar rule is given the
precedence of its rightmost terminal (this default choice can be
overwritten using a \verb"%prec" tag; see below). To resolve a shift/reduce
conflict using precedence, both the symbol and the rule involved must
have been assigned precedences. TP Yacc then chooses the parse action
as follows:
\begin{itemize}
\item
If the symbol has higher precedence than the rule: shift.
\item
If the rule has higher precedence than the symbol: reduce.
\item
If symbol and rule have the same precedence, the associativity of the
symbol determines the parse action: if the symbol is left-associative:
reduce; if the symbol is right-associative: shift; if the symbol is
non-associative: error.
\end{itemize}
To give you an idea of how this works, let us consider our ambigious
arithmetic expression grammar (without precedences):
\begin{quote}\begin{verbatim}
%token NUM
%%
expr : expr '+' expr
| expr '*' expr
| '(' expr ')'
| NUM
;
\end{verbatim}\end{quote}
This grammar generates four shift/reduce conflicts. The description
of state 8 reads as follows:
\begin{quote}\begin{verbatim}
state 8:
*** conflicts:
shift 4, reduce 1 on '*'
shift 5, reduce 1 on '+'
expr : expr '+' expr _ (1)
expr : expr _ '+' expr
expr : expr _ '*' expr
'*' shift 4
'+' shift 5
$end reduce 1
')' reduce 1
. error
\end{verbatim}\end{quote}
In this state, we have successfully parsed a \verb"+" expression (rule 1).
When the next symbol is \verb"+" or \verb"*", we have the choice between the
reduction and shifting the symbol. Using the default shift/reduce
disambiguating rule, TP Yacc has resolved these conflicts in favour of shift.
Now let us assume the above precedence definition:
\begin{quote}\begin{verbatim}
%left '+'
%left '*'
\end{verbatim}\end{quote}
which gives \verb"*" higher precedence than \verb"+" and makes both operators
left-associative. The rightmost terminal in rule 1 is \verb"+". Hence, given
these precedence definitions, the first conflict will be resolved in favour
of shift (\verb"*" has higher precedence than \verb"+"), while the second one
is resolved in favour of reduce (\verb"+" is left-associative).
Similar conflicts arise in state 7:
\begin{quote}\begin{verbatim}
state 7:
*** conflicts:
shift 4, reduce 2 on '*'
shift 5, reduce 2 on '+'
expr : expr '*' expr _ (2)
expr : expr _ '+' expr
expr : expr _ '*' expr
'*' shift 4
'+' shift 5
$end reduce 2
')' reduce 2
. error
\end{verbatim}\end{quote}
Here, we have successfully parsed a \verb"*" expression which may be followed
by another \verb"+" or \verb"*" operator. Since \verb"*" is left-associative
and has higher precedence than \verb"+", both conflicts will be resolved in
favour of reduce.
Of course, you can also have different operators on the same precedence
level. For instance, consider the following extended version of the
arithmetic expression grammar:
\begin{quote}\begin{verbatim}
%token NUM
%left '+' '-'
%left '*' '/'
%%
expr : expr '+' expr
| expr '-' expr
| expr '*' expr
| expr '/' expr
| '(' expr ')'
| NUM
;
\end{verbatim}\end{quote}
This puts all ``addition'' operators on the first and all ``multiplication''
operators on the second precedence level. All operators are left-associative;
for instance, \verb"5+3-2" will be parsed as \verb"(5+3)-2".
By default, TP Yacc assigns each rule the precedence of its rightmost
terminal. This is a sensible decision in most cases. Occasionally, it
may be necessary to overwrite this default choice and explicitly assign
a precedence to a rule. This can be done by putting a precedence tag
of the form
\begin{quote}\begin{verbatim}
%prec symbol
\end{verbatim}\end{quote}
at the end of the corresponding rule which gives the rule the precedence
of the specified symbol. For instance, to extend the expression grammar
with a unary minus operator, giving it highest precedence, you may write:
\begin{quote}\begin{verbatim}
%token NUM
%left '+' '-'
%left '*' '/'
%right UMINUS
%%
expr : expr '+' expr
| expr '-' expr
| expr '*' expr
| expr '/' expr
| '-' expr %prec UMINUS
| '(' expr ')'
| NUM
;
\end{verbatim}\end{quote}
Note the use of the \verb"UMINUS" token which is not an actual input symbol
but whose sole purpose it is to give unary minus its proper precedence. If
we omitted the precedence tag, both unary and binary minus would have the
same precedence because they are represented by the same input symbol.
\subsection*{Error Handling}
Syntactic error handling is a difficult area in the design of user-friendly
parsers. Usually, you will not like to have the parser give up upon the
first occurrence of an errorneous input symbol. Instead, the parser should
recover from a syntax error, that is, it should try to find a place in the
input where it can resume the parse.
TP Yacc provides a general mechanism to implement parsers with error
recovery. A special predefined \verb"error" token may be used in grammar rules
to indicate positions where syntax errors might occur. When the parser runs
into an error action (i.e., reads an errorneous input symbol) it prints out
an error message and starts error recovery by popping its stack until it
uncovers a state in which there is a shift action on the \verb"error" token.
If there is no such state, the parser terminates with return value 1,
indicating an unrecoverable syntax error. If there is such a state, the
parser takes the shift on the \verb"error" token (pretending it has seen
an imaginary \verb"error" token in the input), and resumes parsing in a
special ``error mode.''
While in error mode, the parser quietly skips symbols until it can again
perform a legal shift action. To prevent a cascade of error messages, the
parser returns to its normal mode of operation only after it has seen
and shifted three legal input symbols. Any additional error found after
the first shifted symbol restarts error recovery, but no error message
is printed. The TP Yacc library routine \verb"yyerrok" may be used to reset
the parser to its normal mode of operation explicitly.
For a simple example, consider the rule
\begin{quote}\begin{verbatim}
stmt : error ';' { yyerrok; }
\end{verbatim}\end{quote}
and assume a syntax error occurs while a statement (nonterminal \verb"stmt")
is parsed. The parser prints an error message, then pops its stack until it
can shift the token \verb"error" of the error rule. Proceeding in error mode,
it will skip symbols until it finds a semicolon, then reduces by the error
rule. The call to \verb"yyerrok" tells the parser that we have recovered from
the error and that it should proceed with the normal parse. This kind of
``panic mode'' error recovery scheme works well when statements are always
terminated with a semicolon. The parser simply skips the ``bad'' statement
and then resumes the parse.
Implementing a good error recovery scheme can be a difficult task; see
Aho/Sethi/Ullman (1986) for a more comprehensive treatment of this topic.
Schreiner and Friedman have developed a systematic technique to implement
error recovery with Yacc which I found quite useful (I used it myself
to implement error recovery in the TP Yacc parser); see Schreiner/Friedman
(1985).
\subsection*{Yacc Library}
The TP Yacc library (\verb"YaccLib") unit provides some global declarations
used by the parser routine \verb"yyparse", and some variables and utility
routines which may be used to control the actions of the parser and to
implement error recovery. See the file \verb"YACCLIB.PAS" for a description
of these variables and routines.
You can also modify the Yacc library unit (and/or the code template in the
\verb"YYPARSE.COD" file) to customize TP Yacc to your target applications.
\subsection*{Other Features}
TP Yacc supports all additional language elements entitled as ``Old Features
Supported But not Encouraged'' in the UNIX manual, which are provided for
backward compatibility with older versions of (UNIX) Yacc:
\begin{itemize}
\item
literals delimited by double quotes.
\item
multiple-character literals. Note that these are not treated as
character sequences but represent single tokens which are given a
symbolic integer code just like any other token identifier. However,
they will not be declared in the output file, so you have to make sure
yourself that the lexical analyzer returns the correct codes for these
symbols. E.g., you might explicitly assign token numbers by using a
definition like
\begin{quote}\begin{verbatim}
%token ':=' 257
\end{verbatim}\end{quote}
at the beginning of the Yacc grammar.
\item
\verb"\" may be used instead of \verb"%", i.e. \verb"\\" means
\verb"%%", \verb"\left" is the same as \verb"%left", etc.
\item
other synonyms:
\begin{itemize}
\item \verb"%<" for \verb"%left"
\item \verb"%>" for \verb"%right"
\item \verb"%binary" or \verb"%2" for \verb"%nonassoc"
\item \verb"%term" or \verb"%0" for \verb"%token"
\item \verb"%=" for \verb"%prec"
\end{itemize}
\item
actions may also be written as \verb"= { ... }" or
\verb"= single-statement;"
\item
Turbo Pascal declarations (\verb"%{ ... %}") may be put at the
beginning of the rules section. They will be treated as local
declarations of the actions routine.
\end{itemize}
\subsection*{Implementation Restrictions}
As with TP Lex, internal table sizes and the main memory available limit the
complexity of source grammars that TP Yacc can handle. There is currently no
possibility to change internal table sizes (apart from modifying the sources
of TP Yacc itself) or to make use of extended memory. However, the maximum
table sizes provided by TP Yacc are large enough to handle quite complex
grammars (such as the Pascal grammar in the TP Yacc distribution). The
current limits are 600 s (states), 2400 i (LR0 kernel items), 2400 t (shift
and goto transitions) and 1200 r (reductions).
The default stack size of the generated parsers is \verb"yymaxdepth = 1024",
as declared in the TP Yacc library unit. This should be sufficient for any
average application, but you can change the stack size by including a
corresponding declaration in the definitions part of the Yacc grammar
(or change the value in the \verb"YaccLib" unit). Note that right-recursive
grammar rules may increase stack space requirements, so it is a good
idea to use left-recursive rules wherever possible.
\subsection*{Differences from UNIX Yacc}
Major differences between TP Yacc and UNIX Yacc are listed below.
\begin{itemize}
\item
TP Yacc produces output code for Turbo Pascal, rather than for C.
\item
TP Yacc does not support \verb"%union" definitions. Instead, a value
type is declared by specifying the type identifier itself as the tag of
a \verb"%token" or \verb"%type" definition. TP Yacc will automatically
generate an appropriate variant record type (\verb"YYSType") which is
capable of holding values of any of the types used in \verb"%token" and
\verb"%type".
Type checking is very strict. If you use type definitions, then
any symbol referred to in an action must have a type introduced
in a type definition. Either the symbol must have been assigned a
type in the definitions section, or the \verb"$<type-identifier>"
notation must be used. The syntax of the \verb"%type" definition has
been changed slightly to allow definitions of the form
\begin{quote}\begin{verbatim}
%type <type-identifier>
\end{verbatim}\end{quote}
(omitting the nonterminals) which may be used to declare types which
are not assigned to any gra
?? 快捷鍵說明
復(fù)制代碼
Ctrl + C
搜索代碼
Ctrl + F
全屏模式
F11
切換主題
Ctrl + Shift + D
顯示快捷鍵
?
增大字號
Ctrl + =
減小字號
Ctrl + -