?? yacc-docs.txt
字號:
In effect, the reduce action ``turns back the clock'' in the
parse, popping the states off the stack to go back to the state
where the right hand side of the rule was first seen. The parser
then behaves as if it had seen the left side at that time. If
the right hand side of the rule is empty, no states are popped
off of the stack: the uncovered state is in fact the current
state.
The reduce action is also important in the treatment of
user-supplied actions and values. When a rule is reduced, the
code supplied with the rule is executed before the stack is
adjusted. In addition to the stack holding the states, another
stack, running in parallel with it, holds the values returned
from the lexical analyzer and the actions. When a shift takes
place, the external variable yylval is copied onto the value
stack. After the return from the user code, the reduction is
carried out. When the goto action is done, the external variable
yyval is copied onto the value stack. The pseudo-variables $1,
$2, etc., refer to the value stack.
The other two parser actions are conceptually much simpler.
The accept action indicates that the entire input has been seen
and that it matches the specification. This action appears only
when the lookahead token is the endmarker, and indicates that the
parser has successfully done its job. The error action, on the
other hand, represents a place where the parser can no longer
continue parsing according to the specification. The input
Yacc: Yet Another Compiler-Compiler PS1:15-13
tokens it has seen, together with the lookahead token, cannot be
followed by anything that would result in a legal input. The
parser reports an error, and attempts to recover the situation
and resume parsing: the error recovery (as opposed to the detec-
tion of error) will be covered in Section 7.
It is time for an example! Consider the specification
%token DING DONG DELL
%%
rhyme : sound place
;
sound : DING DONG
;
place : DELL
;
When Yacc is invoked with the -v option, a file called
y.output is produced, with a human-readable description of the
parser. The y.output file corresponding to the above grammar
(with some statistics stripped off the end) is:
PS1:15-14 Yacc: Yet Another Compiler-Compiler
state 0
$accept : _rhyme $end
DING shift 3
. error
rhyme goto 1
sound goto 2
state 1
$accept : rhyme_$end
$end accept
. error
state 2
rhyme : sound_place
DELL shift 5
. error
place goto 4
state 3
sound : DING_DONG
DONG shift 6
. error
state 4
rhyme : sound place_ (1)
. reduce 1
state 5
place : DELL_ (3)
. reduce 3
state 6
sound : DING DONG_ (2)
. reduce 2
Notice that, in addition to the actions for each state, there is
a description of the parsing rules being processed in each state.
The _ character is used to indicate what has been seen, and what
is yet to come, in each rule. Suppose the input is
DING DONG DELL
It is instructive to follow the steps of the parser while pro-
cessing this input.
Yacc: Yet Another Compiler-Compiler PS1:15-15
Initially, the current state is state 0. The parser needs
to refer to the input in order to decide between the actions
available in state 0, so the first token, DING, is read, becoming
the lookahead token. The action in state 0 on DING is is ``shift
3'', so state 3 is pushed onto the stack, and the lookahead token
is cleared. State 3 becomes the current state. The next token,
DONG, is read, becoming the lookahead token. The action in state
3 on the token DONG is ``shift 6'', so state 6 is pushed onto the
stack, and the lookahead is cleared. The stack now contains 0,
3, and 6. In state 6, without even consulting the lookahead, the
parser reduces by rule 2.
sound : DING DONG
This rule has two symbols on the right hand side, so two states,
6 and 3, are popped off of the stack, uncovering state 0. Con-
sulting the description of state 0, looking for a goto on sound,
sound goto 2
is obtained; thus state 2 is pushed onto the stack, becoming the
current state.
In state 2, the next token, DELL, must be read. The action
is ``shift 5'', so state 5 is pushed onto the stack, which now
has 0, 2, and 5 on it, and the lookahead token is cleared. In
state 5, the only action is to reduce by rule 3. This has one
symbol on the right hand side, so one state, 5, is popped off,
and state 2 is uncovered. The goto in state 2 on place, the left
side of rule 3, is state 4. Now, the stack contains 0, 2, and 4.
In state 4, the only action is to reduce by rule 1. There are
two symbols on the right, so the top two states are popped off,
uncovering state 0 again. In state 0, there is a goto on rhyme
causing the parser to enter state 1. In state 1, the input is
read; the endmarker is obtained, indicated by ``$end'' in the
y.output file. The action in state 1 when the endmarker is seen
is to accept, successfully ending the parse.
The reader is urged to consider how the parser works when
confronted with such incorrect strings as DING DONG DONG, DING
DONG, DING DONG DELL DELL, etc. A few minutes spend with this
and other simple examples will probably be repaid when problems
arise in more complicated contexts.
5: Ambiguity and Conflicts
A set of grammar rules is ambiguous if there is some input
string that can be structured in two or more different ways. For
example, the grammar rule
expr : expr '-' expr
is a natural way of expressing the fact that one way of forming
an arithmetic expression is to put two other expressions together
PS1:15-16 Yacc: Yet Another Compiler-Compiler
with a minus sign between them. Unfortunately, this grammar rule
does not completely specify the way that all complex inputs
should be structured. For example, if the input is
expr - expr - expr
the rule allows this input to be structured as either
( expr - expr ) - expr
or as
expr - ( expr - expr )
(The first is called left association, the second right associa-
tion).
Yacc detects such ambiguities when it is attempting to build
the parser. It is instructive to consider the problem that con-
fronts the parser when it is given an input such as
expr - expr - expr
When the parser has read the second expr, the input that it has
seen:
expr - expr
matches the right side of the grammar rule above. The parser
could reduce the input by applying this rule; after applying the
rule; the input is reduced to expr(the left side of the rule).
The parser would then read the final part of the input:
- expr
and again reduce. The effect of this is to take the left associ-
ative interpretation.
Alternatively, when the parser has seen
expr - expr
it could defer the immediate application of the rule, and con-
tinue reading the input until it had seen
expr - expr - expr
It could then apply the rule to the rightmost three symbols,
reducing them to expr and leaving
expr - expr
Now the rule can be reduced once more; the effect is to take the
right associative interpretation. Thus, having read
Yacc: Yet Another Compiler-Compiler PS1:15-17
expr - expr
the parser can do two legal things, a shift or a reduction, and
has no way of deciding between them. This is called a shift /
reduce conflict. It may also happen that the parser has a choice
of two legal reductions; this is called a reduce / reduce con-
flict. Note that there are never any ``Shift/shift'' conflicts.
When there are shift/reduce or reduce/reduce conflicts, Yacc
still produces a parser. It does this by selecting one of the
valid steps wherever it has a choice. A rule describing which
choice to make in a given situation is called a disambiguating
rule.
Yacc invokes two disambiguating rules by default:
1. In a shift/reduce conflict, the default is to do the shift.
2. In a reduce/reduce conflict, the default is to reduce by the
earlier grammar rule (in the input sequence).
Rule 1 implies that reductions are deferred whenever there
is a choice, in favor of shifts. Rule 2 gives the user rather
crude control over the behavior of the parser in this situation,
but reduce/reduce conflicts should be avoided whenever possible.
Conflicts may arise because of mistakes in input or logic,
or because the grammar rules, while consistent, require a more
complex parser than Yacc can construct. The use of actions
within rules can also cause conflicts, if the action must be done
before the parser can be sure which rule is being recognized. In
these cases, the application of disambiguating rules is inap-
propriate, and leads to an incorrect parser. For this reason,
Yacc always reports the number of shift/reduce and reduce/reduce
conflicts resolved by Rule 1 and Rule 2.
In general, whenever it is possible to apply disambiguating
rules to produce a correct parser, it is also possible to rewrite
the grammar rules so that the same inputs are read but there are
no conflicts. For this reason, most previous parser generators
have considered conflicts to be fatal errors. Our experience has
suggested that this rewriting is somewhat unnatural, and produces
slower parsers; thus, Yacc will produce parsers even in the pres-
ence of conflicts.
As an example of the power of disambiguating rules, consider
a fragment from a programming language involving an ``if-then-
else'' construction:
stat : IF '(' cond ')' stat
| IF '(' cond ')' stat ELSE stat
;
PS1:15-18 Yacc: Yet Another Compiler-Compiler
In these rules, IF and ELSE are tokens, cond is a nonterminal
symbol describing conditional (logical) expressions, and stat is
a nonterminal symbol describing statements. The first rule will
be called the simple-if rule, and the second the if-else rule.
These two rules form an ambiguous construction, since input
of the form
IF ( C1 ) IF ( C2 ) S1 ELSE S2
?? 快捷鍵說明
復制代碼
Ctrl + C
搜索代碼
Ctrl + F
全屏模式
F11
切換主題
Ctrl + Shift + D
顯示快捷鍵
?
增大字號
Ctrl + =
減小字號
Ctrl + -