.SH
4: How the Parser Works
.PP
Yacc turns the specification file into a C program, which
parses the input according to the specification given.
The algorithm used to go from the
specification to the parser is complex, and will not be discussed
here (see
the references for more information).
The parser itself, however, is relatively simple,
and understanding how it works, while
not strictly necessary, will nevertheless make
treatment of error recovery and ambiguities much more
comprehensible.
.PP
The parser produced by Yacc consists
of a finite state machine with a stack.
The parser is also capable of reading and remembering the next
input token (called the
.I lookahead
token).
The
.I "current state"
is always the one on the top of the stack.
The states of the finite state machine are given
small integer labels; initially, the machine is in state 0,
the stack contains only state 0, and no lookahead token has been read.
.PP
The machine has only four actions available to it, called
.I shift ,
.I reduce ,
.I accept ,
and
.I error .
A move of the parser is done as follows:
.IP 1.
Based on its current state, the parser decides
whether it needs a lookahead token to decide
what action should be done; if it needs one, and does
not have one, it calls
.I yylex
to obtain the next token.
.IP 2.
Using the current state, and the lookahead token
if needed, the parser decides on its next action, and
carries it out.
This may result in states being pushed onto the stack, or popped off of
the stack, and in the lookahead token being processed
or left alone.
.PP
The
.I shift
action is the most common action the parser takes.
Whenever a shift action is taken, there is always
a lookahead token.
For example, in state 56 there may be
an action:
.DS
        IF      shift 34
.DE
which says, in state 56, if the lookahead token is IF,
the current state (56) is pushed down on the stack,
and state 34 becomes the current state (on the
top of the stack).
The lookahead token is cleared.
.PP
The
.I reduce
action keeps the stack from growing without
bounds.
Reduce actions are
appropriate when the parser has seen
the right hand side of a grammar rule,
and is prepared to announce that it has seen
an instance of the rule, replacing the right hand side
by the left hand side.
It may be necessary to consult the lookahead token
to decide whether to reduce, but
usually it is not; in fact, the default
action (represented by a ``.'') is often a reduce action.
.PP
Reduce actions are associated with individual grammar rules.
Grammar rules are also given small integer
numbers, leading to some confusion.
The action
.DS
        \fB.\fR       reduce 18
.DE
refers to
.I "grammar rule"
18, while the action
.DS
        IF      shift 34
.DE
refers to
.I state
34.
.PP
Suppose the rule being reduced is
.DS
A  \fB:\fR  x  y  z  ;
.DE
The reduce action depends on the
left hand symbol (A in this case), and the number of
symbols on the right hand side (three in this case).
To reduce, first pop off the top three states
from the stack
(in general, the number of states popped equals the number of symbols on the
right side of the rule).
In effect, these states were the ones
put on the stack while recognizing
.I x ,
.I y ,
and
.I z ,
and no longer serve any useful purpose.
After popping these states, a state is uncovered
which was the state the parser was in before beginning to
process the rule.
Using this uncovered state, and the symbol
on the left side of the rule, perform what is in
effect a shift of A.
A new state is obtained, pushed onto the stack, and parsing continues.
There are significant differences between the processing of
the left hand symbol and an ordinary shift of a token,
however, so this action is called a
.I goto
action.
In particular, the lookahead token is cleared by a shift, and
is not affected by a goto.
In any case, the uncovered state contains an entry such as:
.DS
        A       goto 20
.DE
causing state 20 to be pushed
onto the stack, and become the current state.
.PP
In effect, the reduce action ``turns back the clock'' in the parse,
popping the states off the stack to go back to the
state where the right hand side of the rule was first seen.
The
parser then behaves as if it had seen the left side at that time.
If the right hand side of the rule is empty,
no states are popped off of the stack: the uncovered state
is in fact the current state.
.PP
The reduce action is also important in the treatment of user-supplied
actions and values.
When a rule is reduced, the code supplied with the rule is executed
before the stack is adjusted.
In addition to the stack holding the states, another stack,
running in parallel with it, holds the values returned
from the lexical analyzer and the actions.
When a shift takes place, the external variable
.I yylval
is copied onto the value stack.
After the return from the user code, the reduction is carried out.
When the
.I goto
action is done, the external variable
.I yyval
is copied onto the value stack.
The pseudo-variables $1, $2, etc., refer to the value stack.
.PP
The other two parser actions are conceptually much simpler.
The
.I accept
action indicates that the entire input has been seen and
that it matches the specification.
This action appears only when the lookahead token is the endmarker,
and indicates that the parser has successfully
done its job.
The
.I error
action, on the other hand, represents a place where the parser
can no longer continue parsing according to the specification.
The input tokens it has seen, together with the lookahead token,
cannot be followed by anything that would result
in a legal input.
The parser reports an error, and attempts to recover the situation and
resume parsing: the error recovery (as opposed to the detection of error)
will be covered in Section 7.
.PP
It is time for an example!
Consider the specification
.DS
%token  DING  DONG  DELL
%%
rhyme  :  sound  place  ;
sound  :  DING  DONG  ;
place  :  DELL  ;
.DE
.PP
When Yacc is invoked with the
.B \-v
option, a file called
.I y.output
is produced, with a human-readable description of the parser.
The
.I y.output
file corresponding to the above grammar (with some statistics
stripped off the end) is:
.DS
state 0
        $accept  :  \_rhyme  $end

        DING  shift 3
        .  error

        rhyme  goto 1
        sound  goto 2

state 1
        $accept  :  rhyme\_$end

        $end  accept
        .  error

state 2
        rhyme  :  sound\_place

        DELL  shift 5
        .  error

        place  goto 4

state 3
        sound  :  DING\_DONG

        DONG  shift 6
        .  error

state 4
        rhyme  :  sound  place\_    (1)

        .  reduce  1

state 5
        place  :  DELL\_    (3)

        .  reduce  3

state 6
        sound  :  DING  DONG\_    (2)

        .  reduce  2
.DE
Notice that, in addition to the actions for each state, there is a
description of the parsing rules being processed in each
state.
The \_ character is used to indicate
what has been seen, and what is yet to come, in each rule.
Suppose the input is
.DS
DING  DONG  DELL
.DE
It is instructive to follow the steps of the parser while
processing this input.
.PP
Initially, the current state is state 0.
The parser needs to refer to the input in order to decide
between the actions available in state 0, so
the first token,
.I DING ,
is read, becoming the lookahead token.
The action in state 0 on
.I DING
is ``shift 3'', so state 3 is pushed onto the stack,
and the lookahead token is cleared.
State 3 becomes the current state.
The next token,
.I DONG ,
is read, becoming the lookahead token.
The action in state 3 on the token
.I DONG
is ``shift 6'',
so state 6 is pushed onto the stack, and the lookahead is cleared.
The stack now contains 0, 3, and 6.
In state 6, without even consulting the lookahead,
the parser reduces by rule 2.
.DS
        sound  :  DING  DONG
.DE
This rule has two symbols on the right hand side, so
two states, 6 and 3, are popped off of the stack, uncovering state 0.
Consulting the description of state 0, looking for a goto on
.I sound ,
.DS
        sound  goto 2
.DE
is obtained; thus state 2 is pushed onto the stack,
becoming the current state.
.PP
In state 2, the next token,
.I DELL ,
must be read.
The action is ``shift 5'', so state 5 is pushed onto the stack, which now has
0, 2, and 5 on it, and the lookahead token is cleared.
In state 5, the only action is to reduce by rule 3.
This has one symbol on the right hand side, so one state, 5,
is popped off, and state 2 is uncovered.
The goto in state 2 on
.I place ,
the
left side of rule 3,
is state 4.
Now, the stack contains 0, 2, and 4.
In state 4, the only action is to reduce by rule 1.
There are two symbols on the right, so the top two states are popped off,
uncovering state 0 again.
In state 0, there is a goto on
.I rhyme
causing the parser to enter state 1.
In state 1, the input is read; the endmarker is obtained,
indicated by ``$end'' in the
.I y.output
file.
The action in state 1 when the endmarker is seen is to accept,
successfully ending the parse.
.PP
The reader is urged to consider how the parser works
when confronted with such incorrect strings as
.I "DING DONG DONG" ,
.I "DING DONG" ,
.I "DING DONG DELL DELL" ,
etc.
A few minutes spent with this and other simple examples will
probably be repaid when problems arise in more complicated contexts.
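.PP
The walk through the DING DONG DELL example can also be followed in code.
What follows is only a minimal sketch of the state machine described in this section,
with the seven states of the example hand-coded from the
.I y.output
listing; it is not the code Yacc generates.
All names in it (parse, action_for, goto_tab, default_reduce, the token numbering)
are hypothetical and chosen for illustration, and the value stack is omitted.

```c
#include <stddef.h>

/* Token codes; END plays the role of the endmarker ($end).
   The numbering is an assumption made for this sketch. */
enum token { END = 0, DING, DONG, DELL };

/* Nonterminals, used only as column indices of the goto table. */
enum nonterm { RHYME = 0, SOUND, PLACE };

enum kind { ERR = 0, SHIFT, ACCEPT };
struct action { enum kind kind; int state; };

/* Rules 1..3: number of right hand side symbols and left hand symbol. */
static const int rule_len[4] = { 0, 2, 2, 1 };
static const int rule_lhs[4] = { 0, RHYME, SOUND, PLACE };

/* States 4, 5 and 6 reduce by default, without consulting the lookahead. */
static const int default_reduce[7] = { 0, 0, 0, 0, 1, 3, 2 };

/* goto_tab[state][nonterminal]; -1 marks a missing entry. */
static const int goto_tab[7][3] = {
    {  1,  2, -1 }, { -1, -1, -1 }, { -1, -1,  4 }, { -1, -1, -1 },
    { -1, -1, -1 }, { -1, -1, -1 }, { -1, -1, -1 }
};

/* Actions of states 0..3, which all depend on the lookahead token. */
static struct action action_for(int state, int tok)
{
    struct action a = { ERR, 0 };
    switch (state) {
    case 0: if (tok == DING) { a.kind = SHIFT; a.state = 3; } break;
    case 1: if (tok == END)  { a.kind = ACCEPT; }             break;
    case 2: if (tok == DELL) { a.kind = SHIFT; a.state = 5; } break;
    case 3: if (tok == DONG) { a.kind = SHIFT; a.state = 6; } break;
    }
    return a;
}

/* Returns 1 if the n tokens in input are accepted, 0 on error. */
int parse(const int *input, size_t n)
{
    int stack[16];              /* state stack; ample for this grammar */
    int top = 0;                /* index of the current state */
    size_t pos = 0;
    int have_lookahead = 0, lookahead = END;

    stack[0] = 0;               /* initially only state 0 is on the stack */
    for (;;) {
        int state = stack[top];
        int rule = default_reduce[state];

        if (rule != 0) {
            /* reduce: pop one state per right hand side symbol,
               then take the goto of the uncovered state */
            top -= rule_len[rule];
            int next = goto_tab[stack[top]][rule_lhs[rule]];
            if (next < 0)
                return 0;
            stack[++top] = next;    /* the lookahead is left alone */
            continue;
        }
        if (!have_lookahead) {      /* read a token only when needed */
            lookahead = pos < n ? input[pos++] : END;
            have_lookahead = 1;
        }
        struct action a = action_for(state, lookahead);
        if (a.kind == ACCEPT)
            return 1;
        if (a.kind != SHIFT || top + 1 >= 16)
            return 0;
        stack[++top] = a.state;     /* shift: push the new state ... */
        have_lookahead = 0;         /* ... and clear the lookahead */
    }
}
```

Calling parse on the three tokens DING, DONG, DELL follows the trace given above and returns 1;
the incorrect strings suggested to the reader make it return 0 at the state where no action matches the lookahead.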