<p>Yacc and bison allow terminal symbols either to have alphanumeric names or to be individual characters enclosed in single quotes, like this: ')' or '$'. Lemon does not allow this alternative form for terminal symbols. With Lemon, all symbols, terminals and nonterminals, must have alphanumeric names.</p>
<h3>Grammar Rules</h3>
<p>The main component of a Lemon grammar file is a sequence of grammar rules. Each grammar rule consists of a nonterminal symbol followed by the special symbol ``::='' and then a list of terminals and/or nonterminals. The rule is terminated by a period. The list of terminals and nonterminals on the right-hand side of the rule can be empty. Rules can occur in any order, except that the left-hand side of the first rule is assumed to be the start symbol for the grammar (unless specified otherwise using the <tt>%start_symbol</tt> directive described below). A typical sequence of grammar rules might look something like this:
<pre>
  expr ::= expr PLUS expr.
  expr ::= expr TIMES expr.
  expr ::= LPAREN expr RPAREN.
  expr ::= VALUE.
</pre></p>
<p>There is one non-terminal in this example, ``expr'', and five terminal symbols or tokens: ``PLUS'', ``TIMES'', ``LPAREN'', ``RPAREN'' and ``VALUE''.</p>
<p>Like yacc and bison, Lemon allows the grammar to specify a block of C code that will be executed whenever a grammar rule is reduced by the parser. In Lemon, this action is specified by putting the C code (contained within curly braces <tt>{...}</tt>) immediately after the period that closes the rule. For example:
<pre>
  expr ::= expr PLUS expr.
  { printf("Doing an addition...\n"); }</pre></p>
<p>In order to be useful, grammar actions must normally be linked to their associated grammar rules. In yacc and bison, this is accomplished by embedding a ``$$'' in the action to stand for the value of the left-hand side of the rule and symbols ``$1'', ``$2'', and so forth to stand for the value of the terminal or nonterminal at position 1, 2 and so forth on the right-hand side of the rule. This idea is very powerful, but it is also very error-prone. The single most common source of errors in a yacc or bison grammar is miscounting the number of symbols on the right-hand side of a grammar rule and saying ``$7'' when you really mean ``$8''.</p>
<p>Lemon avoids the need to count grammar symbols by assigning symbolic names to each symbol in a grammar rule and then using those symbolic names in the action. In yacc or bison, one would write this:
<pre>
  expr : expr PLUS expr  { $$ = $1 + $3; };
</pre>
But in Lemon, the same rule becomes the following:
<pre>
  expr(A) ::= expr(B) PLUS expr(C).  { A = B+C; }
</pre>
In the Lemon rule, any symbol in parentheses after a grammar rule symbol becomes a placeholder for that symbol in the grammar rule. This placeholder can then be used in the associated C action to stand for the value of that symbol.</p>
<p>The Lemon notation for linking a grammar rule with its reduce action is superior to yacc/bison on several counts. First, as mentioned above, the Lemon method avoids the need to count grammar symbols. Secondly, if a terminal or nonterminal in a Lemon grammar rule includes a linking symbol in parentheses but that linking symbol is not actually used in the reduce action, then an error message is generated. For example, the rule
<pre>
  expr(A) ::= expr(B) PLUS expr(C).
  { A = B; }</pre>
will generate an error because the linking symbol ``C'' is used in the grammar rule but not in the reduce action.</p>
<p>The Lemon notation for linking grammar rules to reduce actions also facilitates the use of destructors for reclaiming memory allocated for the values of terminals and nonterminals on the right-hand side of a rule.</p>
<h3>Precedence Rules</h3>
<p>Lemon resolves parsing ambiguities in exactly the same way as yacc and bison. A shift-reduce conflict is resolved in favor of the shift, and a reduce-reduce conflict is resolved by reducing whichever rule comes first in the grammar file.</p>
<p>Just as in yacc and bison, Lemon allows a measure of control over the resolution of parsing conflicts using precedence rules. A precedence value can be assigned to any terminal symbol using the %left, %right or %nonassoc directives. Terminal symbols mentioned in earlier directives have a lower precedence than terminal symbols mentioned in later directives. For example:</p>
<p><pre>
  %left AND.
  %left OR.
  %nonassoc EQ NE GT GE LT LE.
  %left PLUS MINUS.
  %left TIMES DIVIDE MOD.
  %right EXP NOT.
</pre></p>
<p>In the preceding sequence of directives, the AND operator is defined to have the lowest precedence. The OR operator is one precedence level higher. And so forth. Hence, the grammar would attempt to group the ambiguous expression
<pre>
  a AND b OR c
</pre>
like this
<pre>
  a AND (b OR c).
</pre>
The associativity (left, right or nonassoc) is used to determine the grouping when the precedence is the same.
AND is left-associative in our example, so
<pre>
  a AND b AND c
</pre>
is parsed like this
<pre>
  (a AND b) AND c.
</pre>
The EXP operator is right-associative, though, so
<pre>
  a EXP b EXP c
</pre>
is parsed like this
<pre>
  a EXP (b EXP c).
</pre>
The nonassoc precedence is used for non-associative operators. So
<pre>
  a EQ b EQ c
</pre>
is an error.</p>
<p>The precedence of terminals is transferred to rules as follows: The precedence of a grammar rule is equal to the precedence of the left-most terminal symbol in the rule for which a precedence is defined. This is normally what you want, but in those cases where you want the precedence of a grammar rule to be something different, you can specify an alternative precedence symbol by putting the symbol in square brackets after the period at the end of the rule and before any C-code. For example:</p>
<p><pre>
  expr ::= MINUS expr.  [NOT]
</pre></p>
<p>This rule has a precedence equal to that of the NOT symbol, not the MINUS symbol as would have been the case by default.</p>
<p>With the knowledge of how precedence is assigned to terminal symbols and individual grammar rules, we can now explain precisely how parsing conflicts are resolved in Lemon. Shift-reduce conflicts are resolved as follows:
<ul>
<li> If either the token to be shifted or the rule to be reduced lacks precedence information, then resolve in favor of the shift, but report a parsing conflict.
<li> If the precedence of the token to be shifted is greater than the precedence of the rule to reduce, then resolve in favor of the shift. No parsing conflict is reported.
<li> If the precedence of the token to be shifted is less than the precedence of the rule to reduce, then resolve in favor of the reduce action. No parsing conflict is reported.
<li> If the precedences are the same and the shift token is right-associative, then resolve in favor of the shift. No parsing conflict is reported.
<li> If the precedences are the same and the shift token is left-associative, then resolve in favor of the reduce.
No parsing conflict is reported.
<li> Otherwise, resolve the conflict by doing the shift and report the parsing conflict.
</ul>
Reduce-reduce conflicts are resolved this way:
<ul>
<li> If either reduce rule lacks precedence information, then resolve in favor of the rule that appears first in the grammar and report a parsing conflict.
<li> If both rules have precedence and the precedences differ, then resolve the dispute in favor of the rule with the higher precedence and do not report a conflict.
<li> Otherwise, resolve the conflict by reducing by the rule that appears first in the grammar and report a parsing conflict.
</ul>
<h3>Special Directives</h3>
<p>The input grammar to Lemon consists of grammar rules and special directives. We've described all the grammar rules, so now we'll talk about the special directives.</p>
<p>Directives in Lemon can occur in any order. You can put them before the grammar rules, or after the grammar rules, or in the midst of the grammar rules. It doesn't matter. The relative order of directives used to assign precedence to terminals is important, but other than that, the order of directives in Lemon is arbitrary.</p>
<p>Lemon supports the following special directives:
<ul>
<li><tt>%code</tt>
<li><tt>%default_destructor</tt>
<li><tt>%default_type</tt>
<li><tt>%destructor</tt>
<li><tt>%extra_argument</tt>
<li><tt>%include</tt>
<li><tt>%left</tt>
<li><tt>%name</tt>
<li><tt>%nonassoc</tt>
<li><tt>%parse_accept</tt>
<li><tt>%parse_failure</tt>
<li><tt>%right</tt>
<li><tt>%stack_overflow</tt>
<li><tt>%stack_size</tt>
<li><tt>%start_symbol</tt>
<li><tt>%syntax_error</tt>
<li><tt>%token_prefix</tt>
<li><tt>%token_type</tt>
<li><tt>%type</tt>
</ul>
Each of these directives will be described separately in the following sections:</p>
<h4>The <tt>%code</tt> directive</h4>
<p>The %code directive is used to specify additional C/C++ code that is added to the end of the main output file.
This is similar to the %include directive except that %include is inserted at the beginning of the main output file.</p>
<p>%code is typically used to include some action routines or perhaps a tokenizer as part of the output file.</p>
<h4>The <tt>%default_destructor</tt> directive</h4>
<p>The %default_destructor directive specifies a destructor to use for non-terminals that do not have their own destructor specified by a separate %destructor directive. See the documentation on the %destructor directive below for additional information.</p>
<p>In some grammars, many different non-terminal symbols have the same datatype and hence the same destructor. This directive is a convenient way to specify the same destructor for all those non-terminals using a single statement.</p>
<h4>The <tt>%default_type</tt> directive</h4>
<p>The %default_type directive specifies the datatype of non-terminal symbols that do not have their own datatype defined using a separate %type directive. See the documentation on %type below for additional information.</p>
<h4>The <tt>%destructor</tt> directive</h4>
<p>The %destructor directive is used to specify a destructor for a non-terminal symbol. (See also the %token_destructor directive which is used to specify a destructor for terminal symbols.)</p>
<p>A non-terminal's destructor is called to dispose of the non-terminal's value whenever the non-terminal is popped from the stack. This includes all of the following circumstances:
<ul>
<li> When a rule reduces and the value of a non-terminal on the right-hand side is not linked to C code.
<li> When the stack is popped during error processing.
<li> When the ParseFree() function runs.
</ul>
The destructor can do whatever it wants with the value of the non-terminal, but it is designed to deallocate memory or other resources held by that non-terminal.</p>
<p>Consider an example:
<pre>
  %type nt {void*}
  %destructor nt { free($$); }
  nt(A) ::= ID NUM.  { A = malloc( 100 ); }
</pre>
This example is a bit contrived but it serves to illustrate how