destructors work. The example shows a non-terminal named ``nt'' that holds values of type ``void*''. When the rule for an ``nt'' reduces, it sets the value of the non-terminal to space obtained from malloc(). Later, when the nt non-terminal is popped from the stack, the destructor will fire and call free() on this malloced space, thus avoiding a memory leak. (Note that the symbol ``$$'' in the destructor code is replaced by the value of the non-terminal.)</p>

<p>It is important to note that the value of a non-terminal is passed to the destructor whenever the non-terminal is removed from the stack, unless the non-terminal is used in a C-code action. If the non-terminal is used by C-code, then it is assumed that the C-code will take care of destroying it if it should really be destroyed. More commonly, the value is used to build some larger structure and we don't want to destroy it, which is why the destructor is not called in this circumstance.</p>

<p>By appropriate use of destructors, it is possible to build a parser using Lemon that can be used within a long-running program, such as a GUI, that will not leak memory or other resources. To do the same using yacc or bison is much more difficult.</p>

<h4>The <tt>%extra_argument</tt> directive</h4>

<p>The %extra_argument directive instructs Lemon to add a 4th parameter to the parameter list of the Parse() function it generates. Lemon doesn't do anything itself with this extra argument, but it does make the argument available to C-code action routines, destructors, and so forth. For example, if the grammar file contains:</p>

<p><pre>
   %extra_argument { MyStruct *pAbc }
</pre></p>

<p>Then the Parse() function generated will have a 4th parameter of type ``MyStruct*'' and all action routines will have access to a variable named ``pAbc'' that is the value of the 4th parameter in the most recent call to Parse().</p>

<h4>The <tt>%include</tt> directive</h4>

<p>The %include directive specifies C code that is included at the top of the generated parser.
You can include any text you want; the Lemon parser generator copies it blindly. If you have multiple %include directives in your grammar file, the value of the last %include directive overwrites all the others.</p>

<p>The %include directive is very handy for getting some extra #include preprocessor statements at the beginning of the generated parser. For example:</p>

<p><pre>
   %include {#include &lt;unistd.h&gt;}
</pre></p>

<p>This might be needed, for example, if some of the C actions in the grammar call functions that are prototyped in unistd.h.</p>

<h4>The <tt>%left</tt> directive</h4>

<p>The %left directive is used (along with the %right and %nonassoc directives) to declare precedences of terminal symbols. Every terminal symbol whose name appears after a %left directive but before the next period (``.'') is given the same left-associative precedence value. Subsequent %left directives have higher precedence. For example:</p>

<p><pre>
   %left AND.
   %left OR.
   %nonassoc EQ NE GT GE LT LE.
   %left PLUS MINUS.
   %left TIMES DIVIDE MOD.
   %right EXP NOT.
</pre></p>

<p>Note the period that terminates each %left, %right or %nonassoc directive.</p>

<p>LALR(1) grammars can get into a situation where they require a large amount of stack space if you make heavy use of right-associative operators. For this reason, it is recommended that you use %left rather than %right whenever possible.</p>

<h4>The <tt>%name</tt> directive</h4>

<p>By default, the functions generated by Lemon all begin with the five-character string ``Parse''. You can change this string to something different using the %name directive.
For instance:</p>

<p><pre>
   %name Abcde
</pre></p>

<p>Putting this directive in the grammar file will cause Lemon to generate functions named</p>
<ul>
<li> AbcdeAlloc(),</li>
<li> AbcdeFree(),</li>
<li> AbcdeTrace(), and</li>
<li> Abcde().</li>
</ul>
<p>The %name directive allows you to generate two or more different parsers and link them all into the same executable.</p>

<h4>The <tt>%nonassoc</tt> directive</h4>

<p>This directive is used to assign non-associative precedence to one or more terminal symbols. See the section on precedence rules or on the %left directive for additional information.</p>

<h4>The <tt>%parse_accept</tt> directive</h4>

<p>The %parse_accept directive specifies a block of C code that is executed whenever the parser accepts its input string. To ``accept'' an input string means that the parser was able to process all tokens without error.</p>

<p>For example:</p>

<p><pre>
   %parse_accept {
      printf("parsing complete!\n");
   }
</pre></p>

<h4>The <tt>%parse_failure</tt> directive</h4>

<p>The %parse_failure directive specifies a block of C code that is executed whenever the parser fails to complete. This code is not executed until the parser has tried and failed to resolve an input error using its usual error recovery strategy. The routine is only invoked when parsing is unable to continue.</p>

<p><pre>
   %parse_failure {
      fprintf(stderr,"Giving up.  Parser is hopelessly lost...\n");
   }
</pre></p>

<h4>The <tt>%right</tt> directive</h4>

<p>This directive is used to assign right-associative precedence to one or more terminal symbols. See the section on precedence rules or on the %left directive for additional information.</p>

<h4>The <tt>%stack_overflow</tt> directive</h4>

<p>The %stack_overflow directive specifies a block of C code that is executed if the parser's internal stack ever overflows. Typically this just prints an error message. After a stack overflow, the parser will be unable to continue and must be reset.</p>

<p><pre>
   %stack_overflow {
      fprintf(stderr,"Giving up.  Parser stack overflow\n");
   }
</pre></p>

<p>You can help prevent parser stack overflows by avoiding the use of right recursion and right-precedence operators in your grammar. Use left recursion and left-precedence operators instead, to encourage rules to reduce sooner and keep the stack size down. For example, do rules like this:</p>
<pre>
   list ::= list element.      // left-recursion.  Good!
   list ::= .
</pre>
<p>Not like this:</p>
<pre>
   list ::= element list.      // right-recursion.  Bad!
   list ::= .
</pre>

<h4>The <tt>%stack_size</tt> directive</h4>

<p>If stack overflow is a problem and you can't resolve the trouble by using left-recursion, then you might want to increase the size of the parser's stack using this directive. Put a positive integer after the %stack_size directive and Lemon will generate a parser with a stack of the requested size. The default value is 100.</p>

<p><pre>
   %stack_size 2000
</pre></p>

<h4>The <tt>%start_symbol</tt> directive</h4>

<p>By default, the start-symbol of the grammar is the first non-terminal that appears in the grammar file. But you can choose a different start-symbol using the %start_symbol directive.</p>

<p><pre>
   %start_symbol  prog
</pre></p>

<h4>The <tt>%token_destructor</tt> directive</h4>

<p>The %destructor directive assigns a destructor to a non-terminal symbol. (See the description of the %destructor directive above.) This directive does the same thing for all terminal symbols.</p>

<p>Unlike non-terminal symbols, which may each have a different data type for their values, terminals all use the same data type (defined by the %token_type directive) and so they use a common destructor. Other than that, the token destructor works just like the non-terminal destructors.</p>

<h4>The <tt>%token_prefix</tt> directive</h4>

<p>Lemon generates #defines that assign small integer constants to each terminal symbol in the grammar.
If desired, Lemon will add a prefix specified by this directive to each of the #defines it generates. So if the default output of Lemon looked like this:</p>
<pre>
   #define AND              1
   #define MINUS            2
   #define OR               3
   #define PLUS             4
</pre>
<p>You can insert a statement into the grammar like this:</p>
<pre>
   %token_prefix    TOKEN_
</pre>
<p>to cause Lemon to produce these symbols instead:</p>
<pre>
   #define TOKEN_AND        1
   #define TOKEN_MINUS      2
   #define TOKEN_OR         3
   #define TOKEN_PLUS       4
</pre>

<h4>The <tt>%token_type</tt> and <tt>%type</tt> directives</h4>

<p>These directives are used to specify the data types for values on the parser's stack associated with terminal and non-terminal symbols. The values of all terminal symbols must be of the same type. This turns out to be the same data type as the 3rd parameter to the Parse() function generated by Lemon. Typically, you will make the value of a terminal symbol a pointer to some kind of token structure. Like this:</p>

<p><pre>
   %token_type    {Token*}
</pre></p>

<p>If the data type of terminals is not specified, the default is ``int''.</p>

<p>Non-terminal symbols can each have their own data types. Typically the data type of a non-terminal is a pointer to the root of a parse-tree structure that contains all information about that non-terminal. For example:</p>

<p><pre>
   %type   expr  {Expr*}
</pre></p>

<p>Each entry on the parser's stack is actually a union containing instances of all data types for every non-terminal and terminal symbol. Lemon will automatically use the correct element of this union depending on what the corresponding non-terminal or terminal symbol is. But the grammar designer should keep in mind that the size of the union will be the size of its largest element. So if you have a single non-terminal whose data type requires 1K of storage, then your 100 entry parser stack will require 100K of heap space. If you are willing and able to pay that price, fine.
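</p>

<p>One way to keep that cost down, sketched here under stated assumptions rather than as a recommendation from Lemon itself, is to declare a large non-terminal with a pointer type so that each stack entry holds only a pointer. The non-terminal name <tt>blob</tt> and the type <tt>BigNode</tt> are hypothetical, chosen for illustration; <tt>Token</tt> and <tt>Expr</tt> are the types from the examples above. A by-value declaration such as <tt>%type blob {BigNode}</tt> would instead make every stack entry at least as large as <tt>BigNode</tt>.</p>

<p><pre>
   %token_type    {Token*}
   %type   expr   {Expr*}
   %type   blob   {BigNode*}   // hypothetical name and type: the union stays pointer-sized
</pre></p>

<p>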
You just need to know.</p>

<h3>Error Processing</h3>

<p>After extensive experimentation over several years, it has been discovered that the error recovery strategy used by yacc is about as good as it gets. And so that is what Lemon uses.</p>

<p>When a Lemon-generated parser encounters a syntax error, it first invokes the code specified by the %syntax_error directive, if any. It then enters its error recovery strategy. The error recovery strategy is to begin popping the parser's stack until it enters a state where it is permitted to shift a special non-terminal symbol named ``error''. It then shifts this non-terminal and continues parsing. But the %syntax_error routine will not be called again until at least three new tokens have been successfully shifted.</p>

<p>If the parser pops its stack until the stack is empty, and it still is unable to shift the error symbol, then the %parse_failure routine is invoked and the parser resets itself to its start state, ready to begin parsing a new file. This is what will happen at the very first syntax error, of course, if there are no instances of the ``error'' non-terminal in your grammar.</p>

</body>
</html>