<HTML
><HEAD
><TITLE
>Using Happy</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.74b"><LINK
REL="HOME"
TITLE="Happy User Guide"
HREF="happy.html"><LINK
REL="PREVIOUS"
TITLE="Obtaining Happy"
HREF="sec-obtaining.html"><LINK
REL="NEXT"
TITLE="Parsing sequences"
HREF="sec-sequences.html"></HEAD
><BODY
CLASS="CHAPTER"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="3"
ALIGN="center"
>Happy User Guide</TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="bottom"
><A
HREF="sec-obtaining.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="80%"
ALIGN="center"
VALIGN="bottom"
></TD
><TD
WIDTH="10%"
ALIGN="right"
VALIGN="bottom"
><A
HREF="sec-sequences.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="CHAPTER"
><H1
><A
NAME="SEC-USING"
>Chapter 2. Using <SPAN
CLASS="APPLICATION"
>Happy</SPAN
></A
></H1
><DIV
CLASS="TOC"
><DL
><DT
><B
>Table of Contents</B
></DT
><DT
>2.1. <A
HREF="sec-using.html#SEC-OTHER-DATATYPES"
>Returning other datatypes</A
></DT
><DT
>2.2. <A
HREF="sec-sequences.html"
>Parsing sequences</A
></DT
><DT
>2.3. <A
HREF="sec-precedences.html"
>Using Precedences</A
></DT
><DT
>2.4. <A
HREF="sec-type-signatures.html"
>Type Signatures</A
></DT
><DT
>2.5. <A
HREF="sec-monads.html"
>Monadic Parsers</A
></DT
><DT
>2.6. <A
HREF="sec-error.html"
>The Error Token</A
></DT
><DT
>2.7. <A
HREF="sec-multiple-parsers.html"
>Generating Multiple Parsers From a Single Grammar</A
></DT
></DL
></DIV
><P
> Users of <SPAN
CLASS="APPLICATION"
>Yacc</SPAN
> will find
<SPAN
CLASS="APPLICATION"
>Happy</SPAN
> quite familiar. The basic idea is
as follows: </P
><P
></P
><UL
><LI
><P
>Define the grammar you want to parse in a
<SPAN
CLASS="APPLICATION"
>Happy</SPAN
> grammar file. </P
></LI
><LI
><P
> Run the grammar through <SPAN
CLASS="APPLICATION"
>Happy</SPAN
>, to generate
a compilable Haskell module.</P
></LI
><LI
><P
> Use this module as part of your Haskell program, usually
in conjunction with a lexical analyser (a function that splits
the input into "tokens", the basic units of parsing).</P
></LI
></UL
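><P
>To make the third step concrete, here is a minimal sketch of a
lexical analyser written by hand in Haskell. The names
<TT
CLASS="LITERAL"
>Tok</TT
> and <TT
CLASS="LITERAL"
>tokenize</TT
> are hypothetical and chosen only for illustration; they are not
produced by <SPAN
CLASS="APPLICATION"
>Happy</SPAN
>, and this is not the token type used in the example that follows.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>import Data.Char (isAlpha, isDigit, isSpace)

-- Hypothetical token type, for illustration only.
data Tok = TNum Int | TName String | TSym Char
  deriving (Eq, Show)

-- Split a string into tokens, skipping white space.
tokenize :: String -> [Tok]
tokenize [] = []
tokenize (c:cs)
  | isSpace c = tokenize cs
  | isDigit c = let (digits, rest) = span isDigit (c:cs)
                in TNum (read digits) : tokenize rest
  | isAlpha c = let (name, rest) = span isAlpha (c:cs)
                in TName name : tokenize rest
  | otherwise = TSym c : tokenize cs</PRE
></TD
></TR
></TABLE
><P
>Under these assumptions, <TT
CLASS="LITERAL"
>tokenize "x + 42"</TT
> yields <TT
CLASS="LITERAL"
>[TName "x", TSym '+', TNum 42]</TT
>, which is the kind of list a generated parser would consume.</P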
><P
> Let's run through an example. We'll implement a parser for a
simple expression syntax, consisting of integers, variables, the
operators <TT
CLASS="LITERAL"
>+</TT
>, <TT
CLASS="LITERAL"
>-</TT
>, <TT
CLASS="LITERAL"
>*</TT
>,
<TT
CLASS="LITERAL"
>/</TT
>, and the form <TT
CLASS="LITERAL"
>let var = exp in exp</TT
>.
The grammar file starts off like this:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>{
module Main where
}</PRE
></TD
></TR
></TABLE
><P
>At the top of the file is an optional <I
CLASS="FIRSTTERM"
>module
header</I
>,
which is just a Haskell module header enclosed in braces. This
code is emitted verbatim into the generated module, so you can put
any Haskell code here at all. In a grammar file, Haskell code is
always contained between curly braces to distinguish it from the
grammar.</P
><P
>In this case, the parser will be a standalone program so
we'll call the module <TT
CLASS="LITERAL"
>Main</TT
>.</P
><P
>Next come a couple of declarations:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>%name calc
%tokentype { Token }</PRE
></TD
></TR
></TABLE
><P
>The first line declares the name of the parsing function
that <SPAN
CLASS="APPLICATION"
>Happy</SPAN
> will generate, in this case
<TT
CLASS="LITERAL"
>calc</TT
>. In many cases, this is the only symbol you need
to export from the module.</P
><P
>The second line declares the type of tokens that the parser
will accept. The parser (i.e. the function
<TT
CLASS="FUNCTION"
>calc</TT
>) will be of type <TT
CLASS="LITERAL"
>[Token] ->
T</TT
>, where <TT
CLASS="LITERAL"
>T</TT
> is the return type of the
parser, determined by the production rules below.</P
><P
>Now we declare all the possible tokens:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>%token
      let             { TokenLet }
      in              { TokenIn }
      int             { TokenInt $$ }
      var             { TokenVar $$ }
      '='             { TokenEq }
      '+'             { TokenPlus }
      '-'             { TokenMinus }
      '*'             { TokenTimes }
      '/'             { TokenDiv }
      '('             { TokenOB }
      ')'             { TokenCB }</PRE
></TD
></TR
></TABLE
><P
>The symbols on the left are the tokens as they will be
referred to in the rest of the grammar, and to the right of each
token enclosed in braces is a Haskell pattern that matches the
token. The parser will expect to receive a stream of tokens, each
of which will match one of the given patterns (the definition of
the <TT
CLASS="LITERAL"
>Token</TT
> datatype is given later).</P
><P
>The <TT
CLASS="LITERAL"
>$$</TT
> symbol is a placeholder that
represents the <SPAN
CLASS="emphasis"
><I
CLASS="EMPHASIS"
>value</I
></SPAN
> of this token. Normally the value
of a token is the token itself, but by using the
<TT
CLASS="LITERAL"
>$$</TT
> symbol you can specify some component
of the token object to be the value. </P
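><P
>As a rough illustration, the sketch below shows one
<TT
CLASS="LITERAL"
>Token</TT
> type that the patterns above could match. The constructor names come
from the <TT
CLASS="LITERAL"
>%token</TT
> declarations; the field types (<TT
CLASS="LITERAL"
>Int</TT
> and <TT
CLASS="LITERAL"
>String</TT
>) are assumptions, and <TT
CLASS="LITERAL"
>intValue</TT
> is a hypothetical helper that spells out by hand which component
<TT
CLASS="LITERAL"
>$$</TT
> selects for an <TT
CLASS="LITERAL"
>int</TT
> token.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>-- A possible Token type matching the patterns above.
-- Assumption: TokenInt carries an Int and TokenVar a String.
data Token
  = TokenLet
  | TokenIn
  | TokenInt Int
  | TokenVar String
  | TokenEq
  | TokenPlus
  | TokenMinus
  | TokenTimes
  | TokenDiv
  | TokenOB
  | TokenCB
  deriving (Eq, Show)

-- The component that the pattern  TokenInt $$  designates as the
-- value of an int token (hypothetical helper, for illustration).
intValue :: Token -> Maybe Int
intValue (TokenInt n) = Just n
intValue _            = Nothing</PRE
></TD
></TR
></TABLE
><P
>With this type, the value of the token <TT
CLASS="LITERAL"
>TokenInt 7</TT
> is the integer <TT
CLASS="LITERAL"
>7</TT
>, whereas the value of a token declared without <TT
CLASS="LITERAL"
>$$</TT
>, such as <TT
CLASS="LITERAL"
>TokenLet</TT
>, is the token itself.</P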
><P
>Like yacc, we include <TT
CLASS="LITERAL"
>%%</TT
> here, for no real
reason.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>%%</PRE
></TD
></TR
></TABLE
><P
>Now we have the production rules for the grammar.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>Exp   : let var '=' Exp in Exp  { Let $2 $4 $6 }
      | Exp1                    { Exp1 $1 }

Exp1  : Exp1 '+' Term           { Plus $1 $3 }
      | Exp1 '-' Term           { Minus $1 $3 }
      | Term                    { Term $1 }

Term  : Term '*' Factor         { Times $1 $3 }
      | Term '/' Factor         { Div $1 $3 }
      | Factor                  { Factor $1 }

Factor
      : int                     { Int $1 }
      | var                     { Var $1 }
      | '(' Exp ')'             { Brack $2 }</PRE
></TD
></TR
></TABLE
><P
>Each production consists of a <I
CLASS="FIRSTTERM"
>non-terminal</I
>
symbol on the left, followed by a colon, followed by one or more
expansions on the right, separated by <TT
CLASS="LITERAL"
>|</TT
>. Each expansion
has some Haskell code associated with it, enclosed in braces as
usual.</P
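><P
>The code in braces builds Haskell values, so the constructors it
mentions must be defined somewhere in the module. The sketch below is
one set of declarations under which the actions above would type-check.
The constructor names are taken from the rules; the field types are
assumptions, and <TT
CLASS="LITERAL"
>example</TT
> is a hypothetical value written out by hand to show the result the
rules would build for the input <TT
CLASS="LITERAL"
>let x = 1 in x + 2</TT
>.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>-- One datatype per non-terminal, one constructor per expansion.
-- Assumption: int tokens carry an Int and var tokens a String.
data Exp
  = Let String Exp Exp
  | Exp1 Exp1
  deriving (Eq, Show)

data Exp1
  = Plus Exp1 Term
  | Minus Exp1 Term
  | Term Term
  deriving (Eq, Show)

data Term
  = Times Term Factor
  | Div Term Factor
  | Factor Factor
  deriving (Eq, Show)

data Factor
  = Int Int
  | Var String
  | Brack Exp
  deriving (Eq, Show)

-- Hand-built value for:  let x = 1 in x + 2
example :: Exp
example =
  Let "x"
      (Exp1 (Term (Factor (Int 1))))
      (Exp1 (Plus (Term (Factor (Var "x"))) (Factor (Int 2))))</PRE
></TD
></TR
></TABLE
><P
>Each layer of constructors in <TT
CLASS="LITERAL"
>example</TT
> corresponds to one rule being applied, which is why even the single
integer <TT
CLASS="LITERAL"
>1</TT
> is wrapped four times.</P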
><P
>A useful way to think about a parser is that each symbol has a
"value": we defined the values of the tokens above, and the
grammar defines the values of non-terminal symbols in terms of
sequences of other symbols (either tokens or non-terminals). In a
production like this:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>n : t_1 ... t_n { E }</PRE
></TD
></TR
></TABLE
><P
>whenever the parser finds the symbols <TT
CLASS="LITERAL"
>t_1..t_n</TT
> in
the token stream, it constructs the symbol <TT
CLASS="LITERAL"
>n</TT
> and gives
it the value <TT
CLASS="LITERAL"