RE2C(1)                                                              RE2C(1)

Version 0.5                     8 April 1994

NAME
       re2c - convert regular expressions to C/C++

SYNOPSIS
       re2c [-esb] name

DESCRIPTION
       re2c is a preprocessor that generates C-based recognizers from
       regular expressions. The input to re2c consists of C/C++ source
       interleaved with comments of the form /*!re2c ... */ which contain
       scanner specifications. In the output these comments are replaced
       with code that, when executed, will find the next input token and
       then execute some user-supplied token-specific code.

       For example, given the following code

          #define NULL            ((char*) 0)
          char *scan(char *p){
          char *q;
          #define YYCTYPE         char
          #define YYCURSOR        p
          #define YYLIMIT         p
          #define YYMARKER        q
          #define YYFILL(n)
          /*!re2c
                  [0-9]+          {return YYCURSOR;}
                  [\000-\377]     {return NULL;}
          */
          }

       re2c will generate

          /* Generated by re2c on Sat Apr 16 11:40:58 1994 */
          #line 1 "simple.re"
          #define NULL            ((char*) 0)
          char *scan(char *p){
          char *q;
          #define YYCTYPE         char
          #define YYCURSOR        p
          #define YYLIMIT         p
          #define YYMARKER        q
          #define YYFILL(n)
          {
                  YYCTYPE yych;
                  unsigned int yyaccept;
                  goto yy0;
          yy1:    ++YYCURSOR;
          yy0:
                  if((YYLIMIT - YYCURSOR) < 2) YYFILL(2);
                  yych = *YYCURSOR;
                  if(yych <= '/') goto yy4;
                  if(yych >= ':') goto yy4;
          yy2:    yych = *++YYCURSOR;
                  goto yy7;
          yy3:
          #line 10
                  {return YYCURSOR;}
          yy4:    yych = *++YYCURSOR;
          yy5:
          #line 11
                  {return NULL;}
          yy6:    ++YYCURSOR;
                  if(YYLIMIT == YYCURSOR) YYFILL(1);
                  yych = *YYCURSOR;
          yy7:    if(yych <= '/') goto yy3;
                  if(yych <= '9') goto yy6;
                  goto yy3;
          }
          #line 12

          }

OPTIONS
       re2c provides the following options:

       -e     Cross-compile from an ASCII platform to an EBCDIC one.

       -s     Generate nested ifs for some switches. Many compilers need
              this assist to generate better code.

       -b     Implies -s. Use bit vectors as well in the attempt to coax
              better code out of the compiler. Most useful for
              specifications with more than a few keywords (e.g.
              for most programming languages).

INTERFACE CODE
       Unlike other scanner generators, re2c does not generate complete
       scanners: the user must supply some interface code. In particular,
       the user must define the following macros:

       YYCTYPE
              Type used to hold an input symbol. Usually char or unsigned
              char.

       YYCURSOR
              l-expression of type *YYCTYPE that points to the current
              input symbol. The generated code advances YYCURSOR as symbols
              are matched. On entry, YYCURSOR is assumed to point to the
              first character of the current token. On exit, YYCURSOR will
              point to the first character of the following token.

       YYLIMIT
              Expression of type *YYCTYPE that marks the end of the buffer
              (YYLIMIT[-1] is the last character in the buffer). The
              generated code repeatedly compares YYCURSOR to YYLIMIT to
              determine when the buffer needs (re)filling.

       YYMARKER
              l-expression of type *YYCTYPE. The generated code saves
              backtracking information in YYMARKER.

       YYFILL(n)
              The generated code "calls" YYFILL when the buffer needs
              (re)filling: at least n additional characters should be
              provided. YYFILL should adjust YYCURSOR, YYLIMIT and YYMARKER
              as needed. Note that for typical programming languages n will
              be the length of the longest keyword plus one.

SCANNER SPECIFICATIONS
       Each scanner specification consists of a set of rules and name
       definitions. Rules consist of a regular expression along with a
       block of C/C++ code that is to be executed when the associated
       regular expression is matched. Name definitions are of the form
       ``name = regular expression;''.

SUMMARY OF RE2C REGULAR EXPRESSIONS
       "foo"  the literal string foo. ANSI-C escape sequences can be used.
       [xyz]  a "character class"; in this case, the regular expression
              matches either an 'x', a 'y', or a 'z'.
       [abj-oZ]
              a "character class" with a range in it; matches an 'a', a
              'b', any letter from 'j' through 'o', or a 'Z'.
       r\s    match any r which isn't an s. r and s must be regular
              expressions which can be expressed as character classes.
       r*     zero or more r's, where r is any regular expression
       r+     one or more r's
       r?     zero or one r's (that is, "an optional r")
       name   the expansion of the "name" definition (see above)
       (r)    an r; parentheses are used to override precedence (see below)
       rs     an r followed by an s ("concatenation")
       r|s    either an r or an s
       r/s    an r but only if it is followed by an s. The s is not part of
              the matched text. This type of regular expression is called
              "trailing context".

       The regular expressions listed above are grouped according to
       precedence, from highest precedence at the top to lowest at the
       bottom.
       Those grouped together have equal precedence.

A LARGER EXAMPLE
       #include <stdlib.h>
       #include <stdio.h>
       #include <fcntl.h>
       #include <string.h>

       #define ADDEQ   257
       #define ANDAND  258
       #define ANDEQ   259
       #define ARRAY   260
       #define ASM     261
       #define AUTO    262
       #define BREAK   263
       #define CASE    264
       #define CHAR    265
       #define CONST   266
       #define CONTINUE        267
       #define DECR    268
       #define DEFAULT 269
       #define DEREF   270
       #define DIVEQ   271
       #define DO      272
       #define DOUBLE  273
       #define ELLIPSIS        274
       #define ELSE    275
       #define ENUM    276
       #define EQL     277
       #define EXTERN  278
       #define FCON    279
       #define FLOAT   280
       #define FOR     281
       #define FUNCTION        282
       #define GEQ     283
       #define GOTO    284
       #define ICON    285
       #define ID      286
       #define IF      287
       #define INCR    288
       #define INT     289
       #define LEQ     290
       #define LONG    291
       #define LSHIFT  292
       #define LSHIFTEQ        293
       #define MODEQ   294
       #define MULEQ   295
       #define NEQ     296
       #define OREQ    297
       #define OROR    298
       #define POINTER 299
       #define REGISTER        300
       #define RETURN  301
       #define RSHIFT  302
       #define RSHIFTEQ        303
       #define SCON    304
       #define SHORT   305
       #define SIGNED  306
       #define SIZEOF  307
       #define STATIC  308
       #define STRUCT  309
       #define SUBEQ   310
       #define SWITCH  311
       #define TYPEDEF 312
       #define UNION   313
       #define UNSIGNED        314
       #define VOID    315
       #define VOLATILE        316
       #define WHILE   317
       #define XOREQ   318
       #define EOI     319

       typedef unsigned int uint;
       typedef unsigned char uchar;

       #define BSIZE   8192

       #define YYCTYPE         uchar
       #define YYCURSOR        cursor
       #define YYLIMIT         s->lim
       #define YYMARKER        s->ptr
       #define YYFILL(n)       {cursor = fill(s, cursor);}

       #define RET(i)  {s->cur = cursor; return i;}

       typedef struct Scanner {
           int                 fd;
           uchar               *bot, *tok, *ptr, *cur, *pos, *lim, *top, *eof;
           uint                line;
       } Scanner;

       uchar *fill(Scanner *s, uchar *cursor){
           if(!s->eof){
               uint cnt = s->tok - s->bot;
               if(cnt){
                   memcpy(s->bot, s->tok, s->lim - s->tok);
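
Separately from the listing above, the macro contract described under INTERFACE CODE can be illustrated with a minimal, self-contained sketch. This is not re2c output: scan_digits is a hypothetical, hand-written stand-in for the code re2c might emit for the two rules of the simple example, and the names cursor, limit and marker are assumptions chosen for this sketch. It also assumes the whole input is already in memory, so YYFILL is a no-op and the end of the buffer is checked explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Interface macros, named as in INTERFACE CODE. The variables they
   expand to (cursor, limit, marker) are assumptions of this sketch. */
#define YYCTYPE   unsigned char
#define YYCURSOR  cursor
#define YYLIMIT   limit
#define YYMARKER  marker
/* Assumption: the input is fully buffered, so there is nothing to refill. */
#define YYFILL(n) do { } while (0)

/* Hand-written stand-in (not generated by re2c) for the rules
       [0-9]+        {return YYCURSOR;}
       [\000-\377]   {return NULL;}
   Returns a pointer just past the leading digits, or NULL if the input
   is empty or its first symbol is not a digit. */
const YYCTYPE *scan_digits(const YYCTYPE *cursor, const YYCTYPE *limit)
{
    const YYCTYPE *marker = cursor;
    YYCTYPE yych;

    assert(cursor <= limit);
    (void)YYMARKER;               /* these rules need no backtracking */

    if (YYCURSOR == YYLIMIT) YYFILL(1);
    if (YYCURSOR == YYLIMIT) return NULL;   /* nothing left to scan */

    yych = *YYCURSOR;
    if (yych < '0' || yych > '9') return NULL;

    for (;;) {
        ++YYCURSOR;               /* advance as symbols are matched */
        if (YYCURSOR == YYLIMIT) {
            YYFILL(1);            /* ask for at least one more symbol */
            if (YYCURSOR == YYLIMIT) break;  /* still none: end of input */
        }
        yych = *YYCURSOR;
        if (yych < '0' || yych > '9') break;
    }
    return YYCURSOR;              /* first character of the following token */
}
```

Called on the buffer "1994 April", scan_digits returns a pointer to the space after the four digits; called on "April", it returns NULL. A scanner that reads from a file would instead have YYFILL move unread data to the start of the buffer, read more input, and adjust YYCURSOR, YYLIMIT and YYMARKER to match.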