We'll discuss the individual pattern-matching operators in a moment, but first we'd like to mention another thing they all have in common, <em class="emphasis">modifiers</em>.</p>
<p><a name="INDEX-1336"></a><a name="INDEX-1337"></a><a name="INDEX-1338"></a><a name="INDEX-1339"></a>Immediately following the final delimiter of an <tt class="literal">m//</tt>, <tt class="literal">s///</tt>, <tt class="literal">qr//</tt>, or <tt class="literal">tr///</tt> operator, you may optionally place one or more single-letter modifiers, in any order.  For clarity, modifiers are usually written as "the <tt class="literal">/o</tt> modifier" and pronounced "the slash oh modifier", even though the final delimiter might be something other than a slash.  (Sometimes people say "flag" or "option" to mean "modifier"; that's okay too.)<a name="INDEX-1340"></a><a name="INDEX-1341"></a></p>
<p>Some modifiers change the behavior of the individual operator, so we'll describe those in detail later.  Others change how the regex is interpreted, so we'll talk about them here.  
The <tt class="literal">m//</tt>, <tt class="literal">s///</tt>, and <tt class="literal">qr//</tt> operators<a href="#FOOTNOTE-5">[5]</a> all accept the following modifiers after their final delimiter:</p>
<blockquote class="footnote"><a name="FOOTNOTE-5"></a><p>[5] The <tt class="literal">tr///</tt> operator does not take regexes, so these modifiers do not apply.</p></blockquote>
<a name="perl3-tab-patmods"></a><table border="1">
<tr><th>Modifier</th><th>Meaning</th></tr>
<tr><td><tt class="literal">/i</tt></td><td>Ignore alphabetic case distinctions (case insensitive).<a name="INDEX-1342"></a><a name="INDEX-1343"></a></td></tr>
<tr><td><tt class="literal">/s</tt></td><td>Let <tt class="literal">.</tt> match newline and ignore deprecated <tt class="literal">$*</tt> variable.<a name="INDEX-1344"></a></td></tr>
<tr><td><tt class="literal">/m</tt></td><td>Let <tt class="literal">^</tt> and <tt class="literal">$</tt> match next to embedded <tt class="literal">\n</tt>.<a name="INDEX-1345"></a></td></tr>
<tr><td><tt class="literal">/x</tt></td><td>Ignore (most) whitespace and permit comments in pattern.<a name="INDEX-1346"></a></td></tr>
<tr><td><tt class="literal">/o</tt></td><td>Compile pattern once only.<a name="INDEX-1347"></a></td></tr>
</table>
<p><a name="INDEX-1348"></a>The <tt class="literal">/i</tt> modifier says to match both upper- and lowercase (and titlecase, under Unicode).  That way <tt class="literal">/perl/i</tt> would also match the strings "<tt class="literal">PROPERLY</tt>" or "<tt class="literal">Perlaceous</tt>" (amongst other things).  A <tt class="literal">use locale</tt> pragma may also have some influence on what is considered to be equivalent.  (This may be a negative influence on strings containing Unicode.)</p>
<p><a name="INDEX-1349"></a><a name="INDEX-1350"></a>The <tt class="literal">/s</tt> and <tt class="literal">/m</tt> modifiers don't involve anything kinky.  Rather, they affect how Perl treats matches against a string that contains newlines.  
But they aren't about whether your string actually contains newlines; they're about whether Perl should <em class="emphasis">assume</em> that your string contains a single line (<tt class="literal">/s</tt>) or multiple lines (<tt class="literal">/m</tt>), because certain metacharacters work differently depending on whether they're expected to behave in a line-oriented fashion or not.</p>
<p><a name="INDEX-1351"></a>Ordinarily, the metacharacter "<tt class="literal">.</tt>" matches any one character <em class="emphasis">except</em> a newline, because its traditional meaning is to match characters within a line.  With <tt class="literal">/s</tt>, however, the "<tt class="literal">.</tt>" metacharacter can also match a newline, because you've told Perl to ignore the fact that the string might contain multiple newlines.  (The <tt class="literal">/s</tt> modifier also makes Perl ignore the deprecated <tt class="literal">$*</tt> variable, which we hope you too have been ignoring.)  The <tt class="literal">/m</tt> modifier, on the other hand, changes the interpretation of the <tt class="literal">^</tt> and <tt class="literal">$</tt> metacharacters by letting them match next to newlines within the string instead of considering only the ends of the string.  
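As a minimal sketch of the difference (the two-line <tt class="literal">$text</tt> string and the variable names are our own illustration, not part of the original examples):<blockquote><pre class="programlisting">use strict;
use warnings;

my $text = "first line\nsecond line";   # hypothetical sample data

# Without /s, "." stops at the newline, so the match cannot span both lines.
my $plain  = $text =~ /first.*second/  ? 1 : 0;   # 0
# With /s, "." may match the newline as well.
my $dotall = $text =~ /first.*second/s ? 1 : 0;   # 1

# Without /m, "^" matches only at the very start of the string.
my $start = $text =~ /^second/  ? 1 : 0;          # 0
# With /m, "^" also matches just after an embedded newline.
my $multi = $text =~ /^second/m ? 1 : 0;          # 1</pre></blockquote>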
See the examples in <a href="ch05_06.htm#ch05-sect-posit">Section 5.6, "Positions"</a> later in this chapter.</p>
<p><a name="INDEX-1352"></a><a name="INDEX-1353"></a><a name="INDEX-1354"></a>The <tt class="literal">/o</tt> modifier controls pattern recompilation.  Unless the delimiters chosen are single quotes (<tt class="literal">m'</tt><em class="replaceable">PATTERN</em><tt class="literal">'</tt>, <tt class="literal">s'</tt><em class="replaceable">PATTERN</em><tt class="literal">'</tt><em class="replaceable">REPLACEMENT</em><tt class="literal">'</tt>, or <tt class="literal">qr'</tt><em class="replaceable">PATTERN</em><tt class="literal">'</tt>), any variables in the pattern will be interpolated (and may cause the pattern to be recompiled) every time the pattern operator is evaluated.  If you want such a pattern to be compiled once and only once, use the <tt class="literal">/o</tt> modifier.  This prevents expensive run-time recompilations; it's useful when the value you are interpolating won't change during execution.  However, mentioning <tt class="literal">/o</tt> constitutes a promise that you won't change the variables in the pattern.  If you do change them, Perl won't even notice.  For better control over recompilation, use the <tt class="literal">qr//</tt> regex quoting operator.  
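A minimal sketch of the <tt class="literal">qr//</tt> approach, assuming a hypothetical <tt class="literal">$word</tt> value and sample strings of our own invention:<blockquote><pre class="programlisting">use strict;
use warnings;

my $word = "perl";               # hypothetical value to interpolate
my $re   = qr/\b\Q$word\E\b/i;   # interpolated and compiled here, with /i attached

my @lines = ("Perl is fine", "properly done", "use PERL today");
my @hits  = grep { $_ =~ $re } @lines;   # reuses the compiled pattern; 2 hits

# Changing $word afterward does not alter $re; build a new qr// to pick it up.
$word = "python";
my $still = "Perl" =~ $re ? 1 : 0;       # 1</pre></blockquote>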
See "Variable Interpolation" later in this chapter for details.</p>
<p><a name="INDEX-1355"></a><a name="INDEX-1356"></a>The <tt class="literal">/x</tt> is the <em class="emphasis">ex</em>pressive modifier: it allows you to <em class="emphasis">ex</em>ploit whitespace and <em class="emphasis">ex</em>planatory comments in order to <em class="emphasis">ex</em>pand your pattern's legibility, even <em class="emphasis">ex</em>tending the pattern across newline boundaries.</p>
<p><a name="INDEX-1357"></a><a name="INDEX-1358"></a>Er, that is to say, <tt class="literal">/x</tt> modifies the meaning of the whitespace characters (and the <tt class="literal">#</tt> character): instead of letting them do self-matching as ordinary characters do, it turns them into metacharacters that, oddly, now behave as whitespace (and comment characters) should.  Hence, <tt class="literal">/x</tt> allows spaces, tabs, and newlines for formatting, just like regular Perl code.  It also allows the <tt class="literal">#</tt> character, not normally special in a pattern, to introduce a comment that extends through the end of the current line within the pattern string.<a href="#FOOTNOTE-6">[6]</a> If you want to match a real whitespace character (or the <tt class="literal">#</tt> character), then you'll have to put it into a character class, or escape it with a backslash, or encode it using an octal or hex escape.  (But whitespace is normally matched with a <tt class="literal">\s*</tt> or <tt class="literal">\s+</tt> sequence, so the situation doesn't arise often in practice.)</p>
<blockquote class="footnote"><a name="FOOTNOTE-6"></a><p>[6] Be careful not to include the pattern delimiter in the comment--because of its "find the end first" rule, Perl has no way of knowing you didn't intend to terminate the pattern at that point.</p></blockquote>
<p>Taken together, these features go a long way toward making traditional regular expressions a readable language.  
In the spirit of TMTOWTDI, there's now more than one way to write a given regular expression.  In fact, there's more than two ways:<blockquote><pre class="programlisting">m/\w+:(\s+\w+)\s*\d+/;       # A word, colon, space, word, space, digits.

m/\w+: (\s+ \w+) \s* \d+/x;  # A word, colon, space, word, space, digits.

m{
    \w+:                     # Match a word and a colon.
    (                        # (begin group)
         \s+                 # Match one or more spaces.
         \w+                 # Match another word.
    )                        # (end group)
    \s*                      # Match zero or more spaces.
    \d+                      # Match some digits
}x;</pre></blockquote><a name="INDEX-1359"></a>We'll explain those new metasymbols later in the chapter.  (This section was supposed to be about pattern modifiers, but we've let it get out of hand in our <em class="emphasis">ex</em>citement about <tt class="literal">/x</tt>.  Ah well.)  Here's a regular expression that finds duplicate words in paragraphs, stolen right out of the <em class="citetitle">Perl Cookbook</em>.  
It uses the <tt class="literal">/x</tt> and <tt class="literal">/i</tt> modifiers, as well as the <tt class="literal">/g</tt> modifier described later.<blockquote><pre class="programlisting"># Find duplicate words in paragraphs, possibly spanning line boundaries.
#   Use /x for space and comments, /i to match both `is'
#   in "Is is this ok?", and use /g to find all dups.
$/ = "";        # "paragrep" mode
while (&lt;&gt;) {
    while ( m{
                \b            # start at a word boundary
                (\w\S+)       # find a wordish chunk
                (
                    \s+       # separated by some whitespace
                    \1        # and that chunk again
                ) +           # repeat ad lib
                \b            # until another word boundary
             }xig
         )
    {
        print "dup word '$1' at paragraph $.\n";
    }
}</pre></blockquote>When run on this chapter, it produces warnings like this:<blockquote><pre class="programlisting">dup word 'that' at paragraph 100</pre></blockquote>As it happens, we know that that particular instance was intentional.</p>
<a name="INDEX-1360"></a><a name="INDEX-1361"></a><a name="INDEX-1362"></a><h3 class="sect2">5.2.2. The m// Operator (Matching)</h3>
<p><a name="INDEX-1363"></a><blockquote><pre class="programlisting"><em class="replaceable">EXPR</em> =~ m/<em class="replaceable">PATTERN</em>/cgimosx
<em class="replaceable">EXPR</em> =~ /<em class="replaceable">PATTERN</em>/cgimosx
<em class="replaceable">EXPR</em> =~ ?<em class="replaceable">PATTERN</em>?cgimosx
m/<em class="replaceable">PATTERN</em>/cgimosx
/<em class="replaceable">PATTERN</em>/cgimosx
?<em class="replaceable">PATTERN</em>?cgimosx</pre></blockquote><a name="INDEX-1364"></a>The <tt class="literal">m//</tt> operator searches the string in the scalar <em class="replaceable">EXPR</em> for <em class="replaceable">PATTERN</em>.  
If <tt class="literal">/</tt> or <tt class="literal">?</tt> is the delimiter, the initial <tt class="literal">m</tt> is optional.  Both <tt class="literal">?</tt> and <tt class="literal">'</tt> have special meanings as delimiters: the first is a once-only match; the second suppresses variable interpolation and the six translation escapes (<tt class="literal">\U</tt> and company, described later).</p>
<p><a name="INDEX-1365"></a>If <em class="replaceable">PATTERN</em> evaluates to a null string, either because you specified it that way using <tt class="literal">//</tt> or because an interpolated variable evaluated to the empty string, the last successfully executed regular expression not hidden within an inner block (or within a <tt class="literal">split</tt>, <tt class="literal">grep</tt>, or <tt class="literal">map</tt>) is used instead.</p>
<p><a name="INDEX-1366"></a>In scalar context, the operator returns true (<tt class="literal">1</tt>) if successful, false (<tt class="literal">""</tt>) otherwise.  This form is usually seen in Boolean context:<blockquote><pre class="programlisting">if ($shire =~ m/Baggins/) { ... }  # search for Baggins in $shire
if ($shire =~ /Baggins/)  { ... }  # search for Baggins in $shire
if ( m#Baggins# )         { ... }  # search right here in $_
if ( /Baggins/ )          { ... }  # search right here in $_</pre></blockquote><a name="INDEX-1367"></a><a name="INDEX-1368"></a>Used in list context, <tt class="literal">m//</tt> returns a list of substrings matched by the capturing parentheses in the pattern (that is, <tt class="literal">$1</tt>, <tt class="literal">$2</tt>, <tt class="literal">$3</tt>, and so on) as described later under "Capturing and Clustering".  The numbered variables are still set even when the list is returned.  If the match fails in list context, a null list is returned.  
If the match succeeds in list context but there were no capturing parentheses (nor <tt class="literal">/g</tt>), a list value of <tt class="literal">(1)</tt> is returned.  Since it returns a null list on failure, this form of <tt class="literal">m//</tt> can also be used in Boolean context, but only when participating indirectly via a list assignment:<blockquote><pre class="programlisting">if (($key,$value) = /(\w+): (.*)/) { ... }</pre></blockquote>Valid modifiers for <tt class="literal">m//</tt> (in whatever guise) are shown in <a href="ch05_02.htm#perl3-tab-mmods">Table 5-1</a>.<a name="INDEX-1369"></a><a name="INDEX-1370"></a></p>
<a name="perl3-tab-mmods"></a><h4 class="objtitle">Table 5-1. m// Modifiers</h4><table border="1">
<tr><th>Modifier</th><th>Meaning</th></tr>
<tr><td><tt class="literal">/i</tt><a name="INDEX-1371"></a></td><td>Ignore alphabetic case.</td></tr>
<tr><td><tt class="literal">/m</tt><a name="INDEX-1372"></a></td><td>Let <tt class="literal">^</tt> and <tt class="literal">$</tt> match next to embedded <tt class="literal">\n</tt>.<a name="INDEX-1373"></a><a name="INDEX-1374"></a></td></tr>
<tr><td><tt class="literal">/s</tt></td><td>Let <tt class="literal">.</tt> match newline and ignore deprecated <tt class="literal">$*</tt>.<a name="INDEX-1375"></a></td></tr>