Regular Expressions of Java Tutorial
      <td>Possessive</td>
    </tr>
  </thead>
  <tr>
    <td class="regCenter"><code>X?</code></td>
    <td class="regCenter"><code>X??</code></td>
    <td class="regCenter"><code>X?+</code></td>
    <td>X, once or not at all</td>
  </tr>
  <tr>
    <td class="regCenter"><code>X*</code></td>
    <td class="regCenter"><code>X*?</code></td>
    <td class="regCenter"><code>X*+</code></td>
    <td>X, zero or more times</td>
  </tr>
  <tr>
    <td class="regCenter"><code>X+</code></td>
    <td class="regCenter"><code>X+?</code></td>
    <td class="regCenter"><code>X++</code></td>
    <td>X, one or more times</td>
  </tr>
  <tr>
    <td class="regCenter"><code>X{n}</code></td>
    <td class="regCenter"><code>X{n}?</code></td>
    <td class="regCenter"><code>X{n}+</code></td>
    <td>X, exactly n times</td>
  </tr>
  <tr>
    <td class="regCenter"><code>X{n,}</code></td>
    <td class="regCenter"><code>X{n,}?</code></td>
    <td class="regCenter"><code>X{n,}+</code></td>
    <td>X, at least n times</td>
  </tr>
  <tr>
    <td class="regCenter"><code>X{n,m}</code></td>
    <td class="regCenter"><code>X{n,m}?</code></td>
    <td class="regCenter"><code>X{n,m}+</code></td>
    <td>X, at least n but not more than m times</td>
  </tr>
</table>
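Let's start with the greedy quantifiers by building three different regular expressions: the letter <code>a</code> followed by <code>?</code>, <code>*</code>, or <code>+</code>. Here is what happens when these expressions are tested against an empty input string:<br/>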

<pre id="console">Enter your regex: a?
Enter input string to search: 
I found the text "" starting at index 0 and ending at index 0.

Enter your regex: a*
Enter input string to search: 
I found the text "" starting at index 0 and ending at index 0.

Enter your regex: a+
Enter input string to search: 
No match found.</pre>

<div id="h3"><a name="reg5_1"></a>5.1 Zero-Length Matches<span class="returnContents"><a href="#contents">Back to Contents</a></span></div>
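In the example above, the match succeeds in the first two cases because the expressions <code>a?</code> and <code>a*</code> both allow zero occurrences of the letter "a". You may also have noticed that the start and end indices are both 0, which is unlike any example seen so far. The empty input string has no length, so the test simply matches nothing at index 0. Matches of this kind are called <em>zero-length matches</em>. A zero-length match can occur in several cases: in an empty input string, at the beginning of an input string, after the last character of an input string, or between any two characters of an input string. Zero-length matches are easy to identify because they always start and end at the same index.<br/>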
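The check described above (a match whose start and end indices are equal) can be sketched in a few lines of Java. This is a minimal illustration, not part of the tutorial's test harness; the class name <code>ZeroLengthDemo</code> and the method <code>zeroLengthPositions</code> are hypothetical names chosen for this example, and only standard <code>java.util.regex</code> calls are used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ZeroLengthDemo {
    // Hypothetical helper: returns the index of every zero-length match.
    // A match is zero-length exactly when start() == end().
    static List<Integer> zeroLengthPositions(String regex, String input) {
        List<Integer> positions = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            if (m.start() == m.end()) {
                positions.add(m.start());
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        System.out.println(zeroLengthPositions("a?", ""));   // empty input: [0]
        System.out.println(zeroLengthPositions("a*", "ba")); // start and end of input: [0, 2]
        System.out.println(zeroLengthPositions("a*", "bb")); // also between two characters: [0, 1, 2]
        System.out.println(zeroLengthPositions("a+", ""));   // a+ needs at least one "a": []
    }
}
```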

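Let's explore zero-length matches with a few more examples. Change the input string to the single letter "a" and you will notice something interesting:<br/>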

<pre id="console">Enter your regex: a?
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.
I found the text "" starting at index 1 and ending at index 1.

Enter your regex: a*
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.
I found the text "" starting at index 1 and ending at index 1.

Enter your regex: a+
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.</pre>

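All three quantifiers found the letter "a", but the first two also found a zero-length match at index 1, that is, after the last character of the input string. Remember that the matcher sees the character "a" as sitting in the cell between index 0 and index 1, and that the test harness keeps looping until it can find no more matches. Depending on the quantifier used, the presence of "nothing" at the index after the last character may or may not trigger a match.<br/>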
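The looping behavior of the test harness can be approximated as follows. This is only a sketch under stated assumptions: the harness itself is not shown in this section, so <code>MiniHarness</code> and <code>report</code> are hypothetical names, and the method takes its regex and input as arguments rather than reading them from a console. It repeats <code>Matcher.find()</code> until no match is left and formats each result like the console sessions shown here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MiniHarness {
    // Hypothetical stand-in for the test harness: calls find() repeatedly
    // and reports each match in the same wording as the console output.
    static String report(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            out.append(String.format(
                "I found the text \"%s\" starting at index %d and ending at index %d.\n",
                m.group(), m.start(), m.end()));
        }
        // An empty buffer means find() never succeeded.
        return out.length() == 0 ? "No match found.\n" : out.toString();
    }

    public static void main(String[] args) {
        System.out.print(report("a?", "a")); // the letter, then a zero-length match at index 1
        System.out.print(report("a+", "a")); // the letter only
        System.out.print(report("a+", ""));  // no match
    }
}
```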

Now change the input string to the letter "a" five times in a row. The result is:<br/>

<pre id="console">Enter your regex: a?
Enter input string to search: aaaaa
I found the text "a" starting at index 0 and ending at index 1.
I found the text "a" starting at index 1 and ending at index 2.
I found the text "a" starting at index 2 and ending at index 3.
I found the text "a" starting at index 3 and ending at index 4.
I found the text "a" starting at index 4 and ending at index 5.
I found the text "" starting at index 5 and ending at index 5.

Enter your regex: a*
Enter input string to search: aaaaa
I found the text "aaaaa" starting at index 0 and ending at index 5.
I found the text "" starting at index 5 and ending at index 5.

Enter your regex: a+
Enter input string to search: aaaaa
I found the text "aaaaa" starting at index 0 and ending at index 5.</pre>

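The expression <code>a?</code> finds a separate match for each character, because it matches when "a" appears zero times or once. The expression <code>a*</code> finds two separate matches: all of the letters "a" in the first match, then the zero-length match after the last character, at index 5. Finally, <code>a+</code> matches all occurrences of the letter "a" and ignores the presence of "nothing" at the last index.<br/>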

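At this point you may be wondering what happens when the first two quantifiers encounter a letter other than "a". For example, what happens when they encounter the letter "b", as in "ababaaaab"?<br/>

Let's find out:<br/>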

<pre id="console">Enter your regex: a?
Enter input string to search: ababaaaab
I found the text "a" starting at index 0 and ending at index 1.
I found the text "" starting at index 1 and ending at index 1.
I found the text "a" starting at index 2 and ending at index 3.
I found the text "" starting at index 3 and ending at index 3.
I found the text "a" starting at index 4 and ending at index 5.
I found the text "a" starting at index 5 and ending at index 6.
I found the text "a" starting at index 6 and ending at index 7.
I found the text "a" starting at index 7 and ending at index 8.
I found the text "" starting at index 8 and ending at index 8.
I found the text "" starting at index 9 and ending at index 9.

Enter your regex: a*
Enter input string to search: ababaaaab
I found the text "a" starting at index 0 and ending at index 1.
I found the text "" starting at index 1 and ending at index 1.
I found the text "a" starting at index 2 and ending at index 3.
I found the text "" starting at index 3 and ending at index 3.
I found the text "aaaa" starting at index 4 and ending at index 8.
I found the text "" starting at index 8 and ending at index 8.
I found the text "" starting at index 9 and ending at index 9.

Enter your regex: a+
Enter input string to search: ababaaaab
I found the text "a" starting at index 0 and ending at index 1.
I found the text "a" starting at index 2 and ending at index 3.
I found the text "aaaa" starting at index 4 and ending at index 8.</pre>

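Even though the letter "b" appears in cells 1, 3, and 8, the output reports a zero-length match at those positions. The regular expression <code>a?</code> is not specifically looking for the letter "b"; it only looks for the presence (or absence) of the letter "a". If the quantifier allows "a" to match zero times, any character in the input string that is not an "a" shows up as a zero-length match. The remaining a's are matched according to the rules discussed in the previous examples.<br/>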
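The three console sessions above can be compared side by side in code. The following is a minimal sketch; <code>QuantifierCompare</code> and <code>findAll</code> are hypothetical names, and each match is recorded as <code>text@start-end</code>, so a zero-length match appears with nothing before the <code>@</code>.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierCompare {
    // Hypothetical helper: lists every match as "text@start-end".
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group() + "@" + m.start() + "-" + m.end());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "ababaaaab";
        String[] regexes = {"a?", "a*", "a+"};
        for (String regex : regexes) {
            // a? and a* report zero-length matches wherever a "b" sits
            // and at the end of the input; a+ reports only runs of "a".
            System.out.println(regex + " -> " + findAll(regex, input));
        }
    }
}
```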

To match a pattern exactly n times, simply specify the number inside a pair of curly braces:<br/>

<pre id="console">Enter your regex: a{3}
Enter input string to search: aa
No match found.

Enter your regex: a{3}
Enter input string to search: aaa
I found the text "aaa" starting at index 0 and ending at index 3.

Enter your regex: a{3}
Enter input string to search: aaaa
I found the text "aaa" starting at index 0 and ending at index 3.</pre>

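Here, the regular expression <code>a{3}</code> looks for three occurrences of the letter "a" in a row. The first test fails because the input string does not contain enough a's to match. In the second test the input string contains exactly three a's, which triggers a match. The third test also triggers a match, because there are exactly three a's at the beginning of the input string; whatever follows them is irrelevant to the first match. If the pattern appears again after that point, it triggers further matches: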

<pre id="console">Enter your regex: a{3}
Enter input string to search: aaaaaaaaa
I found the text "aaa" starting at index 0 and ending at index 3.
I found the text "aaa" starting at index 3 and ending at index 6.
I found the text "aaa" starting at index 6 and ending at index 9.</pre>

To require a pattern to appear at least n times, add a comma (<code>,</code>) after the number:

<pre id="console">Enter your regex: a{3,}
Enter input string to search: aaaaaaaaa
I found the text "aaaaaaaaa" starting at index 0 and ending at index 9.</pre>

With the same input string, this test finds only one match, because the nine a's in a row satisfy the requirement of "at least" three a's.<br/>

Finally, to specify an upper limit on the number of occurrences, add a second number inside the braces:<br/>
 
<pre id="console">Enter your regex: a{3,6} // find at least 3 (but no more than 6) a's in a row
Enter input string to search: aaaaaaaaa
I found the text "aaaaaa" starting at index 0 and ending at index 6.
I found the text "aaa" starting at index 6 and ending at index 9.</pre>

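Here the first match is forced to stop at the upper limit of 6 characters. The second match contains the remaining three a's, which happens to be the minimum number of characters allowed for a match. If the input string were one character shorter, there would be no second match, because only two a's would remain.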
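The brace forms <code>{n}</code>, <code>{n,}</code>, and <code>{n,m}</code> can be compared by looking only at how long each match is. The sketch below assumes nothing beyond <code>java.util.regex</code>; <code>BoundedQuantifiers</code> and <code>matchLengths</code> are hypothetical names used for illustration. The last call covers the case mentioned above where the input is one character shorter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundedQuantifiers {
    // Hypothetical helper: returns the length of each successive match.
    static List<Integer> matchLengths(String regex, String input) {
        List<Integer> lengths = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            lengths.add(m.end() - m.start());
        }
        return lengths;
    }

    public static void main(String[] args) {
        String nine = "aaaaaaaaa";
        System.out.println(matchLengths("a{3}", nine));         // [3, 3, 3]
        System.out.println(matchLengths("a{3,}", nine));        // [9]
        System.out.println(matchLengths("a{3,6}", nine));       // [6, 3]
        System.out.println(matchLengths("a{3,6}", "aaaaaaaa")); // [6], only two a's remain
    }
}
```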

<div id="h3"><a name="reg5_2"></a>5.2 Capturing Groups and Character Classes with Quantifiers<span class="returnContents"><a href="#contents">Back to Contents</a></span></div>
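So far we have only tested quantifiers applied to a single character. In fact, a quantifier attaches to only one character at a time, so the regular expression <code>abc+</code> means "a, followed by b, followed by c one or more times"; it does not mean "abc" one or more times. However, quantifiers can also attach to character classes and capturing groups: <code>[abc]+</code> means a or b or c, one or more times, and <code>(abc)+</code> means the group "abc", one or more times.<br/>
Let's illustrate by requiring the group <code>(dog)</code> three times in a row.<br/>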

<pre id="console">Enter your regex: (dog){3}
Enter input string to search: dogdogdogdogdogdog
I found the text "dogdogdog" starting at index 0 and ending at index 9.
I found the text "dogdogdog" starting at index 9 and ending at index 18.

Enter your regex: dog{3}
Enter input string to search: dogdogdogdogdogdog
No match found.</pre>

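The first example finds two matches, each made up of three consecutive "dog" groups, because the quantifier applies to the entire capturing group. Remove the parentheses, however, and the quantifier <code>{3}</code> applies only to the letter "g", so the match fails.<br/>
Similarly, a quantifier can be applied to an entire character class:<br/>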

<pre id="console">Enter your regex: [abc]{3}
Enter input string to search: abccabaaaccbbbc
I found the text "abc" starting at index 0 and ending at index 3.
I found the text "cab" starting at index 3 and ending at index 6.
I found the text "aaa" starting at index 6 and ending at index 9.
I found the text "ccb" starting at index 9 and ending at index 12.
I found the text "bbc" starting at index 12 and ending at index 15.

Enter your regex: abc{3}
Enter input string to search: abccabaaaccbbbc
No match found.</pre>

In the first example the quantifier <code>{3}</code> applies to the entire character class, but in the second example it applies only to the letter "c".
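The scope of a quantifier can also be checked with <code>Pattern.matches</code>, which requires the whole input to match. The following sketch is an illustration only; <code>QuantifierScope</code> and <code>countMatches</code> are hypothetical names, and the inputs "abccc", "abcabc", "cabbac", and "doggg" are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierScope {
    // Hypothetical helper: counts how many times find() succeeds.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Without a group, + applies to "c" alone.
        System.out.println(Pattern.matches("abc+", "abccc"));    // true
        System.out.println(Pattern.matches("abc+", "abcabc"));   // false
        // With a group or a character class, + applies to the whole unit.
        System.out.println(Pattern.matches("(abc)+", "abcabc")); // true
        System.out.println(Pattern.matches("[abc]+", "cabbac")); // true

        String dogs = "dogdogdogdogdogdog";
        System.out.println(countMatches("(dog){3}", dogs));   // 2
        System.out.println(countMatches("dog{3}", dogs));     // 0
        System.out.println(countMatches("dog{3}", "doggg"));  // 1
    }
}
```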

<div id="h3"><a name="reg5_3"></a>5.3 Differences Among Greedy, Reluctant, and Possessive Quantifiers<span class="returnContents"><a href="#contents">Back to Contents</a></span></div>
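There are subtle differences among greedy, reluctant, and possessive quantifiers.<br/>
Greedy quantifiers are called "greedy" because they force the matcher to read in (or "eat") the entire input string before attempting the first match. If that first attempt (against the entire input string) fails, the matcher backs off by one character and tries again, repeating the process until a match is found or there are no more characters left to back off from. Depending on the quantifier used in the expression, the last thing it tries to match against is 1 or 0 characters.<br/>
Reluctant quantifiers take the opposite approach: they start at the beginning of the input string and reluctantly eat one character at a time while looking for a match. The last thing they try is the entire input string.<br/>
Finally, possessive quantifiers always eat the entire input string and try once (and only once) for a match. Unlike greedy quantifiers, possessive quantifiers never back off, even if doing so would allow the overall match to succeed.<br/>
To illustrate, consider the input string xfooxxxxxxfoo.<br/>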
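The three strategies just described can be observed on a short input. This is a small sketch rather than an authoritative demonstration; the class <code>QuantifierModes</code>, the method <code>findAll</code>, and the input "xfooxfoo" are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierModes {
    // Hypothetical helper: returns the text of every match, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "xfooxfoo";
        // Greedy: .* eats everything, then backs off until "foo" fits.
        System.out.println(findAll(".*foo", input));  // [xfooxfoo]
        // Reluctant: .*? grows one character at a time, stopping at the first "foo".
        System.out.println(findAll(".*?foo", input)); // [xfoo, xfoo]
        // Possessive: .*+ eats everything and never backs off, so "foo" cannot match.
        System.out.println(findAll(".*+foo", input)); // []
    }
}
```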

<pre id="console">Enter your regex: .*foo  // greedy quantifier
Enter input string to search: xfooxxxxxxfoo
I found the text "xfooxxxxxxfoo" starting at index 0 and ending at index 13.