Enter input string to search: 5
No match found.
Enter your regex: [0-4[6-8]]
Enter input string to search: 6
I found the text "6" starting at index 0 and ending at index 1.
Enter your regex: [0-4[6-8]]
Enter input string to search: 8
I found the text "8" starting at index 0 and ending at index 1.
Enter your regex: [0-4[6-8]]
Enter input string to search: 9
No match found.</pre>
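<div id="h4"><a name="reg3_1_4"></a>3.1.4 Intersection<span class="returnContents"><a href="#contents">Back to contents</a></span></div>
To create a single character class that matches only the characters common to all of its nested classes, use <code>&&</code>, as in <code>[0-9&&[345]]</code>. This particular <em>intersection</em> creates a single character class that matches only the digits common to both classes: 3, 4, and 5.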
<pre id="console">Enter your regex: [0-9&&[345]]
Enter input string to search: 3
I found the text "3" starting at index 0 and ending at index 1.
Enter your regex: [0-9&&[345]]
Enter input string to search: 4
I found the text "4" starting at index 0 and ending at index 1.
Enter your regex: [0-9&&[345]]
Enter input string to search: 5
I found the text "5" starting at index 0 and ending at index 1.
Enter your regex: [0-9&&[345]]
Enter input string to search: 2
No match found.
Enter your regex: [0-9&&[345]]
Enter input string to search: 6
No match found.</pre>
The following example shows the intersection of two ranges:
<pre id="console">Enter your regex: [2-8&&[4-6]]
Enter input string to search: 3
No match found.
Enter your regex: [2-8&&[4-6]]
Enter input string to search: 4
I found the text "4" starting at index 0 and ending at index 1.
Enter your regex: [2-8&&[4-6]]
Enter input string to search: 5
I found the text "5" starting at index 0 and ending at index 1.
Enter your regex: [2-8&&[4-6]]
Enter input string to search: 6
I found the text "6" starting at index 0 and ending at index 1.
Enter your regex: [2-8&&[4-6]]
Enter input string to search: 7
No match found.</pre>
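The two console sessions above can be reproduced programmatically. The following is a minimal sketch, not part of the original test harness: the class name <code>IntersectionDemo</code> and the helper <code>matches</code> are hypothetical names introduced for illustration. It checks single-character inputs against both intersection patterns.

```java
import java.util.regex.Pattern;

public class IntersectionDemo {
    // Hypothetical helper: true if the entire input matches the regex.
    public static boolean matches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Intersection of a range with an explicit set: only 3, 4, 5 match.
        for (String s : new String[] {"2", "3", "4", "5", "6"}) {
            System.out.println("[0-9&&[345]] vs " + s + ": " + matches("[0-9&&[345]]", s));
        }
        // Intersection of two ranges: only 4, 5, 6 match.
        for (String s : new String[] {"3", "4", "5", "6", "7"}) {
            System.out.println("[2-8&&[4-6]] vs " + s + ": " + matches("[2-8&&[4-6]]", s));
        }
    }
}
```

Because <code>&&</code> sits inside a Java string literal here without any backslash, no extra escaping is needed for these two patterns.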
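<div id="h4"><a name="reg3_1_5"></a>3.1.5 Subtraction<span class="returnContents"><a href="#contents">Back to contents</a></span></div>
Finally, you can use <em>subtraction</em> to negate one or more nested character classes, as in <code>[0-9&&[^345]]</code>. This example creates a single character class that matches every digit from 0 to 9 except 3, 4, and 5.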
<pre id="console">Enter your regex: [0-9&&[^345]]
Enter input string to search: 2
I found the text "2" starting at index 0 and ending at index 1.
Enter your regex: [0-9&&[^345]]
Enter input string to search: 3
No match found.
Enter your regex: [0-9&&[^345]]
Enter input string to search: 4
No match found.
Enter your regex: [0-9&&[^345]]
Enter input string to search: 5
No match found.
Enter your regex: [0-9&&[^345]]
Enter input string to search: 6
I found the text "6" starting at index 0 and ending at index 1.
Enter your regex: [0-9&&[^345]]
Enter input string to search: 9
I found the text "9" starting at index 0 and ending at index 1.</pre>
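To see the whole result of the subtraction at once, rather than one input at a time, the sketch below tests every digit against the pattern and collects the ones that match. The class name <code>SubtractionDemo</code> and the method <code>matchingDigits</code> are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;

public class SubtractionDemo {
    // Hypothetical helper: returns the digits 0-9 that the regex matches,
    // in ascending order, as a single string.
    public static String matchingDigits(String regex) {
        Pattern p = Pattern.compile(regex);
        StringBuilder sb = new StringBuilder();
        for (char c = '0'; c <= '9'; c++) {
            if (p.matcher(String.valueOf(c)).matches()) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Digits 3, 4, and 5 are subtracted from the range 0-9.
        System.out.println(matchingDigits("[0-9&&[^345]]")); // prints 0126789
    }
}
```

The same set could be written without subtraction as <code>[0-26-9]</code>; the subtraction form tends to state the intent more directly.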
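That covers how character classes are built. Before moving on to the next section, you may want to review the <a href="#fig1">character classes table</a>.
<div id="h2"><a name="reg4"></a>4 Predefined Character Classes<span class="returnContents"><a href="#contents">Back to contents</a></span></div>
The Pattern API contains a number of useful <em>predefined character classes</em>, which offer convenient shorthands for commonly used regular expressions.<br/>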
<table border="0" cellpadding="0" cellspacing="0" class="regTab" align="center">
<caption>Predefined character classes</caption>
<tr>
<td class="regCenter"><code>.</code></td>
<td>Any character (may or may not match line terminators)</td>
</tr>
<tr>
<td class="regCenter"><code>\d</code></td>
<td>A digit: <code>[0-9]</code></td>
</tr>
<tr>
<td class="regCenter"><code>\D</code></td>
<td>A non-digit: <code>[^0-9]</code></td>
</tr>
<tr>
<td class="regCenter"><code>\s</code></td>
<td>A whitespace character: <code>[ \t\n\x0B\f\r]</code></td>
</tr>
<tr>
<td class="regCenter"><code>\S</code></td>
<td>A non-whitespace character: <code>[^\s]</code></td>
</tr>
<tr>
<td class="regCenter"><code>\w</code></td>
<td>A word character: <code>[a-zA-Z_0-9]</code></td>
</tr>
<tr>
<td class="regCenter"><code>\W</code></td>
<td>A non-word character: <code>[^\w]</code></td>
</tr>
</table>
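In the table above, each construct in the left column is shorthand for the character class in the right column. For example, <code>\d</code> means a digit (0 to 9), and <code>\w</code> means a word character (any lowercase letter, any uppercase letter, the underscore, or any digit). Use the predefined classes whenever possible: they make code easier to read and help eliminate errors introduced by malformed character classes.<br/>
Constructs that begin with a backslash (<code>\</code>) are called <em>escaped constructs</em>. Escaped constructs were previewed in the <a href="#reg2">String Literals</a> section, which mentioned the backslash as well as <code>\Q</code> and <code>\E</code> for quotation. If you use an escaped construct inside a Java string literal, you must precede the backslash with another backslash for the string to compile. For example:<br/>
<pre name="java" id="java">private final String REGEX = "\\d"; // a single digit</pre>
In this example <code>\d</code> is the regular expression; the extra backslash is required only for the code to compile. The test harness, however, reads expressions directly from the console, so the extra backslash is unnecessary there.<br/>
The following examples demonstrate the use of predefined character classes:<br/>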
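The difference between the source-level string and the regex the engine actually sees can be checked directly. This is a small sketch under the assumption that the patterns are hard-coded in source; the class name <code>EscapeDemo</code> and its constants are hypothetical names for illustration.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // Source text "\\d" compiles to the two-character string \d.
    public static final String DIGIT = "\\d";
    // Source text "\\\\" compiles to the two-character string \\,
    // which as a regex matches one literal backslash.
    public static final String BACKSLASH = "\\\\";

    public static void main(String[] args) {
        System.out.println(DIGIT.length());                   // 2
        System.out.println(Pattern.matches(DIGIT, "7"));      // true
        System.out.println(Pattern.matches(BACKSLASH, "\\")); // true
    }
}
```

A regex typed at a console prompt or read from a file is not processed by the Java compiler, so it needs only the single backslash.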
<pre id="console">Enter your regex: .
Enter input string to search: @
I found the text "@" starting at index 0 and ending at index 1.
Enter your regex: .
Enter input string to search: 1
I found the text "1" starting at index 0 and ending at index 1.
Enter your regex: .
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.
Enter your regex: \d
Enter input string to search: 1
I found the text "1" starting at index 0 and ending at index 1.
Enter your regex: \d
Enter input string to search: a
No match found.
Enter your regex: \D
Enter input string to search: 1
No match found.
Enter your regex: \D
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.
Enter your regex: \s
Enter input string to search:
I found the text " " starting at index 0 and ending at index 1.
Enter your regex: \s
Enter input string to search: a
No match found.
Enter your regex: \S
Enter input string to search:
No match found.
Enter your regex: \S
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.
Enter your regex: \w
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.
Enter your regex: \w
Enter input string to search: !
No match found.
Enter your regex: \W
Enter input string to search: a
No match found.
Enter your regex: \W
Enter input string to search: !
I found the text "!" starting at index 0 and ending at index 1.</pre>
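In the first three examples, the regular expression is simply <code>.</code> (the "dot" metacharacter), which means "any character". The match therefore succeeds in all three cases (an arbitrarily chosen "@" character, a digit, and a letter). Each of the remaining examples uses a single construct from the predefined character classes table. You can refer to that table to work out the logic behind each match:<br/>
<code>\d</code> matches a digit<br/>
<code>\s</code> matches a whitespace character<br/>
<code>\w</code> matches a word character<br/>
The uppercase forms mean exactly the opposite:<br/>
<code>\D</code> matches a non-digit<br/>
<code>\S</code> matches a non-whitespace character<br/>
<code>\W</code> matches a non-word character<br/>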
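The console output above can be approximated in code. The sketch below is a simplified stand-in for the test harness, not the harness itself: it reports only the first match, and the class name <code>PredefinedDemo</code> and the method <code>describe</code> are hypothetical. Note the doubled backslashes, which are needed because the patterns appear in Java string literals.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PredefinedDemo {
    // Hypothetical helper: describes the first match in the same wording
    // as the console transcripts, or reports that nothing matched.
    public static String describe(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.find()) {
            return String.format(
                "I found the text \"%s\" starting at index %d and ending at index %d.",
                m.group(), m.start(), m.end());
        }
        return "No match found.";
    }

    public static void main(String[] args) {
        System.out.println(describe("\\d", "1")); // digit matches \d
        System.out.println(describe("\\D", "1")); // digit does not match \D
        System.out.println(describe("\\s", " ")); // space matches \s
        System.out.println(describe("\\w", "!")); // "!" is not a word character
        System.out.println(describe("\\W", "!")); // so it matches \W
    }
}
```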
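<div id="h2"><a name="reg5"></a>5 Quantifiers<span class="returnContents"><a href="#contents">Back to contents</a></span></div>
This section looks at greedy, reluctant, and possessive quantifiers, which control how many times a given expression <code>X</code> is matched.<br/>
<em>Quantifiers</em> let you specify the number of occurrences to match. For convenience, the Pattern API specification describes three kinds: greedy, reluctant, and possessive. At first glance, the quantifiers <code>X?</code>, <code>X??</code>, and <code>X?+</code> appear to do exactly the same thing, since each matches X once or not at all. There are subtle differences between them, however, which are explained near the end of this section.<br/>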
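As a brief preview of one such difference, the sketch below applies the three "once or not at all" quantifiers to the input <code>a</code> and prints the text of the first match that <code>find()</code> reports. The class name <code>QuantifierDemo</code> and the method <code>firstMatch</code> are hypothetical names for illustration, and the example covers only this one simple input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierDemo {
    // Hypothetical helper: text of the first match, or null if none.
    public static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Greedy: takes the "a" if it can.
        System.out.println("[" + firstMatch("a?", "a") + "]");  // [a]
        // Reluctant: prefers zero occurrences, so the first match is empty.
        System.out.println("[" + firstMatch("a??", "a") + "]"); // []
        // Possessive: takes the "a" and never gives it back.
        System.out.println("[" + firstMatch("a?+", "a") + "]"); // [a]
    }
}
```

All three patterns still accept the whole string <code>a</code> when checked with <code>Pattern.matches</code>; the difference shows up only in which text the engine prefers to consume first.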
<table border="0" cellpadding="0" cellspacing="0" class="regTab" align="center">
<thead>
<tr>
<td colspan="3">Quantifier type</td>
<td rowspan="2">Meaning</td>
</tr>
<tr>
<td>Greedy</td>
<td>Reluctant</td>