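Master: So far we have covered character classes, negated character classes, shorthand character classes, parentheses and their various uses, quantifiers, and anchors and lookaround. Next up is a very useful regular-expression feature: match modes.
Apprentice: Oh, nice!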
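Match modes:
・Purpose: change the matching rules of certain constructs
・Forms:
  ・I: Case Insensitive
  ・S: SingleLine (dot All)
  ・M: MultiLine
  ・X: Comment
I: case-insensitive matching.
S: the dot matches any character, including line breaks.
M: ^ and $ match at the start and end of each line of text.
X: allows comments inside complex regular expressions.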
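As a rough sketch of how the four modes above map onto Java, the example below passes the corresponding java.util.regex.Pattern constants (CASE_INSENSITIVE, DOTALL, MULTILINE, COMMENTS) to Pattern.compile. The class name MatchModeOverview and the helper found are made up for illustration, and the sample strings are assumptions rather than examples from this tutorial.

```java
import java.util.regex.Pattern;

class MatchModeOverview {
    // Hypothetical helper: true if the regex is found anywhere in the input
    // when compiled with the given flags (0 means no flags).
    static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        // I: letter case is ignored
        System.out.println(found("ABC", Pattern.CASE_INSENSITIVE, "abc"));
        // S: the dot also matches the line break between "a" and "b"
        System.out.println(found("a.b", Pattern.DOTALL, "a\nb"));
        // M: ^ and $ match at the start and end of each line
        System.out.println(found("^second$", Pattern.MULTILINE, "first\nsecond"));
        // X: whitespace and #-comments inside the pattern are ignored
        System.out.println(found("a b c  # three letters", Pattern.COMMENTS, "abc"));
    }
}
```

Each call prints true with the flag shown; compiled with no flags (0), each of the four patterns fails to match the same input.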
I: Case-insensitive
・Purpose: ignore the difference between upper and lower case in English letters when matching
Example:
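import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CaseInsensitive {
    public static void main(String[] args) {
        String str = "abc";
        String regex = "ABC";
        // No flags given: matching is case-sensitive by default
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(str);
        if (m.find()) {
            System.out.println(str + " matches the regex: " + regex);
        } else {
            System.out.println(str + " does not match the regex: " + regex);
        }
    }
}
Output:
abc does not match the regex: ABC
Matching is case-sensitive by default, so the match fails.
Now specify case-insensitive mode:
Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Output:
abc matches the regex: ABC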
int java.util.regex.Pattern.CASE_INSENSITIVE = 2 [0x2]
public static final int CASE_INSENSITIVE
Enables case-insensitive matching.
By default, case-insensitive matching assumes that only characters in the US-ASCII charset are being matched. Unicode-aware case-insensitive matching can be enabled by specifying the UNICODE_CASE flag in conjunction with this flag.
Case-insensitive matching can also be enabled via the embedded flag expression (?i).
Specifying this flag may impose a slight performance penalty.
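The Javadoc text above mentions two things the earlier example does not exercise: the embedded flag (?i) and the UNICODE_CASE flag. The sketch below tries both. The class name EmbeddedCaseFlag, the helper matches, and the French sample word are assumptions chosen for illustration. Non-ASCII letters are written as \u escapes so the source file stays ASCII-only.

```java
import java.util.regex.Pattern;

class EmbeddedCaseFlag {
    // Hypothetical helper: true if the whole input matches the pattern.
    static boolean matches(Pattern p, String input) {
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        // (?i) inside the pattern acts like passing Pattern.CASE_INSENSITIVE
        System.out.println(matches(Pattern.compile("(?i)ABC"), "abc"));

        String upper = "\u00C9COLE"; // E-acute (upper case) followed by COLE
        String lower = "\u00E9cole"; // e-acute (lower case) followed by cole

        // CASE_INSENSITIVE alone folds only US-ASCII letters,
        // so the two accented letters are treated as different characters
        Pattern asciiOnly = Pattern.compile(lower, Pattern.CASE_INSENSITIVE);
        System.out.println(matches(asciiOnly, upper));

        // Adding UNICODE_CASE makes the case folding Unicode-aware
        Pattern unicodeAware = Pattern.compile(
                lower, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        System.out.println(matches(unicodeAware, upper));
    }
}
```

The three lines printed should be true, false, true: the embedded flag behaves like the constant, and the accented letter only matches across case once UNICODE_CASE is added.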
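S: Single-line mode
・Purpose: change how the dot (.) matches, so that it can also match line breaks
The earlier section on shorthand character classes noted that the dot cannot match a line break.
With mode S specified, the dot can match line breaks as well.
Example: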
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DotMatchAll {
    public static void main(String[] args) {
        // The input contains two line breaks, around "SINA"
        String str = "<a href=www.sina.com.cn>\nSINA\n</a>";
        String regex = "<a href.*</a>";
        // No flags given: the dot does not match line breaks
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(str);
        if (m.find()) {
            System.out.println(str + " matches the regex: " + regex);
        } else {
            System.out.println(str + " does not match the regex: " + regex);
        }
    }
}
Output:
<a href=www.sina.com.cn>
SINA
</a> does not match the regex: <a href.*</a>
The string contains line breaks (\nSINA\n). The dot-star (.*) cannot get past them, so the match fails.
Now change the match mode:
Pattern p = Pattern.compile(regex, Pattern.DOTALL);
Output:
<a href=www.sina.com.cn>
SINA
</a> matches the regex: <a href.*</a>
Note:
int java.util.regex.Pattern.DOTALL = 32 [0x20]
public static final int DOTALL
Enables dotall mode.
In dotall mode, the expression . matches any character, including a line terminator. By default this expression does not match line terminators.
Dotall mode can also be enabled via the embedded flag expression (?s). (The s is a mnemonic for "single-line" mode, which is what this is called in Perl.)
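The embedded flag (?s) mentioned in the note can be tried on the same input as the DotMatchAll example. The sketch below is only an illustration: the class name EmbeddedDotAll and the helper found are made up, and the last call, which combines (?i) and (?s) into (?is), is an extra case not taken from the tutorial.

```java
import java.util.regex.Pattern;

class EmbeddedDotAll {
    // Hypothetical helper: true if the regex is found anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String html = "<a href=www.sina.com.cn>\nSINA\n</a>";

        // Default mode: .* stops at the first line break, so nothing is found
        System.out.println(found("<a href.*</a>", html));

        // (?s) at the start of the pattern acts like passing Pattern.DOTALL
        System.out.println(found("(?s)<a href.*</a>", html));

        // Embedded flags can be combined: (?is) is case-insensitive plus dotall
        System.out.println(found("(?is)<A HREF.*</A>", html));
    }
}
```

The calls should print false, true, true. An embedded flag keeps the mode inside the regex string itself, which can be convenient when the pattern comes from configuration and the call to Pattern.compile cannot be changed.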