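I have been reading up on Java regular expressions lately. The two most important classes in the java.util.regex package are Pattern and Matcher.
The Pattern class can be thought of as turning the regular expression string we write into a compiled Pattern object in Java.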
/**
* Compiles the given regular expression into a pattern.
*
* @param regex
* The expression to be compiled
*
* @return the given regular expression compiled into a pattern
* @throws PatternSyntaxException
* If the expression's syntax is invalid
*/
public static Pattern compile(String regex) {
    return new Pattern(regex, 0);
}
/**
* Compiles the given regular expression into a pattern with the given
* flags.
*
* @param regex
* The expression to be compiled
*
* @param flags
* Match flags, a bit mask that may include
* {@link #CASE_INSENSITIVE}, {@link #MULTILINE}, {@link #DOTALL},
* {@link #UNICODE_CASE}, {@link #CANON_EQ}, {@link #UNIX_LINES},
* {@link #LITERAL}, {@link #UNICODE_CHARACTER_CLASS}
* and {@link #COMMENTS}
*
* @return the given regular expression compiled into a pattern with the given flags
* @throws IllegalArgumentException
* If bit values other than those corresponding to the defined
* match flags are set in flags
*
* @throws PatternSyntaxException
* If the expression's syntax is invalid
*/
public static Pattern compile(String regex, int flags) {
    return new Pattern(regex, flags);
}
Pattern pattern = Pattern.compile(regex, flags);
Passing different flags controls different matching behavior. For the parameter that corresponds to each flag, see the detailed analysis.
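To make the effect of the flags argument concrete, here is a minimal sketch. The class name PatternFlagsDemo, its helper methods, and the sample strings are made up for illustration; the flags themselves (CASE_INSENSITIVE, DOTALL) are the ones listed in the Javadoc above.

```java
import java.util.regex.Pattern;

class PatternFlagsDemo {
    // No flags: matching is case-sensitive.
    static boolean matchesDefault(String input) {
        return Pattern.compile("hello").matcher(input).matches();
    }

    // CASE_INSENSITIVE: letters match regardless of case.
    static boolean matchesIgnoringCase(String input) {
        return Pattern.compile("hello", Pattern.CASE_INSENSITIVE).matcher(input).matches();
    }

    // The flags form a bit mask, so several can be combined with |.
    // DOTALL lets '.' also match line terminators.
    static boolean matchesAcrossLines(String input) {
        Pattern p = Pattern.compile("hello.world", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesDefault("HELLO"));            // false
        System.out.println(matchesIgnoringCase("HELLO"));       // true
        System.out.println(matchesAcrossLines("Hello\nWORLD")); // true
    }
}
```

Because the flags are plain int constants, combining them is just a bitwise OR; passing 0 is equivalent to calling the one-argument compile.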
/**
* Splits the given input sequence around matches of this pattern.
*
* The array returned by this method contains each substring of the
* input sequence that is terminated by another subsequence that matches
* this pattern or is terminated by the end of the input sequence. The
* substrings in the array are in the order in which they occur in the
* input. If this pattern does not match any subsequence of the input then
* the resulting array has just one element, namely the input sequence in
* string form.
*
* When there is a positive-width match at the beginning of the input
* sequence then an empty leading substring is included at the beginning
* of the resulting array. A zero-width match at the beginning however
* never produces such empty leading substring.
*
* The limit parameter controls the number of times the
* pattern is applied and therefore affects the length of the resulting
* array. If the limit n is greater than zero then the pattern
* will be applied at most n - 1 times, the array's
* length will be no greater than n, and the array's last entry
* will contain all input beyond the last matched delimiter. If n
* is non-positive then the pattern will be applied as many times as
* possible and the array can have any length. If n is zero then
* the pattern will be applied as many times as possible, the array can
* have any length, and trailing empty strings will be discarded.
*
* The input "boo:and:foo", for example, yields the following
* results with these parameters:
*
*   Regex   Limit   Result
*   :       2       { "boo", "and:foo" }
*   :       5       { "boo", "and", "foo" }
*   :       -2      { "boo", "and", "foo" }
*   o       5       { "b", "", ":and:f", "", "" }
*   o       -2      { "b", "", ":and:f", "", "" }
*   o       0       { "b", "", ":and:f" }
*
* @param input
* The character sequence to be split
*
* @param limit
* The result threshold, as described above
*
* @return The array of strings computed by splitting the input
* around matches of this pattern
*/
public String[] split(CharSequence input, int limit) {
    int index = 0;
    boolean matchLimited = limit > 0;
    ArrayList<String> matchList = new ArrayList<>();
    Matcher m = matcher(input);
    // Add segments before each match found
    while(m.find()) {
        if (!matchLimited || matchList.size() < limit - 1) {
            if (index == 0 && index == m.start() && m.start() == m.end()) {
                // no empty leading substring included for zero-width match
                // at the beginning of the input char sequence.
                continue;
            }
            String match = input.subSequence(index, m.start()).toString();
            matchList.add(match);
            index = m.end();
        } else if (matchList.size() == limit - 1) { // last one
            String match = input.subSequence(index,
                                             input.length()).toString();
            matchList.add(match);
            index = m.end();
        }
    }
    // If no match was found, return this
    if (index == 0)
        return new String[] {input.toString()};
    // Add remaining segment
    if (!matchLimited || matchList.size() < limit)
        matchList.add(input.subSequence(index, input.length()).toString());
    // Construct result
    int resultSize = matchList.size();
    if (limit == 0)
        while (resultSize > 0 && matchList.get(resultSize-1).equals(""))
            resultSize--;
    String[] result = new String[resultSize];
    return matchList.subList(0, resultSize).toArray(result);
}
/**
* Splits the given input sequence around matches of this pattern.
*
* This method works as if by invoking the two-argument {@link
* #split(java.lang.CharSequence, int) split} method with the given input
* sequence and a limit argument of zero. Trailing empty strings are
* therefore not included in the resulting array.
*
* The input "boo:and:foo", for example, yields the following
* results with these expressions:
*
*   Regex   Result
*   :       { "boo", "and", "foo" }
*   o       { "b", "", ":and:f" }
*
* @param input
* The character sequence to be split
*
* @return The array of strings computed by splitting the input
* around matches of this pattern
*/
public String[] split(CharSequence input) {
    return split(input, 0);
}
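String[] strArr = pattern.split(input, limit);
The split() method of the Pattern class splits the input around matches of the regular expression and returns the pieces as a String array.
Interestingly, the String class has a split method of its own, and for most expressions it simply delegates to Pattern's split. In other words, String reuses part of Pattern's functionality and exposes it through a more convenient API.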
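The limit behavior is easiest to see by running the "boo:and:foo" cases from the Javadoc. The following is a small sketch; the class name SplitDemo and the helper splitWith are hypothetical names introduced only for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    // Hypothetical helper: compile the regex and split the input with the given limit.
    static String[] splitWith(String regex, String input, int limit) {
        return Pattern.compile(regex).split(input, limit);
    }

    public static void main(String[] args) {
        String input = "boo:and:foo";
        // limit > 0: at most limit entries, the last one keeps the rest of the input.
        System.out.println(Arrays.toString(splitWith(":", input, 2)));  // [boo, and:foo]
        // limit < 0: split as often as possible and keep trailing empty strings.
        System.out.println(Arrays.toString(splitWith("o", input, -2))); // [b, , :and:f, , ]
        // limit == 0: split as often as possible and drop trailing empty strings.
        System.out.println(Arrays.toString(splitWith("o", input, 0)));  // [b, , :and:f]
        // String.split yields the same array as the Pattern version with limit 0.
        System.out.println(Arrays.equals(input.split("o"), splitWith("o", input, 0))); // true
    }
}
```

When the same expression is used many times, keeping the compiled Pattern around and calling its split avoids recompiling the regex on every call.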
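The Matcher class holds the result and state of a match. The workflow is: first compile the regex string into a Pattern, then create a Matcher from that Pattern for the input, and finally read whatever you need from the Matcher.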
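The compile, then matcher, then read workflow can be sketched as follows. The class MatcherWorkflowDemo, the helper findNumbers, and the sample sentence are assumptions made for illustration; find() and group() are standard Matcher methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherWorkflowDemo {
    // Hypothetical helper: collect every run of digits found in the input.
    static List<String> findNumbers(String input) {
        Pattern pattern = Pattern.compile("\\d+");  // 1. compile the regex string
        Matcher matcher = pattern.matcher(input);   // 2. create a Matcher for this input
        List<String> numbers = new ArrayList<>();
        while (matcher.find()) {                    // 3. advance to the next match...
            numbers.add(matcher.group());           //    ...and read it from the Matcher
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(findNumbers("order 42 shipped in 3 days")); // [42, 3]
    }
}
```

The Matcher is stateful: each call to find() continues from where the previous match ended, which is why the loop visits the matches in order.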
/**
* Attempts to match the entire region against the pattern.
*
* If the match succeeds then more information can be obtained via the
* start, end, and group methods.
*
* @return true if, and only if, the entire region sequence
* matches this matcher's pattern
*/
public boolean matches() {
    return match(from, ENDANCHOR);
}
boolean result = matcher.matches(); // validates the whole string against the pattern
Basic usage:
System.out.println(Pattern.compile("regex").matcher("input to match").matches());
An example I have used at work:
public static boolean checkEmail(String email) {
    String regex = "\\w+@\\w+\\.[a-z]+(\\.[a-z]+)?";
    return Pattern.matches(regex, email);
}
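Since matches() requires the entire input to match, checkEmail rejects strings that merely contain an address. The sketch below shows that difference and one possible variation: compiling the expression once into a constant instead of calling Pattern.matches on every check. The class name EmailCheckDemo, the constant EMAIL, and the helper containsEmail are hypothetical; the expression is the one from checkEmail above.

```java
import java.util.regex.Pattern;

class EmailCheckDemo {
    // Same expression as checkEmail above, compiled once and reused.
    private static final Pattern EMAIL =
            Pattern.compile("\\w+@\\w+\\.[a-z]+(\\.[a-z]+)?");

    // matches(): the whole string must be an address.
    static boolean checkEmail(String email) {
        return EMAIL.matcher(email).matches();
    }

    // find(): it is enough for an address to appear somewhere in the text.
    static boolean containsEmail(String text) {
        return EMAIL.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(checkEmail("user@example.com"));             // true
        System.out.println(checkEmail("user@example.co.uk"));           // true
        System.out.println(checkEmail("contact: user@example.com"));    // false
        System.out.println(containsEmail("contact: user@example.com")); // true
    }
}
```

Pattern.matches(regex, input) is convenient for one-off checks, but it compiles the expression on each call, so a shared Pattern constant is a reasonable choice when the check runs often.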