The three quantifier modes, greedy, reluctant, and possessive, are described in detail at http://docs.oracle.com/javase/tutorial/essential/regex/quant.html.
A greedy quantifier first eats the entire input string; if the overall match fails, it backs off one character at a time until the match succeeds, and then does the same for the remaining input. A reluctant quantifier eats "nothing" at first; if the match fails, it eats one more character at a time until the match succeeds, and then does the same for the remaining input. A possessive quantifier eats the entire input like a greedy one, but never backs off a single character, even if that makes the overall match fail.
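To make the difference concrete, here is a small sketch that runs the same input through one quantifier of each kind. The input string and the three patterns follow the example in the Oracle tutorial linked above; the class name `QuantifierDemo` and the helper `findAll` are just names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: collects every match of regex in input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy: .* eats everything, then backs off until "foo" fits.
        System.out.println(findAll(".*foo", input));   // [xfooxxxxxxfoo]
        // Reluctant: .*? eats as little as possible, so there are two matches.
        System.out.println(findAll(".*?foo", input));  // [xfoo, xxxxxxfoo]
        // Possessive: .*+ eats everything and never backs off, so "foo" cannot match.
        System.out.println(findAll(".*+foo", input));  // []
    }
}
```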
About the "Special constructs (named-capturing and non-capturing)" listed in the JDK API:
(?:X) matches X but does not capture it as a group.
Pattern p = Pattern.compile("windows(?:98)(OS)");
String str = "windows98OS";
Matcher m = p.matcher(str);
while (m.find()) {
    System.out.println(m.group(1));     // text captured by group 1
    System.out.println(m.groupCount()); // number of capturing groups in the pattern
}
Output:
OS
1
Note that group(0) stands for the entire match and is not counted in groupCount(). The code above shows that 98 was not captured as a group.
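The API heading also mentions named-capturing groups, written (?<name>X), which are available since Java 7. As a rough sketch, the group name `version` and the class `NamedGroupDemo` below are made up for illustration; the point is that a named group still counts as a capturing group, while (?:X) does not.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupDemo {
    // One named capturing group ("version") and one non-capturing group.
    static final Pattern P = Pattern.compile("windows(?<version>\\d+)(?:OS)");

    // Returns the text captured by the named group, or null if there is no match.
    static String version(String input) {
        Matcher m = P.matcher(input);
        return m.find() ? m.group("version") : null;
    }

    // Number of capturing groups in the pattern; (?:OS) is not counted.
    static int groupCount() {
        return P.matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(version("windows98OS")); // 98
        System.out.println(groupCount());           // 1
    }
}
```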
(?=X) is a positive lookahead: X must be present right after the current position.
// positive lookahead
Pattern p = Pattern.compile("windows(?=98)");
String str = "windows98OS";
Matcher m = p.matcher(str);
while (m.find()) {
    System.out.println(m.group());      // the whole match, without the lookahead text
    System.out.println(m.groupCount()); // the lookahead is not a capturing group
}
Output:
windows
0
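Here (?=98) only acts as a condition and does not consume any characters, so 98 is still available to the next match attempt.
windows(?=98) means that windows must be followed by 98, otherwise there is no match, but 98 itself is not part of the matched text.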
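One way to see that the lookahead consumes nothing is to use the same pattern in a replacement: only the text that was actually matched gets replaced. This is a minimal sketch; the class name `LookaheadDemo` and the method `shorten` are hypothetical names chosen for the example.

```java
class LookaheadDemo {
    // Replaces "windows" only when it is followed by "98". The lookahead
    // matches no characters itself, so "98" survives the replacement.
    static String shorten(String input) {
        return input.replaceAll("windows(?=98)", "win");
    }

    public static void main(String[] args) {
        System.out.println(shorten("windows98OS")); // win98OS
        System.out.println(shorten("windows95OS")); // windows95OS (condition fails, nothing replaced)
    }
}
```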
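(?!X) is a negative lookahead: "negative" means negation, i.e. X must not follow the current position.
(?<=X) is a positive lookbehind: it looks at the text before the current position, and X must be there.
(?<!X) is a negative lookbehind: it looks at the text before the current position, and X must not be there.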
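Pattern p = Pattern.compile("windows98(?<!95)");
String str = "windows98OS";
Matcher m = p.matcher(str);
while (m.find()) {
    // The two characters before the lookbehind position are "98", not "95", so the match succeeds.
    System.out.println(m.group());      // prints windows98
    System.out.println(m.groupCount()); // prints 0
}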
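Note: lookaround constructs such as (?=X), (?!X), (?<=X) and (?<!X) consume no characters, so the text they test is not included in the match or in any group; they are purely conditions.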
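The four lookaround forms can be compared side by side with a quick sketch like the one below. The class `LookaroundDemo` and the helper `found` are hypothetical names for this illustration, and the test strings are my own variations on the windows98OS example.

```java
import java.util.regex.Pattern;

class LookaroundDemo {
    // Hypothetical helper: true if regex matches somewhere in input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Positive lookahead: "windows" must be followed by "98".
        System.out.println(found("windows(?=98)", "windows98OS"));  // true
        // Negative lookahead: "windows" must NOT be followed by "98".
        System.out.println(found("windows(?!98)", "windows98OS"));  // false
        System.out.println(found("windows(?!98)", "windows95OS"));  // true
        // Positive lookbehind: "98" must be preceded by "windows".
        System.out.println(found("(?<=windows)98", "windows98OS")); // true
        // Negative lookbehind: "98" must NOT be preceded by "windows".
        System.out.println(found("(?<!windows)98", "windows98OS")); // false
        System.out.println(found("(?<!windows)98", "98OS"));        // true
    }
}
```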
There is one more, (?>X): "X, as an independent, non-capturing group". I don't know how to use it; if anyone does, please leave me a comment.
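As far as I understand it, (?>X) is what other regex flavors call an atomic group: once X has matched, the engine throws away the alternatives inside the group and will not backtrack into it. The sketch below compares it with an ordinary non-capturing group; the patterns, the class name `AtomicGroupDemo` and the helper `found` are my own illustration, not taken from the JDK docs.

```java
import java.util.regex.Pattern;

class AtomicGroupDemo {
    // Hypothetical helper: true if regex matches somewhere in input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Ordinary group: "bc" is tried first, the final "c" fails, so the engine
        // backtracks into the group, tries "b", and the final "c" then matches.
        System.out.println(found("a(?:bc|b)c", "abc"));  // true
        // Atomic group: after "bc" matches, the "b" alternative is discarded,
        // so when the final "c" fails there is nothing to fall back to.
        System.out.println(found("a(?>bc|b)c", "abc"));  // false
        // With an extra "c" the first alternative is enough, so both forms match.
        System.out.println(found("a(?>bc|b)c", "abcc")); // true
    }
}
```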
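JS regex references: http://www.jb51.net/article/28007.htm http://www.2cto.com/kf/201204/128406.html http://blog.csdn.net/ethanq/article/details/6869055
Here is a question someone asked in a chat group: given a string, if it ends with digits, return all of the trailing digits; if it does not, return 0. For example, 4323Fasfa143 returns 143, and fasfa343a returns 0.
Below are the Java and JS versions.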
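String str = "4324Ad34FDAsd243";
// Reluctant \w*? eats as little as possible, so group 2 gets all the trailing digits.
Pattern p = Pattern.compile("^(\\w*?)(\\d+)$");
Matcher m = p.matcher(str);
boolean found = false;
while (m.find()) {
    found = true;
    System.out.println(m.group());  // the whole string
    System.out.println(m.group(2)); // the trailing digits
}
if (!found) {
    System.out.println(0);
}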
<SCRIPT>
var reg = /^([\W\w]*?)(\d+)$/ig;
var str = '432fdsafa243fafa1343';
var arr = reg.exec(str);
// exec returns null when the string does not end with digits
alert(arr ? arr[2] : 0);
</SCRIPT>
In the JS version, remember to check whether reg.exec returned null; null means the string does not end with digits.
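For comparison, a shorter Java variant is sketched below. It relies on find() instead of matching the whole string, so no prefix group is needed; this also handles inputs containing characters outside \w (such as a hyphen), which the `^(\w*?)(\d+)$` pattern above would reject. The class and method names `TrailingDigits` and `trailingDigits` are made up for this example, and returning the result as a String is my own choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrailingDigits {
    // One or more digits anchored at the end of the input.
    private static final Pattern TRAILING = Pattern.compile("\\d+$");

    // Returns the run of digits at the end of input, or "0" if there is none.
    static String trailingDigits(String input) {
        Matcher m = TRAILING.matcher(input);
        // find() scans left to right, so the first successful start position
        // is the beginning of the full trailing digit run.
        return m.find() ? m.group() : "0";
    }

    public static void main(String[] args) {
        System.out.println(trailingDigits("4323Fasfa143")); // 143
        System.out.println(trailingDigits("fasfa343a"));    // 0
        System.out.println(trailingDigits("a-b12"));        // 12
    }
}
```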