下面为在Editplus中使用内置的正则表达式的帮助
Regular expression is a search string that contains normal text plus special characters which indicate extended searching options. Regular expression allows more sophisticated search and replace.
For example, you can find any digit by using regular expression "[0-9]". Similarly you can find any matching character that is NOT digit by using regular expression "[^0-9]".
EditPlus supports following regular expressions in Find, Replace and Find in Files command.
Expression Description
\t Tab character.
\n New line.
. Matches any character.
| Either expression on its left and right side matches the target string. For example, "a|b" matches "a" and "b".
[] Any of the enclosed characters may match the target character. For example, "[ab]" matches "a" and "b". "[0-9]" matches any digit.
[^] None of the enclosed characters may match the target character. For example, "[^ab]" matches all character EXCEPT "a" and "b". "[^0-9]" matches any non-digit character.
* Character to the left of asterisk in the expression should match 0 or more times. For example "be*" matches "b", "be" and "bee".
+ Character to the left of plus sign in the expression should match 1 or more times. For example "be+" matches "be" and "bee" but not "b".
? Character to the left of question mark in the expression should match 0 or 1 time. For example "be?" matches "b" and "be" but not "bee".
^ Expression to the right of ^ matches only when it is at the beginning of line. For example "^A" matches an "A" that is only at the beginning of line.
$ Expression to the left of $ matches only when it is at the end of line. For example "e$" matches an "e" that is only at the end of line.
() Affects evaluation order of expression and also used for tagged expression.
\ Escape character. If you want to use character "\" itself, you should use "\\".
The tagged expression is enclosed by (). Tagged expressions can be referenced by \0, \1, \2, \3, etc. \0 indicates a tagged expression representing the entire substring that was matched. \1 indicates the first tagged expression, \2 is the second, etc. See following examples.
Original Search Replace Result
abc (ab)(c) \0-\1-\2 abc-ab-c
abc a(b)(c) \0-\1-\2 abc-b-c
abc (a)b(c) \0-\1-\2 abc-a-c
比较常见的正则表达式匹配元素
^ 行首,在[^]表示非指定字符集
$ 行尾
\b \B 表示单词的边界
[] 表示从中选择一个字符
\d 任意数字
\s 空格
\w 词的字符类,如果a-zA-Z 0-9 _等
\D \S \W以上的取反结果
* 一般情况下表示0次或多次出现,也可以用于选择匹配模式的贪婪和懒惰匹配
+ 表示出现一次或者多次,必须存在
? 表示出现一次或不出现,同样可以用于匹配贪婪与懒惰模式
{m,n} 表示允许出现最小m次,最多n次
{m,} 表示最小m,无上限
{m} 表示必须出现m次
| 表示从两个匹配冲选择一个,注意该匹配的优先级很低
() 简单的使用,就是用于匹配结果进行分组提取,以及配合*?+等进行多条件组合匹配
i m g 常用于在匹配正则表示式时,进行多行,忽略大小写的设置
高级使用 (?...) 等
(?:reg) 表示进行分组,当不会包含在匹配的捕获分组内,成为非捕获分组
(?=reg) 具体用途未详
(?!reg) 表示该处不包含reg所匹配的内容
(?>reg) 用于阻止回朔.可以提高性能,具体用途未详
常见语言中使用正则的方法
JavaScript:
主要为RegExp对象的test,exec以及字符串中的match,split,search,replace方法
声明一个正则表达式的方式有
1; var patten=/\d+/
2: var patten = new RegExp(\d+)
test 方法主要用于判断是否指定字符串是否符合正则条件,根据条件,默认只要字符串中部分完全匹配正则即可,然后返回布尔值
alert(patten.test('23123')) -----true
exec 方法用于进行字符串匹配,然后返回匹配分组后的数组,如果没有匹配结果,将会返回null
alert(patten.test('23123')) ------23123,注意为数组 alert的时候会自动转换成字符串
match方法,来自字符串对象,效果如同exec方法一样
alert('23123'.match(d))
其他字符串方法不做介绍
Java:
主要使用java.util.regex.Pattern和java.util.regex.Matcher;这两个类,同样也可以在字符串方法使用正则表达式
构建一个Pattern,注意可以使用参数重载,进行模式的设置,如换行等,使用Pattern中的常量
常见的使用方法
Pattern ptn = Pattern.compile("\\d");
String str = "213123";
Matcher mather = ptn.matcher(str);
while(mather.find()){
System.out.println(mather.group());
}
其中group方法,可以通过分组取出不同的值,通过传入参数index实现
Python与Groovy待补充