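UltraEdit and Unix Regular Expressions
UltraEdit allows regular expressions in many of the search and replace functions listed under the Search menu. Regular expressions let more complex search and replace operations be performed in a single step. (The Chinese-language interface labels them "正规表达式".)
Two sets of syntax are available. The first table below shows the original UltraEdit syntax used in earlier versions of UltraEdit. The second table shows the optional "Unix" style regular expressions, which can be enabled from the Configuration section.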
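Symbol Function
% Matches the start of a line. The search string must be at the beginning of a line, but the selected result does not include any line terminator characters.
$ Matches the end of a line. The search string must be at the end of a line, but the selected result does not include any line terminator characters.
? Matches any single character except newline.
* Matches any number of occurrences of any character except newline.
+ Matches one or more of the preceding character; at least one occurrence must be found.
++ Matches the preceding character zero or more times.
^b Matches a page break.
^p Matches a newline (CR/LF) (paragraph) (DOS files).
^r Matches a newline (CR only) (paragraph) (MAC files).
^n Matches a newline (LF only) (paragraph) (UNIX files).
^t Matches a tab character.
[] Matches any single character, or range, in the brackets.
^{A^}^{B^} Matches expression A or B.
^ Overrides the following regular expression character.
^(^) Brackets or tags an expression to use in the replace command.
A regular expression may have up to 9 tagged expressions, numbered according to their order in the regular expression.
The corresponding replacement expression is ^x, for x in the range 1-9. Example:
If ^(h*o^) ^(f*s^) matches "hello folks",
^2 ^1 would replace it with "folks hello".
Note: ^ refers to the literal character '^', NOT Control Key + value.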
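Examples:
m?n matches "man", "men", "min" but not "moon".
t*t matches "test", "tonight" and "tea time" (the "tea t" portion) but not "tea
time" (newline between "tea " and "time").
Te+st matches "test", "teest", "teeeest" etc. but does not match "tst".
[aeiou] matches every lowercase vowel.
[,.?] matches a literal ",", "." or "?".
[0-9a-z] matches any digit or lowercase letter.
[~0-9] matches any character except a digit (~ means NOT the following).
You may search for an expression A or B as follows:
"^{John^}^{Tom^}"
This will search for an occurrence of John or Tom. There should be nothing between the two expressions.
You may combine A or B and C or D in the same search as follows:
"^{John^}^{Tom^} ^{Smith^}^{Jones^}"
This will search for John or Tom followed by Smith or Jones.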
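The table below shows the syntax for the "Unix" style regular expressions.
Regular Expressions (Unix Syntax):
Symbol Function
\ Indicates that the next character has a special meaning. "n" on its own matches the character "n"; "\n" matches a linefeed or newline character.
^ Matches/anchors the beginning of a line.
$ Matches/anchors the end of a line.
* Matches the preceding character zero or more times.
+ Matches the preceding character one or more times.
. Matches any single character except a newline character.
(expression) Brackets or tags an expression to use in the replace command. A regular expression may have up to 9 tagged expressions, as needed. The corresponding replacement expression is \x, for x in the range 1-9.
Example:
If (h.*o) (f.*s) matches "hello folks",
\2 \1 would replace it with "folks hello".
[xyz] A character set. Matches any one of the characters between the brackets.
[^xyz] A negative character set. Matches any one character NOT between the brackets.
\d Matches a digit character. Equivalent to [0-9].
\D Matches a nondigit character. Equivalent to [^0-9].
\f Matches a form-feed character.
\n Matches a linefeed character.
\r Matches a carriage return character.
\s Matches any whitespace including space, tab, form-feed, etc. but not newline.
\S Matches any non-whitespace character but not newline.
\t Matches a tab character.
\v Matches a vertical tab character.
\w Matches any word character including underscore.
\W Matches any nonword character.
Note: ^ refers to the literal character '^', NOT Control Key + value.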
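Examples:
m.n matches "man", "men", "min" but not "moon".
t.*t matches "test", "tonight" and "tea time" (the "tea t" portion) but not "tea
time" (newline between "tea " and "time").
Te+st matches "test", "teest", "teeeest" etc. BUT NOT "tst".
Te*st matches "test", "teest", "teeeest" etc. AND "tst".
[aeiou] matches every lowercase vowel.
[,.?] matches a literal ",", "." or "?".
[0-9a-z] matches any digit or lowercase letter.
[^0-9] matches any character except a digit (^ means NOT the following).
You may search for an expression A or B as follows:
"(John|Tom)"
This will search for an occurrence of John or Tom. There should be nothing between the two alternatives other than the | character.
You may combine A or B and C or D in the same search as follows:
"(John|Tom) (Smith|Jones)"
This will search for John or Tom followed by Smith or Jones.
In addition:
\p Matches CR/LF (same as \r\n) to match a DOS line terminator.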
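When Regular Expressions is not selected for the find/replace, and in the Replace field, the following special characters are also valid:
Symbol Function
^^ Matches a "^" character.
^s Is substituted with the selected (highlighted) text of the active file window.
^c Is substituted with the contents of the clipboard.
^b Matches a page break.
^p Matches a newline (CR/LF) (paragraph) (DOS files).
^r Matches a newline (CR only) (paragraph) (MAC files).
^n Matches a newline (LF only) (paragraph) (UNIX files).
^t Matches a tab character.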
Regular Expressions
UltraEdit allows for Regular Expressions in many of its search and
replace functions listed under the Search Menu.
Regular expressions allow more complex search and replace functions
to be performed in a single operation.
There are two possible sets of syntax that may be used. The first
table below shows the original UltraEdit syntax used in earlier
versions of UltraEdit. The second table shows the optional "Unix"
style regular expressions. This may be enabled from the
Configuration Section.
Regular Expressions (UltraEdit Syntax):
Symbol
Function
%
Matches the start of line - Indicates the search string must be at
the beginning of a line but does not include any line terminator
characters in the resulting string selected.
$
Matches the end of line - Indicates the search string must be at the
end of line but does not include any line terminator characters in
the resulting string selected.
?
Matches any single character except newline.
*
Matches any number of occurrences of any character except newline.
+
Matches one or more of the preceding character/expression. At least
one occurrence of the character must be found. Does not match
repeated newlines.
++
Matches the preceding character/expression zero or more times. Does
not match repeated newlines.
^b
Matches a page break.
^p
Matches a newline (CR/LF) (paragraph) (DOS Files)
^r
Matches a newline (CR Only) (paragraph) (MAC Files)
^n
Matches a newline (LF Only) (paragraph) (UNIX Files)
^t
Matches a tab character
[ ]
Matches any single character or range in the brackets
^{A^}^{B^}
Matches expression A OR B
^
Overrides the following regular expression character
^(^)
Brackets or tags an expression to use in the replace command. A
regular expression may have up to 9 tagged expressions, numbered
according to their order in the regular expression.
The corresponding replacement expression is ^x, for x in the range
1-9. Example: If ^(h*o^) ^(f*s^) matches "hello folks", ^2 ^1 would
replace it with "folks hello".
Note - ^ refers to the character '^' NOT Control Key + value.
Examples:
m?n matches "man", "men", "min" but not "moon".
t*t matches "test", "tonight" and "tea time" (the "tea t" portion)
but not "tea
time" (newline between "tea " and "time").
Te+st matches "test", "teest", "teeeest" etc. but does not match
"tst".
[aeiou] matches every lowercase vowel
[,.?] matches a literal ",", "." or "?".
[0-9a-z] matches any digit, or lowercase letter
[~0-9] matches any character except a digit (~ means NOT the
following)
You may search for an expression A or B as follows:
"^{John^}^{Tom^}"
This will search for an occurrence of John or Tom. There should be
nothing between the two expressions.
You may combine A or B and C or D in the same search as follows:
"^{John^}^{Tom^} ^{Smith^}^{Jones^}"
This will search for John or Tom followed by Smith or Jones.
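To make the relationship between the two syntaxes concrete, here is a minimal sketch in Python that rewrites a small subset of the UltraEdit syntax above into the pattern syntax of Python's `re` module. The helper names `ue_to_python` and `ue_replacement` are hypothetical and not part of UltraEdit; Python's `re` is a different engine, so this only approximates the behavior described in the table, and symbols such as ^p, ^t and the ^ override are deliberately rejected rather than guessed at.

```python
import re

# Single-character UltraEdit symbols and a rough Python `re` equivalent.
SIMPLE = {"%": "^", "$": "$", "?": ".", "*": r"[^\n]*"}


def ue_to_python(pattern: str) -> str:
    """Translate a subset of UltraEdit-syntax patterns (hypothetical helper)."""
    out = []
    i = 0
    after_alt_close = False  # True right after a ^} token
    while i < len(pattern):
        two, ch = pattern[i:i + 2], pattern[i]
        was_alt_close, after_alt_close = after_alt_close, False
        if two == "^{":
            if was_alt_close:
                out[-1] = "|"          # ^}^{ joins two alternatives
            else:
                out.append("(?:")      # start of an A-or-B group
            i += 2
        elif two == "^}":
            out.append(")")
            after_alt_close = True
            i += 2
        elif two in ("^(", "^)"):
            out.append(two[1])         # tagged expression -> capture group
            i += 2
        elif two == "++":
            out.append("*")            # zero or more of the preceding item
            i += 2
        elif ch == "+":
            out.append("+")            # one or more of the preceding item
            i += 1
        elif ch == "[":
            end = pattern.find("]", i)
            if end == -1:
                raise ValueError("unterminated character set")
            body = pattern[i + 1:end]
            if body.startswith("~"):
                body = "^" + body[1:]  # ~ means NOT inside brackets
            out.append("[" + body + "]")
            i = end + 1
        elif ch == "^":
            raise ValueError(f"unsupported symbol {two!r} in this sketch")
        elif ch in SIMPLE:
            out.append(SIMPLE[ch])
            i += 1
        else:
            out.append(re.escape(ch))  # everything else is literal
            i += 1
    return "".join(out)


def ue_replacement(replacement: str) -> str:
    """Rewrite ^1..^9 as \\1..\\9 (hypothetical helper)."""
    return re.sub(r"\^([1-9])", r"\\\1", replacement)


print(ue_to_python("m?n"))     # m.n
print(ue_to_python("[~0-9]"))  # [^0-9]
swapped = re.sub(ue_to_python("^(h*o^) ^(f*s^)"),
                 ue_replacement("^2 ^1"), "hello folks")
print(swapped)                 # folks hello
```

Patterns that use % or $ would need `re.MULTILINE` on the Python side so that the anchors apply per line rather than to the whole string.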
The table below shows the syntax for the "Unix" style regular
expressions.
Regular Expressions (Unix Syntax):
Symbol
Function
\
Indicates the next character has a special meaning. "n" on its own
matches the character "n". "\n" matches a linefeed or newline
character. See examples below (\d, \f, \n etc).
^
Matches/anchors the beginning of line.
$
Matches/anchors the end of line.
*
Matches the preceding character zero or more times.
+
Matches the preceding character one or more times. Does not match
repeated newlines.
.
Matches any single character except a newline character. Does not
match repeated newlines.
(expression)
Brackets or tags an expression to use in the replace command. A
regular expression may have up to 9 tagged expressions, numbered
according to their order in the regular expression.
The corresponding replacement expression is \x, for x in the range
1-9. Example: If (h.*o) (f.*s) matches "hello folks", \2 \1 would
replace it with "folks hello".
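The tagged-expression swap can also be tried outside the editor. The following is a minimal sketch using Python's `re` module, which is a different engine from UltraEdit's but accepts the same `(expression)` grouping and `\1`-`\9` replacement references for this example.

```python
import re

# Two tagged expressions: group 1 is "h...o", group 2 is "f...s".
pattern = re.compile(r"(h.*o) (f.*s)")

# \2 \1 in the replacement emits the second tagged expression first.
swapped = pattern.sub(r"\2 \1", "hello folks")
print(swapped)  # folks hello
```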
[xyz]
A character set. Matches any one of the characters between the
brackets.
[^xyz]
A negative character set. Matches any one character NOT between the
brackets.
\d
Matches a digit character. Equivalent to [0-9].
\D
Matches a nondigit character. Equivalent to [^0-9].
\f
Matches a form-feed character.
\n
Matches a linefeed character.
\r
Matches a carriage return character.
\s
Matches any whitespace including space, tab, form-feed, etc but not
newline.
\S
Matches any non-whitespace character but not newline.
\t
Matches a tab character.
\v
Matches a vertical tab character.
\w
Matches any word character including underscore.
\W
Matches any nonword character.
\p
Matches CR/LF (same as \r\n) to match a DOS line terminator
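As an illustration of the line-terminator symbols, the sketch below uses Python's `re` module to find the three kinds of line ending. Python has no `\p` escape, so the equivalent `\r\n` is spelled out; the sample text is made up for the example.

```python
import re

text = "dos line\r\nunix line\nmac line\rend"

# \r\n (what the table calls \p) is listed first so that a CR/LF pair is
# matched as one terminator rather than as a CR followed by an LF.
terminator = re.compile(r"\r\n|\r|\n")

found = terminator.findall(text)
print(found)  # ['\r\n', '\n', '\r']

# Normalize every terminator to a single LF (UNIX style).
normalized = terminator.sub("\n", text)
print(normalized.split("\n"))  # ['dos line', 'unix line', 'mac line', 'end']
```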
Note - ^ refers to the character '^' NOT Control Key + value.
Examples:
m.n matches "man", "men", "min" but not "moon".
Te+st matches "test", "teest", "teeeest" etc. BUT NOT "tst".
Te*st matches "test", "teest", "teeeest" etc. AND "tst".
[aeiou] matches every lowercase vowel
[,.?] matches a literal ",", "." or "?".
[0-9a-z] matches any digit, or lowercase letter
[^0-9] matches any character except a digit (^ means NOT the
following)
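The examples above can be checked with a short script. This is a sketch using Python's `re` module rather than UltraEdit itself; the helper `matches` is a hypothetical name, and because Python matches case-sensitively by default the sample strings are capitalized to agree with the patterns.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True when the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


print(matches(r"m.n", "man"), matches(r"m.n", "moon"))       # True False
print(matches(r"Te+st", "Teest"), matches(r"Te+st", "Tst"))  # True False
print(matches(r"Te*st", "Tst"))                              # True
print(re.findall(r"[aeiou]", "regular"))                     # ['e', 'u', 'a']
print(re.findall(r"[^0-9]", "a1b2"))                         # ['a', 'b']
```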
You may search for an expression A or B as follows:
"(John|Tom)"
This will search for an occurrence of John or Tom. There should be
nothing between the two alternatives other than the | character.
You may combine A or B and C or D in the same search as follows:
"(John|Tom) (Smith|Jones)"
This will search for John or Tom followed by Smith or Jones.
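The combined alternation can be sketched the same way in Python's `re` module (again a different engine from UltraEdit's, used here only to show which names the pattern accepts). The candidate names are invented for the example.

```python
import re

name = re.compile(r"(John|Tom) (Smith|Jones)")

for candidate in ["John Smith", "Tom Jones", "John Brown", "Tim Smith"]:
    m = name.search(candidate)
    # groups() holds the two tagged alternatives when the search succeeds.
    print(candidate, "->", m.groups() if m else None)
```

Only the first two candidates produce a match, since both the first name and the surname must come from their respective lists.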
When Regular Expressions is not selected for the find/replace, and in
the Replace field, the following special characters are also valid:
Symbol
Function
^^
Matches a "^" character
^s
Is substituted with the selected (highlighted) text of the active
file window.
^c
Is substituted with the contents of the clipboard.
^b
Matches a page break
^p
Matches a newline (CR/LF) (paragraph) (DOS Files)
^r
Matches a newline (CR Only) (paragraph) (MAC Files)
^n
Matches a newline (LF Only) (paragraph) (UNIX Files)
^t
Matches a tab character
Note - ^ refers to the character '^' NOT Control Key + value.
Pasted from <http://www.niwota.com/submsg/1966636/>