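Uses of parentheses:
1. Limiting the scope of alternation;
2. Grouping several characters into a single unit, so that a quantifier such as the question mark or star applies to the whole unit;
3. Backreferences, written as the metacharacter sequences 「\1」, 「\2」, and so on.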
For example:
% egrep -i '\<([a-z]+) +\1\>' files…
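The three uses of parentheses can be tried out in a short sketch. It uses Python's `re` module rather than egrep, which is an assumption of this example and not something the notes prescribe: Python writes both word boundaries 「\<」 and 「\>」 as 「\b」, and `re.IGNORECASE` stands in for the `-i` option. The patterns `GREY` and `FOURTH`, the helper `find_doubled_words`, and the sample strings are made up for illustration.

```python
import re

# 1. Parentheses limit the scope of alternation: only "e" or "a" alternates.
GREY = re.compile(r'gr(e|a)y')

# 2. Parentheses group characters so that "?" applies to the whole unit "th".
FOURTH = re.compile(r'4(th)?')

# 3. \1 matches again whatever text the first set of parentheses matched.
DOUBLED = re.compile(r'\b([a-z]+) +\1\b', re.IGNORECASE)


def find_doubled_words(text):
    """Return each word that appears twice in a row in text."""
    return [m.group(1) for m in DOUBLED.finditer(text)]


print(GREY.fullmatch('grey') is not None, GREY.fullmatch('gray') is not None)
print(FOURTH.fullmatch('4') is not None, FOURTH.fullmatch('4th') is not None)
print(find_doubled_words('This is the the test'))  # ['the']
```

Note that "This is" is not reported: the leading word boundary prevents the "is" at the end of "This" from counting as a word of its own.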
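Escaping: outside a character class, a backslash placed before a metacharacter removes its special meaning and turns it into an ordinary character.
For example:
「\.」: an escaped dot, which matches a literal period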
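A minimal sketch of the difference between the bare dot and the escaped dot, again assuming Python's `re` module in place of egrep; the sample text is invented.

```python
import re

text = 'version 1.5 or 1x5'

# Unescaped, the dot is a metacharacter and matches any one character.
any_char = re.findall(r'1.5', text)

# Escaped, the dot is an ordinary character and matches only a period.
literal_dot = re.findall(r'1\.5', text)

print(any_char)     # ['1.5', '1x5']
print(literal_dot)  # ['1.5']
```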
1.5.1 A Few More Examples
1.5.1.1 A string within double quotes
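A simple solution to matching a string within double quotes might be: 「"[^"]*"」
The quotes at either end match the opening and closing quotes of the string. The text between them may contain any character except a double quote, so we use 「[^"]」 to match any one character other than a double quote, and 「*」 to allow any number of such characters between the two quotes.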
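The quoted-string pattern can be checked with a small sketch. Python's `re` module is an assumption here (the notes use egrep), and the helper name `quoted_strings` and the sample text are hypothetical.

```python
import re

# Opening quote, any number of non-quote characters, closing quote.
QUOTED = re.compile(r'"[^"]*"')


def quoted_strings(text):
    """Return every double-quoted string found in text, quotes included."""
    return QUOTED.findall(text)


print(quoted_strings('say "hello" and "" then "a b"'))
# ['"hello"', '""', '"a b"']
```

Because 「*」 also allows zero characters, the empty string "" is matched. This simple pattern makes no attempt to handle an escaped quote inside the string.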
1.5.2.1 Regex
"Regular expression" is commonly shortened to "regex".
1.5.2.2 Matching
The regex 「a」 does not match the word cat as a whole, but it does match the a inside cat.
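The two senses of "match" can be made concrete in a short sketch. It assumes Python's `re` module, where `fullmatch` requires the regex to cover the entire string and `search` looks for a match anywhere inside it.

```python
import re

# The regex must account for the whole string: "a" is not all of "cat".
whole = re.fullmatch(r'a', 'cat')

# The regex may match anywhere inside the string: the "a" in "cat".
inside = re.search(r'a', 'cat')

print(whole)                          # None
print(inside.group(), inside.span())  # a (1, 2)
```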
1.5.2.3 Metacharacter
A metacharacter has its special meaning only when it appears outside a character class and is not escaped.
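As an illustration of how context decides whether a character is a metacharacter, the sketch below runs three patterns over the same invented string. Python's `re` module is assumed; for these patterns it behaves as described here.

```python
import re

text = 'a ab abb a*'

# Outside a class and unescaped, * is a quantifier: "a" then any number of "b".
as_quantifier = re.findall(r'ab*', text)

# Inside a character class, * is just a literal asterisk.
in_class = re.findall(r'a[*]', text)

# Escaped outside a class, * is also a literal asterisk.
escaped = re.findall(r'a\*', text)

print(as_quantifier)  # ['a', 'ab', 'abb', 'a']
print(in_class)       # ['a*']
print(escaped)        # ['a*']
```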
1.5.2.4 Flavor
We mainly cover the Perl flavor.
1.5.2.5 Subexpression
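A "subexpression" is a part of the overall regular expression, usually the expression inside a pair of parentheses or one of the alternatives separated by 「|」 in an alternation.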
1.5.2.6 Character
A byte in the ASCII encoding.
A summary of the egrep metacharacters.
Table 1-3. Egrep Metacharacter Summary

Metacharacter | Name | Matches

Items to Match a Single Character
「.」 | dot | Matches any one character
「[…]」 | character class | Matches any one character listed
「[^…]」 | negated character class | Matches any one character not listed
「\char」 | escaped character | When char is a metacharacter, or the escaped combination is not otherwise special, matches the literal char

Items Appended to Provide "Counting": The Quantifiers
「?」 | question | One allowed, but it is optional
「*」 | star | Any number allowed, but all are optional
「+」 | plus | At least one required; additional are optional
「{min,max}」 | specified range† | Min required, max allowed

Items That Match a Position
「^」 | caret | Matches the position at the start of the line
「$」 | dollar | Matches the position at the end of the line
「\<」 | word boundary† | Matches the position at the start of a word
「\>」 | word boundary† | Matches the position at the end of a word

Other
「|」 | alternation | Matches either expression it separates
「(…)」 | parentheses | Limits the scope of alternation, provides grouping for the quantifiers, and "captures" for backreferences
「\1」, 「\2」, ... | backreference† | Matches text previously matched within the first, second, etc., set of parentheses
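
A few rows of the summary can be exercised in one short sketch. It assumes Python's `re` module instead of egrep, so the word boundaries 「\<」 and 「\>」 are both written 「\b」; the sample strings are invented.

```python
import re

# Quantifiers
optional_u = re.fullmatch(r'colou?r', 'color') is not None   # ? : "u" is optional
two_or_three = re.findall(r'\b[0-9]{2,3}\b', '7 42 123 4567')  # {min,max}
one_or_more = re.findall(r'o+', 'food moon no')               # + : at least one

# Positions
lines = ['cat food', 'bobcat', 'concat(x)']
at_start = [s for s in lines if re.search(r'^cat', s)]     # start of line
at_end = [s for s in lines if re.search(r'cat$', s)]       # end of line
as_word = [s for s in lines if re.search(r'\bcat\b', s)]   # whole word only

print(optional_u)    # True
print(two_or_three)  # ['42', '123']
print(one_or_more)   # ['oo', 'oo', 'o']
print(at_start)      # ['cat food']
print(at_end)        # ['bobcat']
print(as_word)       # ['cat food']
```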