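I. Tools on Linux that are commonly used together with regular expressions
grep
sed
awk (left for you to explore on your own)
II. Regular expression syntax, using grep as the example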
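Special symbol | Meaning |
[:alnum:] | Letters (upper- and lowercase) and digits, i.e. 0-9, A-Z, a-z |
[:alpha:] | Any letter, upper- or lowercase, i.e. A-Z, a-z |
[:blank:] | The space character and the Tab character |
[:cntrl:] | Control characters, including CR, LF, Tab, Del, and so on |
[:digit:] | Digits only, i.e. 0-9 |
[:graph:] | Any visible character, i.e. everything other than whitespace (space, Tab) and control characters |
[:lower:] | Lowercase letters, i.e. a-z |
[:print:] | Any character that can be printed (including the space character) |
[:punct:] | Punctuation symbols, such as " ' ? ! ; : # $ ... |
[:upper:] | Uppercase letters, i.e. A-Z |
[:space:] | Any character that produces whitespace, including space, Tab, CR, and so on |
[:xdigit:] | Hexadecimal digits, i.e. the digits and letters 0-9, A-F, a-f |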
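The classes in the table above only work inside a bracket expression, which is why they are written with doubled brackets such as [[:digit:]]. Here is a minimal sketch of a few of them in use; the sample text and the variable names are made up for illustration and are not taken from regular_express.txt.

```shell
# Hypothetical sample input: four short lines with different kinds of characters.
sample=$(printf 'abc\nABC 123\n\t42\nhello, world!\n')

# Lines that contain at least one digit.
digits=$(printf '%s\n' "$sample" | grep -n '[[:digit:]]')

# Lines that contain at least one uppercase letter.
upper=$(printf '%s\n' "$sample" | grep -n '[[:upper:]]')

# Lines that contain at least one punctuation character.
punct=$(printf '%s\n' "$sample" | grep -n '[[:punct:]]')

# Number of lines that start with a blank (space or Tab).
blank=$(printf '%s\n' "$sample" | grep -c '^[[:blank:]]')

printf '%s\n' "$digits" "$upper" "$punct" "$blank"
```

Unlike a range such as [A-Z], a class such as [[:upper:]] does not depend on how the locale orders characters, which is the usual reason to prefer it.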
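RE character | Meaning and example |
^word | Meaning: the search string (word) must be at the start of the line. Example: find the lines that begin with # and show line numbers: grep -n '^#' regular_express.txt |
word$ | Meaning: the search string (word) must be at the end of the line. Example: print the lines that end with ! and show line numbers: grep -n '!$' regular_express.txt |
. | Meaning: exactly one arbitrary character. Example: the matched string can be (eve) (eae) (eee) (e e), but not just (ee); there must be exactly one character between the two e's, and a space counts as a character: grep -n 'e.e' reg |
\ | Meaning: escape character; removes the special meaning of a special symbol. Example: find the lines that contain a single quote ': grep -n \' regular_express.txt |
* | Meaning: repeat the preceding RE character zero or more times. Example: find strings containing (es) (ess) (esss) and so on. Because * allows zero repetitions, es also matches. Since * repeats the preceding RE character, it must come right after one; "any run of characters" is therefore written .* : grep -n 'ess*' regular_express.txt |
[list] | Meaning: a bracket expression (character set) listing the characters you want. Example: find the lines containing (gl) or (gd). Note that [] always stands for exactly one character; for example, a[afl]y matches aay, afy, or aly, i.e. [afl] means a or f or l: grep -n 'g[ld]' regular_express.txt |
[n1-n2] | Meaning: a bracket expression giving the range of characters you want. Example: find the lines containing any digit. Inside [], the minus sign - is special: it stands for all consecutive characters between the two endpoints. What counts as consecutive depends on the character encoding (for example, ASCII order), so your locale must be set correctly (in bash, check the LANG and LANGUAGE variables). All uppercase letters, for instance, is [A-Z]: grep -n '[0-9]' regular_express.txt or grep -n '[[:digit:]]' regular_express.txt |
[^list] | Meaning: a bracket expression listing the characters or ranges you do not want. Example: the matched string can be (oog) or (ood) but not (oot). A ^ inside [] means negation; for example, "not an uppercase letter" is [^A-Z]. Be careful, though: grep -n '[^A-Z]' regular_express.txt lists every line of the file. Why? Because [^A-Z] means "some non-uppercase character", and every line contains one; the first line's "Open Source", for example, has the lowercase letters p, e, n, o, and so on: grep -n 'oo[^t]' regular_express.txt |
\{n,m\} | Meaning: n to m consecutive repetitions of the preceding RE character; \{n\} means exactly n repetitions; \{n,\} means n or more repetitions. Example: strings with 2 to 3 o's between g and g, i.e. (goog) (gooog): grep -n 'go\{2,3\}g' regular_express.txt |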
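To try the basic RE characters from the table above without downloading the practice file, here is a small sketch that runs them against a few made-up lines. The sample text and the variable names are assumptions for illustration only.

```shell
# Hypothetical sample input, one word or phrase per line.
text=$(printf '# comment\ngoogle\ngooogle!\ngogle\neve\nee\n')

# ^ anchors the match at the start of the line.
starts_hash=$(printf '%s\n' "$text" | grep -n '^#')

# $ anchors the match at the end of the line.
ends_bang=$(printf '%s\n' "$text" | grep -n '!$')

# . stands for exactly one character, so "ee" alone does not match.
one_between=$(printf '%s\n' "$text" | grep -n 'e.e')

# * repeats the preceding character zero or more times: gog, goog, gooog, ...
star=$(printf '%s\n' "$text" | grep -n 'goo*g')

# \{2,3\} repeats the preceding character two or three times: goog, gooog.
two_or_three=$(printf '%s\n' "$text" | grep -n 'go\{2,3\}g')

printf '%s\n' "$starts_hash" "$ends_bang" "$one_between" "$star" "$two_or_three"
```

Each pattern is kept in single quotes so that the shell passes characters such as *, $ and \ to grep unchanged.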
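Extended RE character (egrep) | Meaning and example |
+ | Meaning: repeat the preceding RE character one or more times. Example: find the strings (god) (good) (goood) and so on. o+ means one or more o, so the command below lists lines 1, 9, and 13: egrep -n 'go+d' regular_express.txt |
? | Meaning: zero or one of the preceding RE character. Example: find the two strings (gd) and (god). o? means nothing or a single o, so the command below lists lines 13 and 14. Notice that the result sets of these two cases ('go+d' and 'go?d'), taken together, equal the result set of 'go*d'. Can you see why? egrep -n 'go?d' regular_express.txt |
| | Meaning: find several strings using or. Example: find gd or good; since this is an or, lines 1, 9, and 14 are all printed. And if you also want to find dog? egrep -n 'gd|good' regular_express.txt egrep -n 'gd|good|dog' regular_express.txt |
() | Meaning: group a substring. Example: find the two strings (glad) or (good). Because g and d are shared, la and oo can be placed inside () and separated with |: egrep -n 'g(la|oo)d' regular_express.txt |
()+ | Meaning: match one or more repetitions of a group. Example: print AxyzxyzxyzxyzC with echo and then search it as follows: echo 'AxyzxyzxyzxyzC'|egrep 'A(xyz)+C' This looks for a string that starts with A, ends with C, and has one or more 'xyz' in between. |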
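The extended operators above can be tried on a handful of made-up words, as sketched here. This sketch uses grep -E, which accepts the same extended syntax as egrep (newer GNU grep releases describe egrep as obsolescent); the word list and the variable names are assumptions for illustration.

```shell
# Hypothetical word list, one word per line.
words=$(printf 'gd\ngod\ngood\nglad\ndog\nAxyzxyzC\n')

# + : one or more o between g and d.
plus=$(printf '%s\n' "$words" | grep -E -n 'go+d')

# ? : zero or one o between g and d.
opt=$(printf '%s\n' "$words" | grep -E -n 'go?d')

# | : any one of several alternatives.
alt=$(printf '%s\n' "$words" | grep -E -n 'gd|good|dog')

# () : group the alternatives that differ, keep the shared g and d outside.
group=$(printf '%s\n' "$words" | grep -E -n 'g(la|oo)d')

# ()+ : one or more repetitions of the whole group xyz.
repeat=$(printf '%s\n' "$words" | grep -E -n 'A(xyz)+C')

printf '%s\n' "$plus" "$opt" "$alt" "$group" "$repeat"
```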
III. The sed tool
1) Adding and deleting by line
Delete lines 2-5
[dalianmao@localhost ~]$ nl regular_express.txt |sed '2,5d'
1 "Open Source" is a good mechanism to develop programs.
6 GNU is free air not free beer.
7 Her hair is very beauty.
8 I can't finish the test.
9 Oh! The soup taste good.
10 motorcycle is cheap than car.
11 This window is clear.
12 the symbol '*' is represented as start.
13 Oh! My god!
14 The gd software is a library for drafting programs.
15 You are the best is mean you are the no. 1.
16 The world is the same with "glad".
17 I like dog.
18 google is the best tools for search keyword.
19 goooooogle yes!
20 go! go! Let's go.
21 # I am VBird
Delete line 6
[dalianmao@localhost ~]$ nl regular_express.txt |sed '6d'
1 "Open Source" is a good mechanism to develop programs.
2 apple is my favorite food.
3 Football game is not use feet only.
4 this dress doesn't fit me.
5 However, this dress is about $ 3183 dollars.
7 Her hair is very beauty.
8 I can't finish the test.
9 Oh! The soup taste good.
10 motorcycle is cheap than car.
11 This window is clear.
12 the symbol '*' is represented as start.
13 Oh! My god!
14 The gd software is a library for drafting programs.
15 You are the best is mean you are the no. 1.
16 The world is the same with "glad".
17 I like dog.
18 google is the best tools for search keyword.
19 goooooogle yes!
20 go! go! Let's go.
21 # I am VBird
Delete from line 6 through the last line
[dalianmao@localhost ~]$ nl regular_express.txt |sed '6,$d'
1 "Open Source" is a good mechanism to develop programs.
2 apple is my favorite food.
3 Football game is not use feet only.
4 this dress doesn't fit me.
5 However, this dress is about $ 3183 dollars.
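The three address forms used with d above (a range, a single line, and a range ending at $ for the last line) can be compared side by side on a short made-up input. The seven sample lines and the variable names are assumptions for illustration.

```shell
# Hypothetical seven-line input.
lines=$(printf 'one\ntwo\nthree\nfour\nfive\nsix\nseven\n')

# Delete lines 2 through 5.
del_range=$(printf '%s\n' "$lines" | sed '2,5d')

# Delete only line 6.
del_one=$(printf '%s\n' "$lines" | sed '6d')

# Delete from line 6 to the end; $ means the last line.
del_tail=$(printf '%s\n' "$lines" | sed '6,$d')

printf '%s\n---\n' "$del_range" "$del_one" "$del_tail"
```

The single quotes around '6,$d' matter: inside double quotes the shell would try to expand $d as a variable.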
After line 2 (that is, as a new third line), add the text dalianmao
[dalianmao@localhost ~]$ nl regular_express.txt |sed '2a dalianmao'
1 "Open Source" is a good mechanism to develop programs.
2 apple is my favorite food.
dalianmao
3 Football game is not use feet only.
4 this dress doesn't fit me.
5 However, this dress is about $ 3183 dollars.
6 GNU is free air not free beer.
7 Her hair is very beauty.
8 I can't finish the test.
9 Oh! The soup taste good.
10 motorcycle is cheap than car.
11 This window is clear.
12 the symbol '*' is represented as start.
13 Oh! My god!
14 The gd software is a library for drafting programs.
15 You are the best is mean you are the no. 1.
16 The world is the same with "glad".
17 I like dog.
18 google is the best tools for search keyword.
19 goooooogle yes!
20 go! go! Let's go.
21 # I am VBird
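Append three lines as follows; every added line except the last ends with \ to continue onto the next line (> is the shell's continuation prompt)
[dalianmao@localhost ~]$ nl regular_express.txt |sed '2a dalianmao\
> 23:root\
> 45:fidss'
1 "Open Source" is a good mechanism to develop programs.
2 apple is my favorite food.
dalianmao
23:root
45:fidss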
3 Football game is not use feet only.
4 this dress doesn't fit me.
5 However, this dress is about $ 3183 dollars.
6 GNU is free air not free beer.
7 Her hair is very beauty.
8 I can't finish the test.
9 Oh! The soup taste good.
10 motorcycle is cheap than car.
11 This window is clear.
12 the symbol '*' is represented as start.
13 Oh! My god!
14 The gd software is a library for drafting programs.
15 You are the best is mean you are the no. 1.
16 The world is the same with "glad".
17 I like dog.
18 google is the best tools for search keyword.
19 goooooogle yes!
20 go! go! Let's go.
21 # I am VBird
Note: to add the text before the line instead, replace a with i.
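Here is a small sketch of a and i next to each other, plus a multi-line append. It assumes GNU sed, whose one-line form (text on the same line as a or i) is what the transcripts above use; the sample lines, the added text, and the variable names are made up for illustration.

```shell
# Hypothetical three-line input.
lines=$(printf 'one\ntwo\nthree\n')

# a: the new text appears after line 2.
appended=$(printf '%s\n' "$lines" | sed '2a added after line 2')

# i: the new text appears before line 2.
inserted=$(printf '%s\n' "$lines" | sed '2i added before line 2')

# Several lines at once: end every added line except the last with a backslash.
multi=$(printf '%s\n' "$lines" | sed '2a first extra line\
second extra line')

printf '%s\n---\n' "$appended" "$inserted" "$multi"
```

The backslashes only continue the text onto the next line; they do not appear in the output.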
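2) Replacing and printing by line
sed [-nefri] [action]
Options and parameters:
-n : silent mode. Normally sed prints every line it reads from STDIN to the screen. With -n, only the lines (or actions) that sed has specially processed are printed.
-e : apply the sed action given directly on the command line;
-f : read the sed actions from a file, as in -f filename;
-r : use extended regular expression syntax (the default is basic regular expression syntax);
-i : modify the file that is read in place, instead of printing the result to the screen.
Action syntax: [n1[,n2]]function
n1,n2: optional; they usually select the lines the action applies to. For example, to act on lines 10 through 20, write 10,20[action].
The functions are:
a: append. The string after a appears on a new line (after the current line).
c: change. The string after c replaces the lines between n1 and n2.
d: delete. Because it deletes, nothing normally follows d;
i: insert. The string after i appears on a new line (before the current line);
s: substitute. Performs search and replace directly, usually together with a regular expression, for example '1,20s/old/new/g'. If new comes from a shell variable, put the whole script in double quotes so that the shell expands the variable.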
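The remark about double quotes for the s function is easiest to see by comparing the two quoting styles. In this sketch the variable new, the sample line, and the result names are all hypothetical.

```shell
# Hypothetical replacement value and input line.
new='NEW'
line='old text, old habits'

# Without g, only the first match on each line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/old/new/')

# With g, every match on the line is replaced.
all=$(printf '%s\n' "$line" | sed 's/old/new/g')

# Double quotes: the shell expands $new before sed sees the script.
with_var=$(printf '%s\n' "$line" | sed "s/old/$new/g")

# Single quotes: sed receives the literal text $new.
single=$(printf '%s\n' "$line" | sed 's/old/$new/g')

printf '%s\n' "$first_only" "$all" "$with_var" "$single"
```

If the variable may contain / or &, the substitution needs extra care, because sed treats both specially inside an s command.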
Replace lines 2-5 with 'No 2-5 number'
[dalianmao@localhost ~]$ nl regular_express.txt |sed '2,5c No 2-5 number'
1 "Open Source" is a good mechanism to develop programs.
No 2-5 number
6 GNU is free air not free beer.
7 Her hair is very beauty.
8 I can't finish the test.
9 Oh! The soup taste good.
10 motorcycle is cheap than car.
11 This window is clear.
12 the symbol '*' is represented as start.
13 Oh! My god!
14 The gd software is a library for drafting programs.
15 You are the best is mean you are the no. 1.
16 The world is the same with "glad".
17 I like dog.
18 google is the best tools for search keyword.
19 goooooogle yes!
20 go! go! Let's go.
21 # I am VBird
Print lines 5-7
[dalianmao@localhost ~]$ nl regular_express.txt |sed -n '5,7p'
5 However, this dress is about $ 3183 dollars.
6 GNU is free air not free beer.
7 Her hair is very beauty.
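The c and p functions shown above can be checked on a short made-up input. The last command leaves out -n on purpose to show why p is normally paired with it. GNU sed is assumed for the one-line form of c; the sample lines and the variable names are hypothetical.

```shell
# Hypothetical five-line input.
lines=$(printf 'one\ntwo\nthree\nfour\nfive\n')

# c: the whole range 2-4 is replaced by a single line of text.
changed=$(printf '%s\n' "$lines" | sed '2,4c lines 2-4 replaced')

# -n with p: print only the selected lines.
printed=$(printf '%s\n' "$lines" | sed -n '2,3p')

# p without -n: the selected lines are printed twice, all others once.
doubled=$(printf '%s\n' "$lines" | sed '2,3p')

printf '%s\n---\n' "$changed" "$printed" "$doubled"
```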
3) Searching and replacing part of the data
sed 's/string to be replaced/new string/g'
Extract the IP address
[dalianmao@localhost ~]$ /sbin/ifconfig eth0|grep 'inet addr'|sed 's/^.*addr://g'|sed 's/Bcast.*$//g'
192.168.235.129
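The pipeline above relies on the older ifconfig output format, in which the address follows the text inet addr: and is followed by Bcast:. Newer versions of ifconfig print the line differently, so the patterns may need adjusting on a current system. The sketch below applies the same two substitutions to a hard-coded sample line, so it runs without a network interface; the sample line and the variable names are assumptions that mimic the old format.

```shell
# Hypothetical line in the old ifconfig format.
sample='          inet addr:192.168.235.129  Bcast:192.168.235.255  Mask:255.255.255.0'

# First expression: drop everything up to and including "addr:".
# Second expression: drop "Bcast" and everything after it,
# along with the whitespace in front of it.
ip=$(printf '%s\n' "$sample" | sed -e 's/^.*addr://' -e 's/[[:space:]]*Bcast.*$//')

printf '%s\n' "$ip"
```

Both substitutions run in one sed process here by giving -e twice, instead of piping one sed into another.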
Keep only the lines that contain MAN, but drop the comments that start with # as well as blank lines
[dalianmao@localhost ~]$ cat /etc/man.config |grep 'MAN'|sed 's/#.*$//g'|sed '/^$/d'
MANPATH /usr/man
MANPATH /usr/share/man
MANPATH /usr/local/man
MANPATH /usr/local/share/man
MANPATH /usr/X11R6/man
MANPATH_MAP /bin /usr/share/man
MANPATH_MAP /sbin /usr/share/man
MANPATH_MAP /usr/bin /usr/share/man
MANPATH_MAP /usr/sbin /usr/share/man
MANPATH_MAP /usr/local/bin /usr/local/share/man
MANPATH_MAP /usr/local/sbin /usr/local/share/man
MANPATH_MAP /usr/X11R6/bin /usr/X11R6/man
MANPATH_MAP /usr/bin/X11 /usr/X11R6/man
MANPATH_MAP /usr/bin/mh /usr/share/man
MANSECT 1:1p:8:2:3:3p:4:5:6:7:9:0p:n:l:p:o:1x:2x:3x:4x:5x:6x:7x:8x
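/etc/man.config does not exist on every system, so here is a sketch of the same idea on a made-up configuration snippet: keep the lines that mention MAN, strip comments, then delete the lines that end up empty. The sample content and the variable names are assumptions.

```shell
# Hypothetical configuration text with comments and a blank line.
config=$(printf '# MANPATH is set below\nMANPATH /usr/man   # default\n\nMANSECT 1:8\nNOCOLOR yes\n')

# grep keeps lines containing MAN (including the comment line).
# First sed expression: remove a comment and the whitespace before it.
# Second sed expression: delete lines that are now empty.
cleaned=$(printf '%s\n' "$config" | grep 'MAN' | sed -e 's/[[:space:]]*#.*$//' -e '/^$/d')

printf '%s\n' "$cleaned"
```

The comment-only line passes grep because it contains MAN; it disappears only after the first expression empties it and the second deletes it.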
4) Modifying the file content directly (dangerous operation!)
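In regular_express.txt, change the final . of every line that ends with . into ! (the lines that still end with . in the output below presumably end with a DOS carriage return, so . is not their last character)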
[dalianmao@localhost ~]$ sed -i 's/\.$/\!/g' regular_express.txt
[dalianmao@localhost ~]$ cat regular_express.txt
"Open Source" is a good mechanism to develop programs!
apple is my favorite food!
Football game is not use feet only!
this dress doesn't fit me!
However, this dress is about $ 3183 dollars.
GNU is free air not free beer.
Her hair is very beauty.
I can't finish the test.
Oh! The soup taste good.
motorcycle is cheap than car!
This window is clear!
the symbol '*' is represented as start!
Oh! My god!
The gd software is a library for drafting programs.
You are the best is mean you are the no. 1!
The world is the same with "glad"!
I like dog!
google is the best tools for search keyword!
goooooogle yes!
go! go! Let's go!
# I am VBird
Use sed to append # Hi,dalianmao as the last line of regular_express.txt directly
[dalianmao@localhost ~]$ sed -i '$a # Hi,dalianmao' regular_express.txt
[dalianmao@localhost ~]$ cat regular_express.txt
"Open Source" is a good mechanism to develop programs!
apple is my favorite food!
Football game is not use feet only!
this dress doesn't fit me!
However, this dress is about $ 3183 dollars.
GNU is free air not free beer.
Her hair is very beauty.
I can't finish the test.
Oh! The soup taste good.
motorcycle is cheap than car!
This window is clear!
the symbol '*' is represented as start!
Oh! My god!
The gd software is a library for drafting programs.
You are the best is mean you are the no. 1!
The world is the same with "glad"!
I like dog!
google is the best tools for search keyword!
goooooogle yes!
go! go! Let's go!
# I am VBird
# Hi,dalianmao
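Because -i overwrites the file, one way to lower the risk is to give -i a suffix, which makes GNU sed keep a copy of the original under that name. This sketch works on a temporary file rather than on regular_express.txt; the .bak suffix, the file content, and the variable names are assumptions for illustration.

```shell
# Work on a throwaway file so nothing important is modified.
tmp=$(mktemp)
printf 'first line.\nsecond line\nthird line.\n' > "$tmp"

# -i.bak edits the file in place and saves the original as "$tmp.bak" (GNU sed).
sed -i.bak 's/\.$/!/' "$tmp"

# $a appends after the last line; no suffix here, so the backup is left untouched.
sed -i '$a # appended' "$tmp"

edited=$(cat "$tmp")
backup=$(cat "$tmp.bak")
rm -f "$tmp" "$tmp.bak"

printf '%s\n---\n%s\n' "$edited" "$backup"
```

Running the same command without -i first and reading the output on screen is another cheap way to check a substitution before making it permanent.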
Note: you can download the practice file yourself with wget http://linux.vbird.org/linux_basic/0330regularex/regular_express.txt