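sed comes up constantly in shell work. The name is short for stream editor and, as it suggests, the tool processes text. The official description reads: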
“sed (stream editor) isn't an interactive text editor. Instead, it is used to filter text, i.e., it takes text input, performs some operation (or set of operations) on it, and outputs the modified text. sed is typically used for extracting part of a file using pattern matching or substituting multiple occurrences of a string within a file.”
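sed processes its input one line at a time. The sections below cover several common ways to use it.
Assume the file being processed is sedexample.txt, with the following content:
This is my dog,
whose name is Frank.
This is my fish,
whose name is George.
This is my goat,
whose name is Adam.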
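Replace my with her:
[root@localhost sedtest]# sed 's/my/her/g' sedexample.txt
This is her dog,
whose name is Frank.
This is her fish,
whose name is George.
This is her goat,
whose name is Adam.
Here s is the substitute command. It can be limited to specific lines by an address; for example, 2,$s applies the substitution from the second line through the last line. /my/ is the pattern to match and supports regular expressions. her/ is the replacement text. The g flag replaces every match on a line; a number (1, 2, ..., N) in its place replaces only the Nth match on each line.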
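The address and flag rules just described can be tried side by side. The following is only a sketch: it recreates the sample file in a temporary directory (the variable names tmpdir, first_only and every_match are made up for illustration) and runs the same substitution over lines 2 to the last line, once without a flag and once with g.

```shell
# Recreate the article's sample file in a temporary directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sedexample.txt" <<'EOF'
This is my dog,
whose name is Frank.
This is my fish,
whose name is George.
This is my goat,
whose name is Adam.
EOF

# No flag: only the first match on each addressed line (2 through last) changes.
first_only=$(sed '2,$s/is/IS/' "$tmpdir/sedexample.txt")

# g flag: every match on each addressed line changes.
every_match=$(sed '2,$s/is/IS/g' "$tmpdir/sedexample.txt")

printf '%s\n' "$first_only"
printf '%s\n' '---'
printf '%s\n' "$every_match"

# Clean up the temporary directory.
rm -rf "$tmpdir"
```

Line 1 is outside the address range and stays untouched in both runs. On line 3 the first match is the is inside This, so the unflagged run yields ThIS is my fish, while the g run also changes the word is and the is inside fish.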
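Replace the second is on lines 3 through 5 with abc. Note that the is inside This counts as the first match, so the standalone word is is the second one, and line 4 is unchanged because it contains only one match:
[root@localhost sedtest]# sed '3,5s/is/abc/2' sedexample.txt
This is my dog,
whose name is Frank.
This abc my fish,
whose name is George.
This abc my goat,
whose name is Adam.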
Use -e or a semicolon (;) to separate multiple substitution expressions.
For example, to replace the second is on lines 3 through 5 with abc and also replace name on the last line with pet:
[root@localhost sedtest]# sed '3,5s/is/abc/2; $s/name/pet/g' sedexample.txt
This is my dog,
whose name is Frank.
This abc my fish,
whose name is George.
This abc my goat,
whose pet is Adam.
or, equivalently:
[root@localhost sedtest]# sed -e '3,5s/is/abc/2' -e '$s/name/pet/g' sedexample.txt
This is my dog,
whose name is Frank.
This abc my fish,
whose name is George.
This abc my goat,
whose pet is Adam.
3. Deletion
Use d to delete the lines that match an address.
For example, delete the lines containing whose:
[root@localhost sedtest]# sed '/whose/d' sedexample.txt
This is my dog,
This is my fish,
This is my goat,
Delete lines 1, 2 and 3:
[root@localhost sedtest]# sed '1,3d' sedexample.txt
whose name is George.
This is my goat,
whose name is Adam.
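The text matched by a parenthesized part of the regular expression can be referred to in the replacement as \1, \2, and so on. In sed's default basic regular expression syntax the grouping parentheses are written \( and \).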
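A small sketch of such back-references, using one line from the sample file as input. The variable names basic and extended are made up for illustration, and the second command assumes a sed that accepts -E for extended regular expressions (recent GNU and BSD versions do), where the parentheses are written without backslashes.

```shell
# Basic regular expressions: groups are written \( ... \).
# \1 holds "name", \2 holds the captured word after "is".
basic=$(printf '%s\n' 'whose name is Frank.' |
  sed 's/whose \(name\) is \([A-Za-z]*\)/\2 is the \1/')

# Extended regular expressions (-E): groups are written ( ... ).
extended=$(printf '%s\n' 'whose name is Frank.' |
  sed -E 's/whose (name) is ([A-Za-z]+)/\2 is the \1/')

printf '%s\n' "$basic"
printf '%s\n' "$extended"
```

Both commands print Frank is the name. because the two captured pieces are emitted in swapped order and the trailing period, which is outside the match, is left in place.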
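5. Printing
To output only specific lines, use p together with -n, which suppresses sed's default output of every line.
For example, to print only the last line:
[root@localhost sedtest]# sed -n '$p' sedexample.txt
whose name is Adam.
Print only the lines containing Frank:
[root@localhost sedtest]# sed -n '/Frank/p' sedexample.txt
whose name is Frank.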
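To see why -n matters in the commands above, here is a rough comparison of p with and without it. The temporary directory and the variable names only_range and doubled are made up for illustration.

```shell
# Recreate the article's sample file in a temporary directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sedexample.txt" <<'EOF'
This is my dog,
whose name is Frank.
This is my fish,
whose name is George.
This is my goat,
whose name is Adam.
EOF

# With -n only the lines explicitly printed by p appear.
only_range=$(sed -n '2,3p' "$tmpdir/sedexample.txt")

# Without -n every line is printed once by default,
# and lines 2 and 3 are printed a second time by p.
doubled=$(sed '2,3p' "$tmpdir/sedexample.txt")

# Count the lines in each result.
printf '%s\n' "$only_range" | grep -c ''
printf '%s\n' "$doubled" | grep -c ''

# Clean up the temporary directory.
rm -rf "$tmpdir"
```

The first count is 2 and the second is 8: the six original lines plus the two duplicated ones.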
6. Inserting a line before and appending a line after
Insert before: i (insert)
Append after: a (append)
Insert #start before the first line:
[root@localhost sedtest]# sed '1i #start' sedexample.txt
#start
This is my dog,
whose name is Frank.
This is my fish,
whose name is George.
This is my goat,
whose name is Adam.
Append #end after the last line:
[root@localhost sedtest]# sed '$a #end' sedexample.txt
This is my dog,
whose name is Frank.
This is my fish,
whose name is George.
This is my goat,
whose name is Adam.
#end
Insert the line #like it before the line containing fish:
[root@localhost sedtest]# sed '/fish/i #like it' sedexample.txt
This is my dog,
whose name is Frank.
#like it
This is my fish,
whose name is George.
This is my goat,
whose name is Adam.
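The one-line forms of i and a used above are a GNU sed convenience. As a sketch for other sed implementations, the traditional spelling puts a backslash after the command letter and the text on the following line. The temporary directory and the variable name framed are made up for illustration.

```shell
# Recreate the article's sample file in a temporary directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sedexample.txt" <<'EOF'
This is my dog,
whose name is Frank.
This is my fish,
whose name is George.
This is my goat,
whose name is Adam.
EOF

# Traditional syntax: "i\" or "a\", then the text on the next line.
# The single quotes keep the shell from expanding $a.
framed=$(sed '1i\
#start
$a\
#end' "$tmpdir/sedexample.txt")

printf '%s\n' "$framed"

# Clean up the temporary directory.
rm -rf "$tmpdir"
```

This combines the #start and #end examples in one script: the output is the six original lines with #start before the first and #end after the last.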
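To replace whole lines, use the c (change) command.
Replace lines 2 through 5 with This is my fox (the whole range is replaced by a single copy of the text):
[root@localhost sedtest]# sed '2,5c This is my fox' sedexample.txt
This is my dog,
This is my fox
whose name is Adam.
Replace the line containing fish with ######:
[root@localhost sedtest]# sed '/fish/c ######' sedexample.txt
This is my dog,
whose name is Frank.
######
whose name is George.
This is my goat,
whose name is Adam.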
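None of the operations above modify the source file sedexample.txt; they only write the result to standard output. To edit the file in place, use the -i option.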
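Since -i overwrites the file, it can be worth keeping a backup. The sketch below attaches a suffix directly to -i, which both GNU and BSD sed accept; the .bak suffix, the temporary directory and the variable names are choices made for this illustration, not something the original text prescribes. Note that BSD/macOS sed requires a suffix argument after -i, so a bare -i as used with GNU sed behaves differently there.

```shell
# Work on a throwaway copy so nothing real is modified.
workdir=$(mktemp -d)
printf '%s\n' 'This is my dog,' 'whose name is Frank.' > "$workdir/sedexample.txt"

# Edit in place; the original content is kept in sedexample.txt.bak.
sed -i.bak 's/my/her/g' "$workdir/sedexample.txt"

edited=$(cat "$workdir/sedexample.txt")
backup=$(cat "$workdir/sedexample.txt.bak")

printf 'edited:\n%s\n' "$edited"
printf 'backup:\n%s\n' "$backup"

# Clean up the temporary directory.
rm -rf "$workdir"
```

After the command runs, the edited file contains This is her dog, while the .bak file still holds the original This is my dog, line.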
Only the most common usages are listed here; for more, see:
http://www.grymoire.com/Unix/Sed.html
http://www.gnu.org/software/sed/manual/sed.html
Exercise: given an XML file, how do you strip the <tag> markers and show only the content?
<head>
<title>Colombia Earthquake</title>
</head>
<body>
<headline>
<hl1>143 Dead in Colombia Earthquake</hl1>
</headline>
<byline>
<bytag>By Jared Kotler, Associated Press Writer</bytag>
</byline>
<dateline>
<location>Bogota, Colombia</location>
<date>Monday January 25 1999 7:28 ET</date>
</dateline>
</body>
Expected output:
Colombia Earthquake
143 Dead in Colombia Earthquake
By Jared Kotler, Associated Press Writer
Bogota, Colombia
Monday January 25 1999 7:28 ET
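Answer: sed 's/<[^>]*>//g' file
The pattern <[^>]*> matches a <, any run of characters other than >, and the closing >, so each tag is replaced with nothing. If some tags sit on lines of their own, those lines are left behind as empty lines.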
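A possible way to try the answer and also drop the lines that end up empty is sketched below. It assumes each tag or element sits on its own line, which is a guess about the file's layout; the shortened sample, the file name news.xml and the variable name stripped are made up for illustration.

```shell
# Write a shortened version of the exercise input to a temporary file.
tmpdir=$(mktemp -d)
cat > "$tmpdir/news.xml" <<'EOF'
<head>
<title>Colombia Earthquake</title>
</head>
<body>
<headline>
<hl1>143 Dead in Colombia Earthquake</hl1>
</headline>
</body>
EOF

# First expression: remove every <...> tag.
# Second expression: delete lines that are now empty or whitespace only.
stripped=$(sed -e 's/<[^>]*>//g' -e '/^[[:space:]]*$/d' "$tmpdir/news.xml")

printf '%s\n' "$stripped"

# Clean up the temporary directory.
rm -rf "$tmpdir"
```

For this input the output is the two content lines, Colombia Earthquake and 143 Dead in Colombia Earthquake. Because sed works line by line, a tag that is split across two lines would not be matched by this pattern.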