Extracting a field (column) from a line by its content

1. Extracting by delimiter
For example, extracting fields from a log with a fixed format:
# cat myfile
2012-01-12 23:00:00 org.umessage.subway.ClientLog--->uuid:123212342,pid:ewldkjf,PUB_ID:3837482931,phone:372901111
2012-01-12 23:00:00 org.umessage.subway.ClientLog--->uuid:123212342,pid:ewldkjf,PUB_ID:3837482931,phone:372901111
2012-01-12 23:00:00 org.umessage.subway.ClientLog--->uuid:123212342,pid:ewldkjf,PUB_ID:3837482931,phone:372901111

# cat myfile | awk -F'ClientLog--->' '{print $2}' | awk -F',' '{print $1":"$3}' | awk -F':' '{print $2","$4}'
123212342,3837482931
123212342,3837482931
123212342,3837482931
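
The three-stage pipeline above can also be collapsed into a single awk call. The following is only a sketch under the assumption that every line has exactly the layout shown in the sample: splitting on both ':' and ',' means the two colons in the timestamp shift the field numbers, so the uuid value ends up in field 4 and the PUB_ID value in field 8. The function name extract_ids is made up for this illustration.

```shell
#!/bin/sh
# One sample line copied from the log above.
line='2012-01-12 23:00:00 org.umessage.subway.ClientLog--->uuid:123212342,pid:ewldkjf,PUB_ID:3837482931,phone:372901111'

# extract_ids (hypothetical helper): split each input line on ':' or ','.
# With this fixed layout, $4 holds the uuid value and $8 the PUB_ID value.
extract_ids() {
    awk -F'[:,]' '{print $4 "," $8}'
}

printf '%s\n' "$line" | extract_ids
```

This only works while the field order never changes; if one key moves, the field numbers no longer line up.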


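2. Extracting by content
For example, to extract the uuid and PUB_ID values from the myfile log, where the field lengths are fixed but their positions vary:
# cat myfile
2012-01-12 23:00:00 org.umessage.subway.ClientLog--->uuid:123212342,pid:ewldkjf,PUB_ID:3837482931,phone:372901111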
2012-01-12 23:00:00 org.umessage.subway.ClientLog--->uuid:123212342,pid:ewldkjf,PUB_ID:3837482931,phone:372901111
2012-01-12 23:00:00 org.umessage.subway.ClientLog--->uuid:123212342,pid:ewldkjf,PUB_ID:3837482931,phone:372901111
2012-01-12 23:00:00 org.umessage.subway.ClientLog--->uuid:123212342,pid:ewldkjf,phone:372901111,PUB_ID:3837482931,
2012-01-12 23:00:00 org.umessage.subway.ClientLog--->uuid:123212342,pid:ewldkjf,PUB_ID:3837482931,phone:372901111

# cat myfile | sed 's/.*uuid:\(.........\).*PUB_ID:\(..........\).*/\1,\2/'
123212342,3837482931
123212342,3837482931
123212342,3837482931
123212342,3837482931
123212342,3837482931

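Explanation of the sed command: every line that matches .*uuid:\(.........\).*PUB_ID:\(..........\).* is replaced with the text captured inside the parentheses, and the result is printed.
The parts of the command are separated by /. s means substitute; \1 refers to the text captured by the first \( \) group in the regular expression being replaced, and in general \n (n being a digit from 1 to 9) refers to the text of the n-th group; . matches any single character, and * matches the preceding item any number of times, including zero.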
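
As a sketch of how the capture groups can be loosened, the variant below replaces the fixed runs of dots with [^,]*, which captures everything up to the next comma, so the value lengths no longer need to be known. It still assumes uuid appears before PUB_ID on the line. The second function drops that assumption by checking each comma-separated key:value pair in awk. Both function names are made up for this illustration, and the sample lines are copied from the log above.

```shell
#!/bin/sh
# Two lines from the log above; in the second one PUB_ID comes after phone.
sample='2012-01-12 23:00:00 org.umessage.subway.ClientLog--->uuid:123212342,pid:ewldkjf,PUB_ID:3837482931,phone:372901111
2012-01-12 23:00:00 org.umessage.subway.ClientLog--->uuid:123212342,pid:ewldkjf,phone:372901111,PUB_ID:3837482931,'

# extract_ids_sed (hypothetical helper): \([^,]*\) captures a value of any
# length up to the next comma. Assumes uuid precedes PUB_ID on the line.
extract_ids_sed() {
    sed 's/.*uuid:\([^,]*\).*PUB_ID:\([^,]*\).*/\1,\2/'
}

# extract_ids_awk (hypothetical helper): order-independent variant that
# inspects every comma-separated field and picks out the two keys.
extract_ids_awk() {
    awk -F',' '{
        uuid = ""; pub = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /uuid:/)    { v = $i; sub(/.*uuid:/, "", v);   uuid = v }
            if ($i ~ /^PUB_ID:/) { v = $i; sub(/^PUB_ID:/, "", v); pub = v }
        }
        print uuid "," pub
    }'
}

printf '%s\n' "$sample" | extract_ids_sed
printf '%s\n' "$sample" | extract_ids_awk
```

The sed form stays closest to the original command; the awk form is longer but does not depend on the order of the keys.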

References: http://www.iteye.com/topic/587673
            http://wenku.baidu.com/view/bd0c155e312b3169a451a4de.html
