I. Work experience summary
(1) Sample log:
10.100.194.39 10.100.194.39 1019-03-16T11:01:04+08:00 www.uuwatch.com^^3FF91DE01BCB49B8BD11198DB98394F8|1993969314640 GET 499 /agent-config?appId=biz.marketing&host=10.101.181.194&hostName=wg1-biz-marketing-139-1161719-11193146.c.elenet.me httphttp1.1 0.000 - 146 0 0 10.100.38.36:1891 - - - www.uuwatch.com user - wg1 wg1-channel-stable-1 Nginx 1.6.1 wg1-bjdev-pub-nginxms-1 Python-urllib/1.7 - - - www.uuwatch.com
10.100.194.39 10.100.194.39 1019-03-16T11:07:16+08:00 www.uuwatch.com^^C9398ACF399743669E71A19613D9F498|1993969646084 GET 499 /collector/tcp/cluster?cluster=wg1-collector-esm http http1.1 0.001 - 191 0 0 10.100.38.49:1891 application/json - - www.uuwatch.com user - wg1 wg1-channel-stable-1 Nginx 1.6.1wg1-bjdev-pub-nginxms-1 Go-http-client/1.1 - - - www.uuwatch.com
10.100.194.39 10.100.194.39 1019-03-16T11:11:46+08:00 www.uuwatch.com^^1319B1E9EA1A4B9A9E498317606B74B7|1993969906741 GET 499 /agent-config?appId=arch.waf_collector&host=10.101.130.196 http http1.1 0.000 - 169 0 0 10.100.40.17:1891 - - - www.uuwatch.com user - wg1 wg1-channel-stable-1 Nginx 1.6.1 wg1-bjdev-pub-nginxms-1 Go-http-client/1.1 - - - www.uuwatch.com
Print only the third column through the last column of the log:
<1> cut -f 3- demo (cut splits fields on tabs by default)
Reference: https://stackoverflow.com/questions/1602035/how-to-print-third-column-to-last-column
1019-03-16T11:01:04+08:00 www.uuwatch.com^^3FF91DE01BCB49B8BD11198DB98394F8|1993969314640 GET 499 /agent-config?appId=biz.marketing&host=10.101.181.194&hostName=wg1-biz-marketing-139-1161719-11193146.c.elenet.me httphttp1.1 0.000 - 146 0 0 10.100.38.36:1891 - - - www.uuwatch.com user - wg1 wg1-channel-stable-1 Nginx 1.6.1 wg1-bjdev-pub-nginxms-1 Python-urllib/1.7 - - - www.uuwatch.com
1019-03-16T11:07:16+08:00 www.uuwatch.com^^C9398ACF399743669E71A19613D9F498|1993969646084 GET 499 /collector/tcp/cluster?cluster=wg1-collector-esm http http1.1 0.001 - 191 0 0 10.100.38.49:1891 application/json - - www.uuwatch.com user - wg1 wg1-channel-stable-1 Nginx 1.6.1wg1-bjdev-pub-nginxms-1 Go-http-client/1.1 - - - www.uuwatch.com
1019-03-16T11:11:46+08:00 www.uuwatch.com^^1319B1E9EA1A4B9A9E498317606B74B7|1993969906741 GET 499 /agent-config?appId=arch.waf_collector&host=10.101.130.196 http http1.1 0.000 - 169 0 0 10.100.40.17:1891 - - - www.uuwatch.com user - wg1 wg1-channel-stable-1 Nginx 1.6.1 wg1-bjdev-pub-nginxms-1 Go-http-client/1.1 - - - www.uuwatch.com
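The cut command above only works as shown because the log fields are separated by tabs, which is cut's default delimiter. A minimal sketch of both the tab case and the single-space case follows; the function names and the sample data are made up for illustration and are not part of the real log.

```shell
#!/bin/sh
# Hypothetical helpers; the sample file below is illustrative data only.

# Tab-separated input: cut splits on tabs by default,
# and "3-" selects field 3 through the last field.
from_third_tab() {
    cut -f 3- "$1"
}

# Input separated by single spaces: name the delimiter explicitly.
from_third_space() {
    cut -d ' ' -f 3- "$1"
}

sample=$(mktemp)
printf 'a\tb\tc\td\n1\t2\t3\t4\n' > "$sample"
from_third_tab "$sample"
rm -f "$sample"
```

Note that cut treats every delimiter character as a field boundary, so input padded with runs of spaces is usually easier to handle with awk.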
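<2> Second method (awk, printing each column from the third onward as a single column):
awk '
{
  for (i = 3; i <= NF; i++) {
    # append to column i; test "i in rec" so values such as 0 are not lost
    if (i in rec) rec[i] = rec[i] RS $i
    else rec[i] = $i
  }
}
END {
  # NF here is the field count of the last record
  for (i = 3; i <= NF; i++) print rec[i]
}' splict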
2029-03-26T22:02:04+08:00
www.uuwatch.com^^3FF92DE02BCB49B8BD22298DB98394F8|2993969324640
GET
499
/agent-config?appId=biz.marketing&host=20.202.282.294&hostName=wg2-biz-marketing-239-2262729-22293246.c.elenet.me
httphttp2.2
0.000
-
246
0
0
20.200.38.36:2892
-
-
-
www.uuwatch.com
user
-
wg2
wg2-channel-stable-2
Nginx
2.6.2
wg2-bjdev-pub-nginxms-2
Python-urllib/2.7
-
-
-
www.uuwatch.com
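<3> Third method: same as above, except that it tracks the largest NF seen, so records with different field counts are handled. Reference: https://stackoverflow.com/questions/23644184/using-awk-to-take-a-range-of-columns-and-print-them-as-a-single-column
awk '
{
  for (i = 3; i <= NF; i++) {
    # test "i in rec" so values such as 0 are not lost
    if (i in rec) rec[i] = rec[i] RS $i
    else rec[i] = $i
  }
  num = (num > NF ? num : NF)   # widest record seen so far
}
END {
  for (i = 3; i <= num; i++) print rec[i]
}' splict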
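To see how the per-column accumulation behaves with several records of different widths, here is a minimal sketch of the same technique wrapped in a function. The name columns_from and the sample data are hypothetical; the starting column is passed as a parameter instead of being fixed at 3.

```shell
#!/bin/sh
# columns_from N FILE: print columns N..last, one column after another,
# each value on its own line. Hypothetical helper for illustration.
columns_from() {
    awk -v start="$1" '
    {
        for (i = start; i <= NF; i++) {
            # Test membership, not truthiness, so values like 0 are kept.
            if (i in rec) rec[i] = rec[i] "\n" $i
            else          rec[i] = $i
        }
        if (NF > max) max = NF   # remember the widest record seen
    }
    END {
        for (i = start; i <= max; i++)
            if (i in rec) print rec[i]
    }' "$2"
}

sample=$(mktemp)
printf 'x y 0 a\nx y 0 b c\n' > "$sample"
columns_from 3 "$sample"
rm -f "$sample"
```

With this sample, column 3 contributes two lines, column 4 two lines, and column 5 (present only in the second record) one line.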
(2) Print the records within a given time range
cut -f 3- demo|sed -n '/2029-03-26T22:02:04+08:00/,/2029-03-26T22:07:26+08:00/p'
2029-03-26T22:02:04+08:00 www.uuwatch.com^^3FF92DE02BCB49B8BD22298DB98394F8|2993969324640 GET 499 /agent-config?appId=biz.marketing&host=20.202.282.294&hostName=wg2-biz-marketing-239-2262729-22293246.c.elenet.me httphttp2.2 0.000 - 246 0 0 20.200.38.36:2892 - - - www.uuwatch.com user - wg2 wg2-channel-stable-2 Nginx 2.6.2 wg2-bjdev-pub-nginxms-2 Python-urllib/2.7 - - - www.uuwatch.com
2029-03-26T22:07:26+08:00 www.uuwatch.com^^C9398ACF399743669E72A29623D9F498|2993969646084 GET 499 /collector/tcp/cluster?cluster=wg2-collector-esm http http2.2 0.002 - 292 0 0 20.200.38.49:2892 application/json - - www.uuwatch.com user - wg2 wg2-channel-stable-2 Nginx 2.6.2wg2-bjdev-pub-nginxms-2 Go-http-client/2.2 - - - www.uuwatch.com
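The sed range starts at the first line that matches the start pattern and stops at the next line that matches the end pattern, so both timestamps have to appear literally in the log; if the end timestamp is missing, sed keeps printing to the end of the input. A possible alternative, sketched below, compares the timestamp field as a string in awk. It assumes that the timestamp is the third tab-separated field and that all timestamps share the same layout and UTC offset, so that string order equals time order. The function name in_range and the sample records are hypothetical.

```shell
#!/bin/sh
# in_range START END FILE: print tab-separated records whose third field
# lies between START and END, inclusive.
# The values do not look like numbers, so awk compares them as strings.
in_range() {
    awk -F '\t' -v s="$1" -v e="$2" '$3 >= s && $3 <= e' "$3"
}

sample=$(mktemp)
printf 'h\th\t2019-03-28T18:30:03+08:00\tA\n'  > "$sample"
printf 'h\th\t2019-03-28T20:43:13+08:00\tB\n' >> "$sample"
printf 'h\th\t2019-03-28T20:43:37+08:00\tC\n' >> "$sample"
# Neither boundary occurs in the file; only record B falls inside the range.
in_range 2019-03-28T20:00:00+08:00 2019-03-28T20:43:20+08:00 "$sample"
rm -f "$sample"
```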
(3)
(4) Count the occurrences of each timestamp and status code pair:
cat new20190329.log|awk -F "\t" '{print $3,$6}'|sort |uniq -c
2 2019-03-28T18:30:03+08:00 499
1 2019-03-28T20:43:13+08:00 404
1 2019-03-28T20:43:19+08:00 404
14 2019-03-28T20:43:34+08:00 404
30 2019-03-28T20:43:35+08:00 404
22 2019-03-28T20:43:36+08:00 404
32 2019-03-28T20:43:37+08:00 404
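In the pipeline above, sort is needed because uniq -c only merges adjacent identical lines. The same count can also be kept in an awk array, as in the following minimal sketch. It assumes the same layout as the command above (timestamp in field 3, status code in field 6, tab-separated); the function name count_pairs and the test data are hypothetical.

```shell
#!/bin/sh
# count_pairs FILE: count identical (timestamp, status) pairs.
# Output format: "count timestamp status", ordered by timestamp.
count_pairs() {
    awk -F '\t' '
        { n[$3 " " $6]++ }                   # key: "timestamp status"
        END { for (k in n) print n[k], k }   # array order is unspecified
    ' "$1" | sort -k 2
}

sample=$(mktemp)
printf 'a\tb\tT1\tc\td\t404\n'  > "$sample"
printf 'a\tb\tT1\tc\td\t404\n' >> "$sample"
printf 'a\tb\tT0\tc\td\t499\n' >> "$sample"
count_pairs "$sample"
rm -f "$sample"
```

Unlike uniq -c, this prints the count without leading padding, which can be easier to feed into further processing.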
II.
cat file.txt
groups=001(group1),
002(group2),
003(group3)
groups=004(group4),
005(group5)
Desired output:
group1
group2
group3
group4
group5
(1) awk 'BEGIN{FS="[()]"} {if($0~/^.*[0-9][0-9][0-9]\(group[0-9]+\).*$/) print $2}' file.txt
Step by step:
➜ 011_cmdb_op awk 'BEGIN{FS="[()]"} {if($0~/^.*[0-9][0-9][0-9]\(group[0-9]+\).*$/) print $1}' file.txt
groups=001
002
003
groups=004
005
➜ 011_cmdb_op awk 'BEGIN{FS="[()]"} {if($0~/^.*[0-9][0-9][0-9]\(group[0-9]+\).*$/) print $2}' file.txt
group1
group2
group3
group4
group5
➜ 011_cmdb_op awk 'BEGIN{FS="[()]"} {if($0~/^.*[0-9][0-9][0-9]\(group[0-9]+\).*$/) print $3}' file.txt
,
,
,
# The outputs above show that each line is split into fields at ( and )
Or:
(2) awk '{sub(/^.*[0-9][0-9][0-9]\(/,""); sub(/\).*$/,""); print}' file.txt
➜ 011_cmdb_op awk '{sub(/^.*[0-9][0-9][0-9]\(/,"");print}' file.txt # delete the part matched by the regex
group1),
group2),
group3)
group4),
group5)
awk '{sub(/^.*[0-9][0-9][0-9]\(/,""); sub(/\).*$/,""); print}' file.txt # then delete everything from the closing parenthesis onward
group1
group2
group3
group4
group5
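Both commands print at most one value per input line, which is enough when every group sits on its own line. If several groups shared one line, the greedy `^.*` in the sub() version would keep only the last of them. A minimal sketch that handles this case loops with awk's match() and substr(); the function name extract_groups and the one-line sample layout are assumptions made for illustration.

```shell
#!/bin/sh
# extract_groups FILE: print every NNN(groupM) name, even when several
# of them occur on the same line. Hypothetical helper for illustration.
extract_groups() {
    awk '{
        s = $0
        # match() sets RSTART and RLENGTH for the leftmost match.
        while (match(s, /[0-9][0-9][0-9]\(group[0-9]+\)/)) {
            # Skip the 3 digits and "(", and drop the trailing ")".
            print substr(s, RSTART + 4, RLENGTH - 5)
            # Continue searching after this match.
            s = substr(s, RSTART + RLENGTH)
        }
    }' "$1"
}

sample=$(mktemp)
printf 'groups=001(group1), 002(group2), 003(group3)\n'  > "$sample"
printf 'groups=004(group4), 005(group5)\n'              >> "$sample"
extract_groups "$sample"
rm -f "$sample"
```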
(3) Practical example
ls al-arch-soa-zk-1-al-arch-soa-zk-1
ls al-arch-soa-zk-1-al-arch-soa-zk-1|awk '{sub(/^.*[0-9]-/,"");print}'
al-arch-soa-zk-1
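The sub() call works here because `.*` is greedy: the match runs from the start of the line to the last digit that is followed by a hyphen, and everything up to that point is removed. A minimal sketch that applies the same expression to a string argument instead of ls output; the function name strip_prefix and the second sample name are hypothetical.

```shell
#!/bin/sh
# strip_prefix NAME: remove everything up to and including the LAST
# "digit followed by -" in NAME.
strip_prefix() {
    printf '%s\n' "$1" | awk '{ sub(/^.*[0-9]-/, ""); print }'
}

strip_prefix al-arch-soa-zk-1-al-arch-soa-zk-1   # prints al-arch-soa-zk-1
strip_prefix a-1-b-2-c                           # prints c
```

The second call shows the greedy behavior: both "1-" and "2-" qualify, and the longer match wins.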