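AWK, in the words of the GNU manual, selects particular records from a file and performs operations on them.
awk is a tool, and more than that, a text-processing language. Below we walk through awk using the Windows 10 Subsystem for Linux.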
Experiment 1: fields and built-in variables
MicroWin10-1123:/mnt/c/Users/work/test # awk -F: '{print $1,NF,NR,FS}' /etc/passwd
root 7 1 :
bin 7 2 :
daemon 7 3 :
lp 7 4 :
mail 7 5 :
news 7 6 :
uucp 7 7 :
games 7 8 :
man 7 9 :
wwwrun 7 10 :
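Let's go through it piece by piece. awk processes its input (here, the /etc/passwd file) one line at a time.
-F: sets the field separator to ":" (-F":" is equivalent);
$1 is the first field of the record after splitting on ":"; $2 and so on follow the same pattern. $0 is the whole line.
'{print $1,NF,NR,FS}' is the awk action;
NF, a built-in variable, is the number of fields in the current record;
NR, a built-in variable, is the number of records read so far (awk works line by line, so NR increases with each line);
FS, a built-in variable, is the field separator, matching the -F given above.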
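To try these variables without touching a real /etc/passwd, here is a minimal sketch. The two sample lines (alice, bob) and the shell variable names are made up for illustration; $NF, the last field, is standard awk but was not used above.

```shell
# Hypothetical two-line sample in /etc/passwd format (not from a real system)
sample='alice:x:1000:1000:Alice:/home/alice:/bin/bash
bob:x:1001:1001:Bob:/home/bob:/bin/sh'

# NR = record number, NF = field count, $1 = first field, $NF = last field
fields=$(printf '%s\n' "$sample" | awk -F: '{print NR, NF, $1, $NF}')
printf '%s\n' "$fields"
```

Since NF holds the field count, $NF always refers to the last field, whatever the number of fields on a line.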
Experiment 2: common scenarios
MicroWin10-1123:/mnt/c/Users/work/test # awk -F":" '{print "username:"$1"\tuid:"$3}' /etc/passwd
username:root uid:0
username:bin uid:1
username:daemon uid:2
username:lp uid:4
username:mail uid:8
MicroWin10-1123:/mnt/c/Users/work/test # awk '{if(NR>=5 && NR<=9) print NR,$0}' /etc/passwd
5 mail:x:8:12:Mailer daemon:/var/spool/clientmqueue:/bin/false
6 news:x:9:13:News system:/etc/news:/bin/bash
7 uucp:x:10:14:Unix-to-Unix CoPy system:/etc/uucp:/bin/bash
8 games:x:12:100:Games account:/var/games:/bin/bash
9 man:x:13:62:Manual pages viewer:/var/cache/man:/bin/bash
MicroWin10-1123:/mnt/c/Users/work/test # cat test.log
I am Poe,my qq is 33794712
MicroWin10-1123:/mnt/c/Users/work/test # awk -F "[, ]" '{print $3,$7}' test.log
Poe 33794712
MicroWin10-1123:/mnt/c/Users/work/test # awk '{count++;} END{print "total accounts:",count}' /etc/passwd
total accounts: 17
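Note: the action increments the variable count for every record (an awk variable that has never been assigned evaluates to 0 in a numeric context, so no initialization is needed), and the END block prints the total after the last record.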
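The behaviour of unassigned variables can be checked directly. The sketch below uses a made-up variable x and a three-line dummy input; it also shows that NR, read inside END, already holds the record total, so a hand-written counter is optional.

```shell
# An unassigned awk variable is "" as a string and 0 as a number.
unset_demo=$(awk 'BEGIN { print "[" x "]", x + 0 }')

# Inside END, NR holds the number of records read in total.
total=$(printf 'a\nb\nc\n' | awk 'END { print NR }')

printf '%s\n%s\n' "$unset_demo" "$total"
```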
MicroWin10-1123:/mnt/c/Users/work/test # awk 'BEGIN{count=0;print "[start counting the account]"} {print ".";count++;} END{print "[finished count the account]\n------------------------";print "total accounts:",count}' /etc/passwd
[start counting the account]
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
[finished count the account]
------------------------
total accounts: 17
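Note: this version adds BEGIN and END blocks. BEGIN runs once before any input is read; here it prints a header and initializes count explicitly. END runs once after the last record and prints the footer and the total. The action between them runs once per record and increments count.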
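The same BEGIN / per-record / END layout works for any kind of accumulation. As a rough sketch, the following sums the third field of a made-up three-line sample; the sample data and the variable names sum and report are assumptions for illustration.

```shell
# Hypothetical sample: name:password:uid:gid
sample='root:x:0:0
bin:x:1:1
daemon:x:2:2'

report=$(printf '%s\n' "$sample" | awk -F: '
BEGIN { print "[start]" }                               # once, before any input
      { sum += $3 }                                     # once per record
END   { print "[end] records:", NR, "uid sum:", sum }   # once, after all input
')
printf '%s\n' "$report"
```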
MicroWin10-1123:/mnt/c/Users/work/test # awk -F: '{if(NR==3) print "No."NR" account is",$1}' /etc/passwd
No.3 account is daemon
Note: when NR (the current line number) equals 3, print the user name.
MicroWin10-1123:/mnt/c/Users/work/test # awk 'BEGIN{count=0;} {while(count<=3) {print "count:",count;count++}}' /etc/passwd
count: 0
count: 1
count: 2
count: 3
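In the version above, count is never reset, so the loop only produces output for the first record. Next, an incorrect example: the while loop body is not wrapped in {}, so only the first statement (print) belongs to the loop. count++ is never executed inside the loop, the exit condition is never met, and the loop runs forever.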
MicroWin10-1123:/mnt/c/Users/work/test # awk 'BEGIN{count=0;} {while(count<=3) print "count:",count;count++}' /etc/passwd
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
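For comparison, here is a sketch of a while loop that terminates and runs again for every record: the body is wrapped in braces and the counter is reset at the start of each action. The two-line sample and the counter name i are made up for illustration.

```shell
sample='a:b:c
d:e'

walk=$(printf '%s\n' "$sample" | awk -F: '{
    i = 1                  # reset the counter for every record
    while (i <= NF) {      # braces keep both statements inside the loop
        print NR, i, $i
        i++
    }
}')
printf '%s\n' "$walk"
```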
MicroWin10-1123:/mnt/c/Users/work/test # cat t1.sh
#!/usr/bin/env awk
{for (x=1;x<=4;x++){
print "loop:",x
}
}
MicroWin10-1123:/mnt/c/Users/work/test # head -1 /etc/passwd | awk -f t1.sh
loop: 1
loop: 2
loop: 3
loop: 4
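Note: the program is kept in an awk script and run with the -f option; the script implements a simple for loop. awk still runs the action once per input record, so with multi-line input the for loop runs once for each line. Since the script is passed via -f, its #! line is just a comment; to execute the file directly, the interpreter line would have to invoke awk with -f.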
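If the loop should run exactly once regardless of the input, one option is to put it in a BEGIN block, which needs no input at all. The sketch below contrasts the two placements; the dummy input r1/r2 and the counter n are assumptions for illustration.

```shell
# Loop in BEGIN: runs once, no input file or pipe required.
once=$(awk 'BEGIN { for (x = 1; x <= 4; x++) print "loop:", x }')

# Loop in the main action: runs once per record, so 2 records give 8 iterations.
per_record=$(printf 'r1\nr2\n' | awk '{ for (x = 1; x <= 4; x++) n++ } END { print n }')

printf '%s\n%s\n' "$once" "$per_record"
```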
Exercise 1: match every record that contains root
MicroWin10-1123:/mnt/c/Users/work/test # awk '/root/{print NR,$0}' /etc/passwd
1 root:x:0:0:root:/root:/bin/bash
Exercise 2: match records whose 5th field contains root
MicroWin10-1123:/mnt/c/Users/work/test # awk -F: '$5 ~ /root/{print $0}' /etc/passwd
root:x:0:0:root:/root:/bin/bash
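Note that ~ tests whether the field contains a match, not whether it equals the string. The following sketch shows the difference on a made-up sample where a second account (ops) merely mentions root in its 5th field; == is the standard awk operator for an exact comparison.

```shell
# Hypothetical sample: the second line only mentions root in field 5
sample='root:x:0:0:root:/root:/bin/bash
ops:x:1000:1000:root backup:/home/ops:/bin/bash'

contains=$(printf '%s\n' "$sample" | awk -F: '$5 ~ /root/ { print $1 }')
exact=$(printf '%s\n' "$sample" | awk -F: '$5 == "root" { print $1 }')

printf 'contains: %s\nexact: %s\n' "$contains" "$exact"
```

Anchoring the regex, as in $5 ~ /^root$/, would be another way to get the exact match.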
Exercise 3: print only line 2
MicroWin10-1123:/mnt/c/Users/work/test # awk 'NR==2{print NR,$0}' /etc/passwd
2 bin:x:1:1:bin:/bin:/bin/bash
MicroWin10-1123:/mnt/c/Users/work/test # head -1 /etc/passwd | awk '{city[1]="beijing";city[2]="shanghai"; for(c in city) {print city[c];}}'
beijing
shanghai
MicroWin10-1123:/mnt/c/Users/work/test # awk 'BEGIN{str="this is a test2018test!";gsub(/[0-9]+/,"!",str);print str;}'
this is a test!test!
Note: gsub(regex, replacement, target) replaces every match of the regex in target, in place.
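gsub also returns the number of replacements it made, and its sibling sub replaces only the first match. A small sketch, with a made-up test string:

```shell
gsub_demo=$(awk 'BEGIN {
    str = "a1b22c333"
    n = gsub(/[0-9]+/, "#", str)   # replaces every match, returns how many
    print n, str
}')

sub_demo=$(awk 'BEGIN {
    str = "a1b22c333"
    sub(/[0-9]+/, "#", str)        # replaces only the first match
    print str
}')

printf '%s\n%s\n' "$gsub_demo" "$sub_demo"
```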
MicroWin10-1123:/mnt/c/Users/work/test # awk 'BEGIN{str="this is a test2018test!";print index(str,"2019")?"ok":"no found"}'
no found
Note: index(string, substring) returns the position of substring within string, or 0 if it is not found.
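The argument order is easy to get backwards: the string being searched comes first, the text to look for second. A quick sketch printing the raw return values for a hit and a miss on the same test string:

```shell
index_demo=$(awk 'BEGIN {
    str = "this is a test2018test!"
    # first argument: string to search in; second: text to look for
    print index(str, "2018"), index(str, "2019")
}')
printf '%s\n' "$index_demo"
```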
MicroWin10-1123:/mnt/c/Users/work/test # awk 'BEGIN{str="this is a test2018test!";print match(str,/[0-9]+/)?"OK":"Not found"}'
OK
Note: match(string, regex) returns the position of the first match, or 0 if there is none.
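Besides its return value, match sets the built-in variables RSTART and RLENGTH to the start and length of the match, which makes it easy to pull the matched text out. A minimal sketch on the same test string:

```shell
match_demo=$(awk 'BEGIN {
    str = "this is a test2018test!"
    if (match(str, /[0-9]+/))
        # RSTART and RLENGTH are set by match()
        print RSTART, RLENGTH, substr(str, RSTART, RLENGTH)
}')
printf '%s\n' "$match_demo"
```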
MicroWin10-1123:/mnt/c/Users/work/test # awk 'BEGIN{str="this is a test2018test!";print substr(str,3,10);}'
is is a te
Note: substr(string, start, length) returns a substring; positions start at 1.
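The length argument is optional; when it is left out, substr returns everything from the start position to the end of the string. A short sketch on the same test string:

```shell
substr_demo=$(awk 'BEGIN {
    str = "this is a test2018test!"
    print substr(str, 11)      # length omitted: runs to the end of the string
    print substr(str, 1, 4)    # positions are 1-based
}')
printf '%s\n' "$substr_demo"
```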