Learning Linux under WSL: AWK

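AWK, in the words of the GNU documentation, is a program for selecting particular records in a file and performing operations on them.

awk is a tool, and beyond that a text-processing language. Below we use the Windows 10 Subsystem for Linux (WSL) to walk through awk.

As a tool

Experiment 1: a first look at awk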

MicroWin10-1123:/mnt/c/Users/work/test # awk -F: '{print $1,NF,NR,FS}' /etc/passwd
root 7 1 :
bin 7 2 :
daemon 7 3 :
lp 7 4 :
mail 7 5 :
news 7 6 :
uucp 7 7 :
games 7 8 :
man 7 9 :
wwwrun 7 10 :


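Taking the pieces one at a time: awk applies the program to every line of its input (here, the file /etc/passwd).

-F: (or -F":") sets the field separator;

$1 is the first field of the record after splitting on the separator ":", $2 is the second, and so on. $0 is the whole line.

The part in braces, for example '{print $1}', is the awk action;

NF: built-in variable, the number of fields in the current record;

NR: built-in variable, the number of records read so far. Because awk processes the input line by line, NR increases by one with each line;

FS: built-in variable, the field separator, which is what -F sets.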
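
To try these built-ins without depending on a particular machine's /etc/passwd, here is a minimal sketch that feeds awk two made-up passwd-style lines. The `sample` variable and its contents are assumptions for illustration only.

```shell
# Two made-up passwd-style records (hypothetical data)
sample='root:x:0:0:root:/root:/bin/bash
mail:x:8:12:Mailer daemon:/var/spool/clientmqueue:/bin/false'

# NR = record number, NF = field count, $1 = first field, $NF = last field
printf '%s\n' "$sample" | awk -F: '{ print NR, NF, $1, $NF }'
# 1 7 root /bin/bash
# 2 7 mail /bin/false
```

Since NF holds the field count, $NF is a convenient way to reach the last field without counting columns.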

Experiment 2: common use cases

Exercise 1: print user information in a formatted way
MicroWin10-1123:/mnt/c/Users/work/test # awk -F":" '{print "username:"$1"\tuid:"$3}' /etc/passwd
username:root   uid:0
username:bin    uid:1
username:daemon uid:2
username:lp     uid:4
username:mail   uid:8
Exercise 2: print lines 5 through 9
MicroWin10-1123:/mnt/c/Users/work/test # awk '{if(NR>=5 && NR<=9) print NR,$0}' /etc/passwd
5 mail:x:8:12:Mailer daemon:/var/spool/clientmqueue:/bin/false
6 news:x:9:13:News system:/etc/news:/bin/bash
7 uucp:x:10:14:Unix-to-Unix CoPy system:/etc/uucp:/bin/bash
8 games:x:12:100:Games account:/var/games:/bin/bash
9 man:x:13:62:Manual pages viewer:/var/cache/man:/bin/bash
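
The same line-range selection can also be written with the condition in the pattern position instead of an if inside the action. A small sketch on a made-up five-line input (the input is hypothetical sample data):

```shell
# Select lines 2 through 4; the condition before the braces is the pattern
printf 'a\nb\nc\nd\ne\n' | awk 'NR>=2 && NR<=4 { print NR, $0 }'
# 2 b
# 3 c
# 4 d
```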
Exercise 3: split on several separators (-F "[, ]" splits on commas and on spaces)
MicroWin10-1123:/mnt/c/Users/work/test # cat test.log
I am Poe,my qq is 33794712
MicroWin10-1123:/mnt/c/Users/work/test # awk -F "[, ]" '{print $3,$7}' test.log
Poe 33794712
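
One thing to watch with a bracket expression like "[, ]": each single comma or space is a separator, so a comma followed by a space produces an empty field in between. A sketch with a made-up line (the `line` variable is an assumption for illustration); appending + to the expression treats a run of separators as one:

```shell
line='Poe, 33794712'   # hypothetical input: comma followed by a space

printf '%s\n' "$line" | awk -F'[, ]'  '{ print NF }'   # prints 3 (middle field is empty)
printf '%s\n' "$line" | awk -F'[, ]+' '{ print NF }'   # prints 2
```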

As a language

Experiment 1: getting started with awk programming: count the users in /etc/passwd

version1:
MicroWin10-1123:/mnt/c/Users/work/test # awk '{count++;} END{print "total accounts:",count}' /etc/passwd
total accounts: 17

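Note: the action increments the variable count for every record (awk treats an uninitialized variable as 0 in a numeric context), and the END block prints the total after all input has been read.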
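
A quick way to check how awk treats a variable that was never assigned: it behaves as an empty string in string context and as 0 in numeric context. The variable names `x` and `n` below are arbitrary.

```shell
# x was never assigned: "" as a string, 0 as a number
awk 'BEGIN { print "[" x "]", x + 0 }'
# [] 0

# With empty input the action never runs, so n stays unset;
# adding 0 forces a numeric result instead of an empty line
awk '{ n++ } END { print n + 0 }' /dev/null
# 0
```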

version2:
MicroWin10-1123:/mnt/c/Users/work/test # awk 'BEGIN{count=0;print "[start counting the account]"} {print ".";count++;} END{print "[finished count the account]\n------------------------";print "total accounts:",count}' /etc/passwd
[start counting the account]
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
[finished count the account]
------------------------
total accounts: 17

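Note: this version adds BEGIN and END blocks around the main action. BEGIN runs once before any input is read; here it prints a header and initializes count by hand. END runs once after the last record and prints the footer and the total. The action between them runs for every record and increments count.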
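
The BEGIN / per-record / END layout is easier to read when the program is spread over several lines. A minimal sketch on made-up data (the `sample` records and the `sum` variable are assumptions for illustration):

```shell
# Hypothetical passwd-style records, trimmed to four fields
sample='root:x:0:0
bin:x:1:1
daemon:x:2:2'

printf '%s\n' "$sample" | awk -F: '
BEGIN { print "name uid" }                        # once, before any input
      { print $1, $3; sum += $3 }                 # once per record
END   { print "records:", NR, "uid sum:", sum }   # once, after all input
'
# name uid
# root 0
# bin 1
# daemon 2
# records: 3 uid sum: 3
```

In the END block NR still holds the number of the last record read, so a separate counter is not strictly needed for counting lines.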


Experiment 2: going further with awk programming

Part 1: if
MicroWin10-1123:/mnt/c/Users/work/test # awk -F: '{if(NR==3) print "No."NR" account is",$1}' /etc/passwd
No.3 account is daemon

Note: when NR (the current record number) is 3, print the username.

Part 2: while

MicroWin10-1123:/mnt/c/Users/work/test # awk 'BEGIN{count=0;} {while(count<=3) {print "count:",count;count++}}' /etc/passwd
count: 0
count: 1
count: 2
count: 3

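A broken example: the body of the while loop is not enclosed in {}, so only the first statement (print) belongs to the loop. count++ never runs inside the loop, the exit condition is never met, and the loop runs forever (output truncated):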

MicroWin10-1123:/mnt/c/Users/work/test # awk 'BEGIN{count=0;} {while(count<=3) print "count:",count;count++}' /etc/passwd
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
count: 0
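
In the working while example, count is never reset, so the loop only prints while awk is on the first record; for every later record the condition is already false. If the loop should run for each record, the counter has to be reset inside the action. A sketch that walks the fields of each line of a made-up input (the variable `i` and the input are assumptions for illustration):

```shell
printf 'a b c\nd e\n' | awk '{
    i = 1                   # reset for every record
    while (i <= NF) {       # braces keep both statements inside the loop
        print NR, i, $i
        i++
    }
}'
# 1 1 a
# 1 2 b
# 1 3 c
# 2 1 d
# 2 2 e
```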

Part 3: for

MicroWin10-1123:/mnt/c/Users/work/test # cat t1.sh
#!/usr/bin/env awk
{for (x=1;x<=4;x++){
        print "loop:",x
        }
}
MicroWin10-1123:/mnt/c/Users/work/test # head -1 /etc/passwd | awk -f t1.sh
loop: 1
loop: 2
loop: 3
loop: 4

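Note: the program is stored in a script file and run with the -f option; the script is a simple for loop. When run this way, the #! line is treated as an ordinary comment. awk still executes the action once per input record, so if the input has several lines the for loop runs several times; that is why the input is limited to one line with head -1.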
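
When a loop does not depend on the input at all, one option is to put it in a BEGIN block. A program that consists only of BEGIN runs once and does not read any input, so no head -1 is needed:

```shell
# Runs exactly once; no input file or pipe required
awk 'BEGIN { for (x = 1; x <= 4; x++) print "loop:", x }'
# loop: 1
# loop: 2
# loop: 3
# loop: 4
```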

Part 4: regular-expression matching

Exercise 1: match every record that contains root

MicroWin10-1123:/mnt/c/Users/work/test # awk '/root/{print NR,$0}' /etc/passwd
1 root:x:0:0:root:/root:/bin/bash

Exercise 2: match records whose fifth field contains root

MicroWin10-1123:/mnt/c/Users/work/test # awk -F: '$5 ~ /root/{print $0}' /etc/passwd
root:x:0:0:root:/root:/bin/bash
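
Note that ~ tests whether the field contains a match, not whether it equals the text. A sketch contrasting the two on made-up records (the `op` account and the `sample` variable are hypothetical):

```shell
sample='root:x:0:0:root:/root:/bin/bash
op:x:11:0:operator for root:/home/op:/bin/bash'

# Regex match: true when field 5 contains "root" anywhere
printf '%s\n' "$sample" | awk -F: '$5 ~ /root/  { print "contains:", $1 }'
# contains: root
# contains: op

# String comparison: true only when field 5 is exactly "root"
printf '%s\n' "$sample" | awk -F: '$5 == "root" { print "equals:", $1 }'
# equals: root
```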

Exercise 3: print only line 2

MicroWin10-1123:/mnt/c/Users/work/test # awk 'NR==2{print NR,$0}' /etc/passwd
2 bin:x:1:1:bin:/bin:/bin/bash

Part 5: arrays

MicroWin10-1123:/mnt/c/Users/work/test # head -1 /etc/passwd | awk '{city[1]="beijing";city[2]="shanghai"; for(c in city) {print city[c];}}'
beijing
shanghai
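
awk arrays are associative, so the index can be any string, and the order in which for (... in ...) visits the indices is not specified. A sketch that counts login shells in made-up records and pipes the result through sort to get a stable order (the `sample` data and the `shells` array name are assumptions for illustration):

```shell
sample='root:x:0:0:root:/root:/bin/bash
mail:x:8:12:Mailer daemon:/var/spool/clientmqueue:/bin/false
news:x:9:13:News system:/etc/news:/bin/bash'

# Use the last field (the login shell) as the array index and count occurrences
printf '%s\n' "$sample" |
  awk -F: '{ shells[$NF]++ } END { for (s in shells) print s, shells[s] }' |
  sort
# /bin/bash 2
# /bin/false 1
```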

Part 6: strings

MicroWin10-1123:/mnt/c/Users/work/test # awk 'BEGIN{str="this is a test2018test!";gsub(/[0-9]+/,"!",str);print str;}'
this is a test!test!

Note: gsub(regex, replacement, target) replaces every match of the regex in target with the replacement.
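
gsub also returns the number of replacements it made, and its sibling sub replaces only the first match. A small sketch (the strings and variable names are arbitrary):

```shell
awk 'BEGIN {
    s = "a1b22c333"
    t = s
    n = gsub(/[0-9]+/, "#", s)   # replace every run of digits, n = how many
    sub(/[0-9]+/, "#", t)        # replace only the first run
    print n, s, t
}'
# 3 a#b#c# a#b22c333
```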

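MicroWin10-1123:/mnt/c/Users/work/test # awk 'BEGIN{str="this is a test2018test!";print index(str,"2019")?"ok":"no found"}'
no found

Note: index(string, target) returns the position of target within string, or 0 if it is not found. The string to search comes first and the text to look for second.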
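
To see the actual return values rather than ok / no found, the positions can be printed directly. Positions in awk strings start at 1:

```shell
awk 'BEGIN {
    str = "this is a test2018test!"
    print index(str, "2018"), index(str, "2019")
}'
# 15 0
```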

MicroWin10-1123:/mnt/c/Users/work/test # awk 'BEGIN{str="this is a test2018test!";print match(str,/[0-9]+/)?"OK":"Not found"}'
OK

Note: match(string, regex) returns the position of the first match, or 0 if there is none.
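
Besides its return value, match sets the built-in variables RSTART and RLENGTH to the start and length of the match, which combine naturally with substr to extract the matched text. A minimal sketch:

```shell
awk 'BEGIN {
    str = "this is a test2018test!"
    if (match(str, /[0-9]+/))
        print RSTART, RLENGTH, substr(str, RSTART, RLENGTH)
}'
# 15 4 2018
```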

MicroWin10-1123:/mnt/c/Users/work/test # awk 'BEGIN{str="this is a test2018test!";print substr(str,3,10);}'
is is a te

Note: substr(string, start, length) returns the substring of string that begins at position start and is length characters long.





