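Reference: https://blog.csdn.net/u010502101/article/details/81839519
AWK is a data-filtering tool (similar to grep, but more powerful). It is a data-processing engine that checks input text against patterns, processes it line by line, and prints the result. It is typically used in shell scripts to extract specific data; used on its own, it can also compute statistics over text data.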
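Format
Form 1: preceding-command | awk [options] 'condition{action}'
Form 2: awk [options] 'condition{action}' file…
If the action contains several statements, separate them with semicolons. When processing text, if no separator is specified, spaces and tabs are used as the field separator by default. print is the most common action.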
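Options
-F sep: sets the field separator; may be omitted (the default is spaces or tabs)
-v var=value: assigns a value to an awk variable before the program runs; this is the usual way to pass in an external shell variable (note the lowercase v)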
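As a minimal sketch of passing a shell variable into awk with -v (the names prefix and label and the sample input are made up for illustration):

```shell
# Hypothetical shell variable; its name and value are assumptions for this sketch.
prefix="Animal"

# -v label="$prefix" copies the shell value into the awk variable "label"
# before any input is read. The two input lines are sample data.
result=$(printf 'The dog:in the park\nThe cat:in the park\n' |
  awk -F: -v label="$prefix" '{print label ": " $1}')

echo "$result"
```

Each output line starts with the value of the shell variable, followed by the first colon-separated field.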
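awk automatically assigns the fields produced by splitting to field variables: $0 is the whole line, $1 the first field, $2 the second, and so on.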
[root@xuniji01 ~]# cat test.txt
The dog:There is a big dog and a little dog in the park
The cat:There is a big cat and a little cat in the park
The tiger:There is a big tiger and a litle tiger in the park
The -F option sets ":" as the field separator, splitting each line into two fields; the second field, $2, is then printed.
[root@xuniji01 ~]# awk -F: '{print $2}' test.txt
There is a big dog and a little dog in the park
There is a big cat and a little cat in the park
There is a big tiger and a litle tiger in the park
Note: if no field separator is specified explicitly, awk splits fields on any run of whitespace, including spaces, tabs, and newlines.
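A small sketch of the default splitting, using a made-up input line with leading spaces, a tab, and repeated spaces:

```shell
# With no -F, leading whitespace is ignored and any run of spaces/tabs
# counts as a single separator. NF is the number of fields on the line.
result=$(printf '  a \t b    c\n' | awk '{print NF, $2}')

echo "$result"
```

The line is seen as three fields, and $2 is "b" regardless of how much whitespace surrounds it.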
[root@xuniji01 ~]# awk -F: '{$1="Description:"; print $0}' test.txt
Description: There is a big dog and a little dog in the park
Description: There is a big cat and a little cat in the park
Description: There is a big tiger and a litle tiger in the park
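One detail in the output above: the ":" between the fields is gone. Assigning to $1 makes awk rebuild $0 by joining the fields with the output field separator OFS, which is a single space by default. A minimal sketch, on a made-up input line, of keeping the colon by setting OFS:

```shell
# Setting OFS=":" makes the rebuilt record use ":" between fields again.
result=$(printf 'The dog:There is a big dog\n' |
  awk -F: 'BEGIN{OFS=":"} {$1="Description"; print $0}')

echo "$result"
```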
[root@xuniji01 ~]# vim pokes.txt
{
$1="Description:"
print $0
}
[root@xuniji01 ~]# awk -F: -f pokes.txt test.txt
Description: There is a big dog and a little dog in the park
Description: There is a big cat and a little cat in the park
Description: There is a big tiger and a litle tiger in the park
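By default awk reads one line at a time and runs the script on it. To run some commands before any input is processed, use the BEGIN keyword.
[root@xuniji01 ~]# awk -F: 'BEGIN{print "Start processing..."}{print $2}' test.txt
Start processing...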
There is a big dog and a little dog in the park
There is a big cat and a little cat in the park
There is a big tiger and a litle tiger in the park
Use the END keyword to run cleanup work after all the data has been processed.
[root@xuniji01 ~]# awk -F: '{print $2} END{print "Done processing..."}' test.txt
There is a big dog and a little dog in the park
There is a big cat and a little cat in the park
There is a big tiger and a litle tiger in the park
Done processing...
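BEGIN and END are often combined with a per-line action to compute a summary. A minimal sketch on made-up "name:count" data (the variable name total is an assumption for illustration):

```shell
# BEGIN runs once before any input, the middle block runs for every line,
# and END runs once after the last line. NR holds the number of lines read.
result=$(printf 'dog:2\ncat:3\ntiger:5\n' |
  awk -F: 'BEGIN{total=0} {total+=$2} END{print "lines:", NR, "total:", total}')

echo "$result"
```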
Variables come in two kinds: awk built-in variables and user-defined variables.
[root@xuniji01 ~]# cat test.txt
The dog:There is a big dog and a little dog in the park
The cat:There is a big cat and a little cat in the park
The tiger:There is a big tiger and a litle tiger in the park
[root@xuniji01 ~]# awk 'BEGIN{FS=":"} {print $1, $2}' test.txt # FS sets the field separator to ":", which splits each line into two fields
The dog There is a big dog and a little dog in the park
The cat There is a big cat and a little cat in the park
The tiger There is a big tiger and a litle tiger in the park
After FS sets the input field separator to ":", each line is split into two fields; on output, OFS makes the two fields be joined with ">".
[root@xuniji01 ~]# cat test.txt
The dog:There is a big dog and a little dog in the park
The cat:There is a big cat and a little cat in the park
The tiger:There is a big tiger and a litle tiger in the park
[root@xuniji01 ~]# awk 'BEGIN{FS=":"; OFS=">"} {print $1, $2}' test.txt # In effect: FS splits each line on ":", and the output uses ">" in place of that separator
The dog>There is a big dog and a little dog in the park
The cat>There is a big cat and a little cat in the park
The tiger>There is a big tiger and a litle tiger in the park
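Besides FS and OFS, two other built-in variables come up constantly: NR (the current line number) and NF (the number of fields on the current line). A small sketch that mixes them with a user-defined variable, on made-up input (the name extra is hypothetical):

```shell
# NR and NF are built in; "extra" is a user-defined variable that is
# created on first assignment, with no declaration needed.
result=$(printf 'The dog:big:little\nThe cat:big\n' |
  awk -F: '{extra = NF - 1; print NR, $1, extra}')

echo "$result"
```

Each output line shows the line number, the first field, and how many fields follow it.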
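By default RS and ORS are both "\n": each line of the input stream is one record, and output records are likewise separated by "\n".
The following uses a.txt as an example; its contents are: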
[root@xuniji01 ~]# cat a.txt
Tom is a student
and he is 20 years old
Bob is a teacher
and he is 40 years old
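By default each line is handled as one record. Here, however, the first and second lines should be handled as one record, and the third and fourth lines as another. Setting RS="" switches awk to paragraph mode, in which records are separated by blank lines, so this relies on a.txt having an empty line between the second and third lines.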
[root@xuniji01 ~]# awk 'BEGIN{RS=""; ORS="\n"; FS="and"; OFS=","} {print $1, $2}' a.txt
Tom is a student
, he is 20 years old
Bob is a teacher
, he is 40 years old
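Setting RS="" makes awk treat the first two lines and the last two lines each as one record. Each record is then split at "and", and the two resulting fields are printed joined by a comma (OFS); the line break inside each record is preserved in the output shown.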
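A related sketch of paragraph mode, assuming input in which the two pairs of lines are separated by a blank line. Here FS is set to "\n" so that each line of a record becomes one field; the " | " output separator is an arbitrary choice for illustration:

```shell
# RS="" : records are separated by blank lines (paragraph mode).
# FS="\n": within a record, every line is one field.
result=$(printf 'Tom is a student\nand he is 20 years old\n\nBob is a teacher\nand he is 40 years old\n' |
  awk 'BEGIN{RS=""; FS="\n"; OFS=" | "} {print $1, $2}')

echo "$result"
```

Each two-line paragraph comes out as a single line, with its two lines joined by the output separator.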
## First, look up the IP address
[root@xuniji01 ~]# ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:15:5d:0b:f5:03 brd ff:ff:ff:ff:ff:ff
inet 10.5.6.244/24 brd 10.5.6.255 scope global noprefixroute eth0
valid_lft forever preferred_lft forever
inet6 fe80::b268:c536:3f01:4c85/64 scope link noprefixroute
valid_lft forever preferred_lft forever
## Pull out just this line first
[root@xuniji01 ~]# ip add | grep global
inet 10.5.6.244/24 brd 10.5.6.255 scope global noprefixroute eth0
## Then take the second space-separated column
[root@xuniji01 ~]# ip add | grep global | awk '{print $2}'
10.5.6.244/24
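## Then take the first column, using "/" as the separator (this works for any prefix length, not only /24)
[root@xuniji01 ~]# ip add | grep global | awk '{print $2}' | awk -F/ '{print $1}' # first column when split on "/"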
10.5.6.244
That gives the bare IP address.
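The grep and the two awk calls can also be folded into a single awk program. A minimal sketch, run here against two sample lines copied from the output above rather than against a live `ip add` (the array name parts is hypothetical):

```shell
# Sample input taken from the "ip add" output above.
sample='    inet 127.0.0.1/8 scope host lo
    inet 10.5.6.244/24 brd 10.5.6.255 scope global noprefixroute eth0'

# /global/ selects the matching line (the job grep did);
# split() cuts $2 at "/" and stores the pieces in the array "parts".
result=$(printf '%s\n' "$sample" |
  awk '/global/ {split($2, parts, "/"); print parts[1]}')

echo "$result"
```

On a real machine the same awk program would presumably be fed from `ip add |` instead of the sample text.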
## To extract the broadcast address, take the fourth column instead
[root@xuniji01 ~]# ip add | grep global | awk '{print $4}'
10.5.6.255
[root@xuniji01 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 904M 0 904M 0% /dev
tmpfs 915M 0 915M 0% /dev/shm
tmpfs 915M 8.5M 907M 1% /run
tmpfs 915M 0 915M 0% /sys/fs/cgroup
/dev/mapper/centos-root 50G 2.0G 48G 4% /
/dev/sda1 1014M 180M 835M 18% /boot
/dev/mapper/centos-home 74G 33M 74G 1% /home
tmpfs 183M 0 183M 0% /run/user/0
[root@xuniji01 ~]# df -h | grep home
/dev/mapper/centos-home 74G 33M 74G 1% /home
[root@xuniji01 ~]# df -h | grep home | awk '{print $4}'
74G
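Matching with grep home would also catch any other line that happens to contain "home". A hedged alternative is to let awk compare the mount-point column exactly. The sketch below uses a few sample lines modeled on the df output above instead of calling df:

```shell
# Sample df-style input (header plus two filesystems).
sample='Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   50G  2.0G   48G   4% /
/dev/mapper/centos-home   74G   33M   74G   1% /home'

# $NF is the last field (the mount point); print the 4th column
# only for the line whose mount point is exactly /home.
result=$(printf '%s\n' "$sample" | awk '$NF == "/home" {print $4}')

echo "$result"
```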