Advanced awk topics
- Using external shell variables in awk
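    [root@aminglinux-02 awk]# A=44; awk -v GET_A=$A -F ':' '{print GET_A ":" $1}' 2.txt
    44:1111111
    44:2222222
    44:1111111
    44:3333333
    44:2222222

Note: the -v option defines an awk variable; here it assigns the value of the shell variable A to GET_A. Each variable to be assigned needs its own -v option.
The same technique applied in a script: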
    [root@aminglinux-02 awk]# cat 2.txt
    1111111:13443253456
    2222222:13211222122
    1111111:13643543544
    3333333:12341243123
    2222222:12123123123
    [root@aminglinux-02 awk]# cat 1.sh
    #!/bin/bash
    sort -n 2.txt | awk -F ':' '{print $1}' | uniq > id.txt
    for id in `cat id.txt`; do
        echo "[$id]"
        awk -v id2=$id -F ':' '$1==id2 {print $2}' 2.txt
    done
    ## Alternative for the awk line inside the loop: splice the shell
    ## variable directly into the awk program instead of using -v:
    ## awk -F ':' '$1=="'"$id"'" {print $2}' 2.txt
    [root@aminglinux-02 awk]# sh 1.sh
    [1111111]
    13443253456
    13643543544
    [2222222]
    13211222122
    12123123123
    [3333333]
    12341243123
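As a minimal sketch of the "one -v per variable" rule, the following passes two shell values with two -v options and, as an alternative, reads an exported value through awk's ENVIRON array. The function names label_ids and label_ids_env and the variable names PREFIX and SEP are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: prefix the first ':'-separated field of each input
# line with a value supplied by the shell ($1 = prefix, $2 = separator).
label_ids() {
    # One -v option per variable: PREFIX and SEP are both set before awk runs.
    awk -v PREFIX="$1" -v SEP="$2" -F ':' '{print PREFIX SEP $1}'
}

# Alternative: a variable placed in awk's environment is visible in ENVIRON.
label_ids_env() {
    PREFIX=$1 awk -F ':' '{print ENVIRON["PREFIX"] ":" $1}'
}

printf '1111111:13443253456\n2222222:13211222122\n' | label_ids 44 ':'
# prints:
# 44:1111111
# 44:2222222
```

One difference worth knowing: awk processes backslash escape sequences in a value given with -v, whereas a value read from ENVIRON is used as-is.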
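- Merging the lines of two files that share the same first column into a single line. For example, given two files with the following contents: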
    [root@aminglinux-02 awk]# cat 1.txt
    1 aa
    2 bb
    3 ee
    4 ss
    [root@aminglinux-02 awk]# cat 2.txt
    1 ab
    2 cd
    3 ad
    4 bd
    5 de
    [root@aminglinux-02 awk]# awk 'NR==FNR{a[$1]=$2}NR>FNR{print $0,a[$1]}' 1.txt 2.txt
    1 ab aa
    2 cd bb
    3 ad ee
    4 bd ss
    5 de
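- Explanation: NR is the total number of lines read so far across all input files, while FNR is the line number within the file currently being read.
So NR==FNR is true only while awk is reading the first file (1.txt), and NR>FNR is true while it is reading the second file (2.txt).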
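To make the difference between NR and FNR concrete, here is a small sketch that prints both counters for every line of two files. The function name show_counters and the temporary files are assumptions made for the demonstration.

```shell
#!/bin/sh
# Hypothetical helper: print the file name, NR and FNR for every input line.
show_counters() {
    awk '{print FILENAME, NR, FNR}' "$@"
}

tmp=$(mktemp -d)
printf '1 aa\n2 bb\n' > "$tmp/1.txt"
printf '1 ab\n2 cd\n3 ad\n' > "$tmp/2.txt"
(cd "$tmp" && show_counters 1.txt 2.txt)
rm -rf "$tmp"
# prints:
# 1.txt 1 1
# 1.txt 2 2
# 2.txt 3 1
# 2.txt 4 2
# 2.txt 5 3
```

NR keeps counting across files while FNR restarts at 1, which is why NR==FNR identifies the first file. One caveat: if the first file is empty, NR==FNR is also true for the second file, so the idiom assumes a non-empty first file.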
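The array a works like a map: it stores the second column of 1.txt, keyed by the first column.
- A file has one number per line, and the numbers need to be joined with "+".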
    [root@aminglinux-02 awk]# cat a
    96
    1093
    1855
    1253
    1364
    1332
    2308
    2589
    2531
    1239
    2164
    2826
    2787
    2145
    2617
    4311
    1810
    2115
    1235
    [root@aminglinux-02 awk]# awk '{printf ("%s+",$0)}' a; echo ""
    96+1093+1855+1253+1364+1332+2308+2589+2531+1239+2164+2826+2787+2145+2617+4311+1810+2115+1235+

Note that the awk version leaves a trailing "+" after the last number. The echo "" only adds the final newline, because printf does not emit one.

    [root@aminglinux-02 awk]# cat a|xargs|sed 's/ /+/g'
    96+1093+1855+1253+1364+1332+2308+2589+2531+1239+2164+2826+2787+2145+2617+4311+1810+2115+1235

The xargs and sed version produces no trailing "+".
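If the trailing "+" is unwanted, awk alone can avoid it by printing the separator before every line except the first. This is a minimal sketch; the function name join_plus is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: join all input lines with "+" and no trailing "+".
join_plus() {
    # Emit "+" before every line except the first; END adds the final newline.
    awk '{printf "%s%s", (NR > 1 ? "+" : ""), $0} END {print ""}' "$@"
}

printf '96\n1093\n1855\n' | join_plus
# prints: 96+1093+1855
```

Because the result is a valid arithmetic expression, it could be piped to a calculator such as bc to obtain the sum, assuming bc is installed.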
- Using the gsub function in awk
    [root@aminglinux-02 awk]# awk -F ':' 'gsub(/www/,"abc",$1){print $0}' test.txt
    abc x 1 1 bin /bin /sbin/nologin
    abcwwon x 2 2 daemon /sbin /sbin/nologin
    [root@aminglinux-02 awk]# awk 'gsub(/www/,"abc")' test.txt
    abc:x:1:1:bin:/bin:/sbin/nologin
    abcwwon:x:2:2:daemon:/sbin:/sbin/nologin
    adm:abc:3:4:adm:/var/adm:/sbin/nologin
    lp:abc:4:7:lp:/var/spool/lpd:/sbin/nologin
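In the first command above the colons turn into spaces because modifying $1 makes awk rebuild $0 using the output field separator OFS, which defaults to a space. The sketch below sets OFS to ":" as well so the separators survive. The function name replace_first_field is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: replace "www" with "abc" in the first field only,
# keeping ":" as the separator in the output.
replace_first_field() {
    # FS = OFS = ":" keeps ":" when awk rebuilds $0 after $1 changes.
    # gsub returns the number of substitutions, so only changed lines print.
    awk 'BEGIN {FS = OFS = ":"} gsub(/www/, "abc", $1)' "$@"
}

printf 'www:x:1:1:bin:/bin:/sbin/nologin\nadm:www:3:4:adm:/var/adm:/sbin/nologin\n' |
    replace_first_field
# prints: abc:x:1:1:bin:/bin:/sbin/nologin
```

The second input line is not printed because its first field contains no "www", so gsub returns 0 and the pattern is false.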
- Using awk to collect each specified field (column) of a file into its own line
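    for j in `seq 0 20`; do
        let x=100*$j
        let y=$x+1
        let z=$x+100
        for i in `seq $y $z`; do
            # Print field number $i of every line of example.txt on one output line.
            # An explicit "%s " format keeps a "%" in the data from being read as a format.
            awk -v a=$i '{printf "%s ", $a}' example.txt >>/tmp/test.txt
            echo " " >>/tmp/test.txt
        done
    done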
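The loop above runs awk once per column, so the input file is read again for every field. As an alternative sketch (not the method used above), a single awk pass can store every field and print the columns in the END block. The function name transpose and the array name cell are hypothetical, and the whole file is held in memory.

```shell
#!/bin/sh
# Hypothetical helper: print column 1 of all rows on line 1, column 2 on
# line 2, and so on, reading the input only once.
transpose() {
    awk '{
        for (i = 1; i <= NF; i++) cell[NR, i] = $i   # remember every field
        if (NF > maxnf) maxnf = NF                   # widest row seen so far
    }
    END {
        for (i = 1; i <= maxnf; i++) {               # one output line per column
            line = ""
            for (r = 1; r <= NR; r++)
                line = (r == 1 ? "" : line " ") cell[r, i]
            print line
        }
    }' "$@"
}

printf 'a b c\n1 2 3\n' | transpose
# prints:
# a 1
# b 2
# c 3
```

Unlike the loop above, this version does not leave a trailing space on each line, and rows with fewer fields contribute an empty value.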
- Filtering on two or more keywords with grep, egrep, or awk
    [root@aminglinux-02 awk]# grep -E '123|abc' test.txt
    abc:www:3:4:adm:/var/adm:/sbin/nologin
    123:www:4:7:lp:/var/spool/lpd:/sbin/nologin
    [root@aminglinux-02 awk]# egrep '123|abc' test.txt
    abc:www:3:4:adm:/var/adm:/sbin/nologin
    123:www:4:7:lp:/var/spool/lpd:/sbin/nologin
    [root@aminglinux-02 awk]# awk '/123|abc/' test.txt
    abc:www:3:4:adm:/var/adm:/sbin/nologin
    123:www:4:7:lp:/var/spool/lpd:/sbin/nologin
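The commands above match lines containing either keyword. In awk the same condition can also be written with ||, and replacing it with && selects only lines that contain both keywords. The function names match_any and match_all are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers built on awk pattern logic.
match_any() { awk '/123/ || /abc/' "$@"; }   # lines with 123 OR abc
match_all() { awk '/123/ && /abc/' "$@"; }   # lines with 123 AND abc

printf 'abc:www:3\n123:www:4\nabc:123:5\nlp:x:6\n' | match_all
# prints: abc:123:5
```

With grep, the "both keywords" case is usually written as two greps in a pipeline, for example grep 123 test.txt | grep abc.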
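- Write an awk program that generates a file with the structure below. (The last column is the current time, in the format YYYYMMDDHHMISS.) The column values are as shown, each increasing by 1 with every row, for 5 million rows in total.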
    [root@aminglinux-02 awk]# awk 'BEGIN{for(i=1;i<=10;i++)printf("%d,%d,%010d,%010d,%010d,%010d,%010d,%010d,%d\n",i,i,i,i,i,i,i,i,strftime("%Y%m%d%H%M"))}'
    1,1,0000000001,0000000001,0000000001,0000000001,0000000001,0000000001,201707212356
    2,2,0000000002,0000000002,0000000002,0000000002,0000000002,0000000002,201707212356
    3,3,0000000003,0000000003,0000000003,0000000003,0000000003,0000000003,201707212356
    4,4,0000000004,0000000004,0000000004,0000000004,0000000004,0000000004,201707212356
    5,5,0000000005,0000000005,0000000005,0000000005,0000000005,0000000005,201707212356
    6,6,0000000006,0000000006,0000000006,0000000006,0000000006,0000000006,201707212356
    7,7,0000000007,0000000007,0000000007,0000000007,0000000007,0000000007,201707212356
    8,8,0000000008,0000000008,0000000008,0000000008,0000000008,0000000008,201707212356
    9,9,0000000009,0000000009,0000000009,0000000009,0000000009,0000000009,201707212356
    10,10,0000000010,0000000010,0000000010,0000000010,0000000010,0000000010,201707212356

This run prints only 10 rows as a demonstration; change i<=10 to i<=5000000 for the full file. The strftime format used here stops at minutes, so append %S to get the seconds required by YYYYMMDDHHMISS. Note that strftime is not part of POSIX awk and may be missing in some awk implementations.
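The same file structure can also be produced with a bash loop that calls perl for the zero padding, although starting several processes per row makes it far slower than the single awk command:

    #!/bin/bash
    for i in `seq 1 5000000`; do
        # n = number of digits in i
        n=`echo "$i"|awk '{print length($0)}'`
        # m = number of leading zeros needed to pad i to 10 digits
        export m=$((10-n))
        # o = a string of m zeros
        export o=`perl -e '$a='0'; $b=$a x $ENV{"m"}; print $b;'`
        export j=$i
        # p = i padded with leading zeros to 10 digits
        p=`perl -e '$c=$ENV{"o"}.$ENV{"j"}; print $c;'`
        echo "$i,$i,$p,$p,$p,$p,$p,$p,`date +%Y%m%d%H%M%S`"
    done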
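For awk implementations without strftime, one possible variant is to take the timestamp from date and hand it to awk with -v. This is a sketch under stated assumptions: the function name gen_rows and the variable names n and ts are hypothetical, and the timestamp is captured once at start-up rather than per row.

```shell
#!/bin/sh
# Hypothetical helper: print $1 rows in the structure described above.
gen_rows() {
    # The timestamp (with seconds) is taken once from date and passed in as ts.
    awk -v n="$1" -v ts="$(date +%Y%m%d%H%M%S)" 'BEGIN {
        for (i = 1; i <= n; i++)
            printf("%d,%d,%010d,%010d,%010d,%010d,%010d,%010d,%s\n",
                   i, i, i, i, i, i, i, i, ts)
    }'
}

gen_rows 3
```

Calling gen_rows 5000000 and redirecting the output to a file would produce the full data set in a single awk process.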
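- Printing a single quote with awk's print. Inside a single-quoted awk program a backslash cannot escape a single quote, because the shell does not interpret escapes within single quotes. One way to print such characters is the combination '""'. Reading from left to right this is single quote, double quote, double quote, single quote: the first single quote closes the awk program string, the pair of double quotes encloses the character to print, and the last single quote reopens the program string. A literal $ is therefore written as '"$"', and a single quote as '"'"', as in this example: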
    [root@aminglinux-02 awk]# awk -F ':' '{print "This is a '"'"'"$1}' test.txt
    This is a 'root
    This is a 'bin
    This is a 'daemon
    This is a 'adm
    This is a 'lp
    This is a 'sync
    This is a 'shutdown
    This is a 'halt
    This is a 'mail
    This is a 'operator
    This is a 'games
    This is a 'ftp
    This is a 'nobody
    This is a 'systemd-bus-proxy
    This is a 'systemd-network
    This is a 'dbus
    This is a 'polkitd
    This is a 'tss
    This is a 'postfix
    This is a 'sshd
    This is a 'chrony
    This is a 'aming
    This is a 'user1
    This is a 'user3
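The quote juggling above is not the only option. Two alternatives that some find easier to read are the octal escape \047 inside an awk string and a quote character passed in with -v. The function names quote_octal and quote_var and the variable name q are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: wrap the first field in single quotes using the
# octal escape \047, which awk expands inside its own string literals.
quote_octal() {
    awk -F ':' '{print "This is a \047" $1 "\047"}' "$@"
}

# Hypothetical helper: pass the quote character in as the awk variable q.
quote_var() {
    awk -F ':' -v q="'" '{print "This is a " q $1 q}' "$@"
}

printf 'root:x:0:0\n' | quote_octal
# prints: This is a 'root'
```

Both forms keep the awk program inside one uninterrupted pair of single quotes, so the shell quoting never has to be closed and reopened.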
- Merging the corresponding lines (same line number) of two files into one line
    [root@aminglinux-02 awk]# cat a.txt
    1 2 3
    4 5 6
    a b c
    [root@aminglinux-02 awk]# cat b.txt
    3 2 1
    6 5 4
    c b a
    [root@aminglinux-02 awk]# paste a.txt b.txt
    1 2 3 3 2 1
    4 5 6 6 5 4
    a b c c b a
    [root@aminglinux-02 awk]# paste -d '+' a.txt b.txt
    1 2 3+3 2 1
    4 5 6+6 5 4
    a b c+c b a
    [root@aminglinux-02 awk]# awk 'NR==FNR {a[FNR]=$0} NR>FNR {print a[FNR],$0}' a.txt b.txt
    1 2 3 3 2 1
    4 5 6 6 5 4
    a b c c b a
    [root@aminglinux-02 awk]# awk 'NR==FNR {a[FNR]=$0} NR>FNR {print a[FNR],"+",$0}' a.txt b.txt
    1 2 3 + 3 2 1
    4 5 6 + 6 5 4
    a b c + c b a
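The last awk command prints spaces around the "+" because each comma in print inserts the output field separator. To reproduce the paste -d '+' output exactly, the pieces can be concatenated instead. This is a minimal sketch; the function name merge_lines and the variable name sep are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: $1 = separator, $2 = first file, $3 = second file.
merge_lines() {
    # String concatenation (no commas in print) avoids the extra OFS spaces.
    awk -v sep="$1" 'NR == FNR {a[FNR] = $0; next} {print a[FNR] sep $0}' "$2" "$3"
}

tmp=$(mktemp -d)
printf '1 2 3\n4 5 6\na b c\n' > "$tmp/a.txt"
printf '3 2 1\n6 5 4\nc b a\n' > "$tmp/b.txt"
merge_lines '+' "$tmp/a.txt" "$tmp/b.txt"
rm -rf "$tmp"
# prints:
# 1 2 3+3 2 1
# 4 5 6+6 5 4
# a b c+c b a
```

The next statement skips the second rule while the first file is being read, which replaces the explicit NR>FNR test.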
- awk reference tutorials