awk notes 231129

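The usual shape of an awk script:
It is best to wrap the script part of an awk command,
'BEGIN{} /pattern1/{} ... /patternN/{} END{}', in a pair of single quotes,
because $ is used a lot: inside single quotes the shell leaves $ alone, while inside double quotes the shell expands it as a variable reference before awk sees it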
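
A minimal sketch of the quoting difference, assuming a POSIX shell. The sample text "a b c" and the variable names single and double are made up for illustration. In double quotes the shell substitutes its own second positional parameter (set here with set --) for $2 before awk ever runs.

```shell
# Give the shell its own positional parameters so that "$2" is non-empty.
set -- first second

# Single quotes: the shell passes $2 to awk untouched, so awk prints field 2.
single=$(echo "a b c" | awk '{print $2}')

# Double quotes: the shell replaces $2 with "second" first, so awk receives
# the program {print second}. second is an unset awk variable, which prints
# as an empty string.
double=$(echo "a b c" | awk "{print $2}")

echo "single quotes: [$single]"
echo "double quotes: [$double]"
```

With single quotes awk prints b; with double quotes it prints an empty line, because awk never saw $2 at all.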

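awk -F 'custom separator' 'BEGIN{begin block}
/pattern1/{block that handles the lines matched by pattern1}
/pattern2/{block that handles the lines matched by pattern2}
...
/patternN/{block that handles the lines matched by patternN}
END{end block}'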

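-F sets the field separator; it is optional, and the default is whitespace (runs of spaces or tabs)
BEGIN{begin block} is optional
END{end block} is optional
BEGIN and END must be all uppercase; otherwise awk does not treat them as the special patterns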
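
The notes above can be tried with a small sketch like the following. The colon-separated sample data and the variable names data, with_blocks, and lowercase are hypothetical. It also shows what seems to happen with a lowercase begin: awk reads it as an ordinary unset variable used as a pattern, which is always false.

```shell
# Hypothetical colon-separated sample data.
data="root:0
z:1000"

# -F ':' splits on colons; BEGIN runs before any input is read,
# END runs after the last line (NR is the number of lines read).
with_blocks=$(printf '%s\n' "$data" |
awk -F ':' 'BEGIN{print "start"} /z/{print $1, $2} END{print "lines:", NR}')

# Lowercase begin is only an unset variable used as a pattern; it is
# always false, so its block never runs.
lowercase=$(printf '%s\n' "$data" |
awk -F ':' 'begin{print "start"} /z/{print $1, $2}')

echo "$with_blocks"
echo "---"
echo "$lowercase"
```

The first command prints start, z 1000, and lines: 2; the second prints only z 1000.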

echo "1 aaa aaa aaa
2 aaa aaa aaa
3 aaa aaa aaa
4 aaa aaa aaa
5 aaa aaa aaa
6 aaa aaa aaa
7 aaa aaa aaa"  | awk   '/[135]/{sub("aaa","b&b",$0);  print $0}'

Result

1 baaab aaa aaa
3 baaab aaa aaa
5 baaab aaa aaa

The only difference from the command above is that sub is changed to gsub

echo "1 aaa aaa aaa
2 aaa aaa aaa
3 aaa aaa aaa
4 aaa aaa aaa
5 aaa aaa aaa
6 aaa aaa aaa
7 aaa aaa aaa"  | awk   '/[135]/{gsub("aaa","b&b",$0);  print $0}'

Result:

1 baaab baaab baaab
3 baaab baaab baaab
5 baaab baaab baaab

Verbatim console session

[z@fedora root]$ echo "1 aaa aaa aaa
2 aaa aaa aaa
3 aaa aaa aaa
4 aaa aaa aaa
5 aaa aaa aaa
6 aaa aaa aaa
7 aaa aaa aaa"  | awk   '/[135]/{sub("aaa","b&b",$0);  print $0}'
1 baaab aaa aaa
3 baaab aaa aaa
5 baaab aaa aaa
[z@fedora root]$ echo "1 aaa aaa aaa
2 aaa aaa aaa
3 aaa aaa aaa
4 aaa aaa aaa
5 aaa aaa aaa
6 aaa aaa aaa
7 aaa aaa aaa"  | awk   '/[135]/{gsub("aaa","b&b",$0);  print $0}'
1 baaab baaab baaab
3 baaab baaab baaab
5 baaab baaab baaab

Explanation

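# /[135]/ selects the lines that contain 1, 3, or 5
# sub and gsub are replacement functions: sub replaces the first match on each line, gsub replaces every match on each line
# b&b puts the letter b on both sides of the match; & stands for the matched text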
awk   '/[135]/{sub("aaa","b&b",$0);  print $0}'
awk   '/[135]/{gsub("aaa","b&b",$0);  print $0}'
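
Two further details of sub and gsub, sketched under the same sample line: both functions return the number of replacements they made, and the third argument decides what gets edited (a field such as $3, or a plain variable). The variable names line, counts, field3, s, and g are made up for illustration.

```shell
line="1 aaa aaa aaa"

# Work on copies (s and g) so both calls start from the same text.
# sub returns 0 or 1, gsub returns the total number of replacements.
counts=$(echo "$line" |
awk '{s = $0; n1 = sub("aaa", "b&b", s); g = $0; n2 = gsub("aaa", "b&b", g); print n1, n2}')

# With $3 as the target, only the third field is changed.
field3=$(echo "$line" | awk '{sub("aaa", "b&b", $3); print $0}')

echo "$counts"
echo "$field3"
```

This prints 1 3 for the counts and 1 aaa baaab aaa for the field-only replacement.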



Example 2
echo "0 aaa aaa aaa
1 aaa aaa aaa
2 aaa aaa aaa
3 aaa aaa aaa
4 aaa aaa aaa
5 aaa aaa aaa
6 aaa aaa aaa
7 aaa aaa aaa
8 aaa aaa aaa
9 aaa aaa aaa"  | 
awk   'BEGIN{print "这是开始块"  } /[135]/{gsub("aaa","b&b",$0);  print $0} /[234]/{print $0} END{print "这是结束块" }'

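The commands above and below are the same: a line break is allowed inside a single- or double-quoted string that has not been closed yet, and after the pipe symbol |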

echo "0 aaa aaa aaa
1 aaa aaa aaa
2 aaa aaa aaa
3 aaa aaa aaa
4 aaa aaa aaa
5 aaa aaa aaa
6 aaa aaa aaa
7 aaa aaa aaa
8 aaa aaa aaa
9 aaa aaa aaa"  | 
awk   'BEGIN{print "这是开始块"        } 
/[135]/{gsub("aaa","b&b",$0);  print $0} 
/[234]/{print $0                       } 
END{print "这是结束块"                 }'

Result

这是开始块
1 baaab baaab baaab
2 aaa aaa aaa
3 baaab baaab baaab
3 baaab baaab baaab
4 aaa aaa aaa
5 baaab baaab baaab
这是结束块

Verbatim console output

[z@fedora root]$ echo "0 aaa aaa aaa
1 aaa aaa aaa
2 aaa aaa aaa
3 aaa aaa aaa
4 aaa aaa aaa
5 aaa aaa aaa
6 aaa aaa aaa
7 aaa aaa aaa
8 aaa aaa aaa
9 aaa aaa aaa"  |
awk   'BEGIN{print "这是开始块"        }
/[135]/{gsub("aaa","b&b",$0);  print $0}
/[234]/{print $0                       }
END{print "这是结束块"                 }'
这是开始块
1 baaab baaab baaab
2 aaa aaa aaa
3 baaab baaab baaab
3 baaab baaab baaab
4 aaa aaa aaa
5 baaab baaab baaab
这是结束块

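As the output shows, the input that the third block receives is affected by the second block (counting BEGIN as the first block).
The third block selects the lines containing 2, 3, or 4; the second block selects the lines containing 1, 3, or 5.
Line 3 is selected by both, so it appears twice: the second block modifies it, and the third block changes nothing and prints the line as the second block left it.
Lines 2 and 4 are not selected by the second block, so they stay in their original form and are printed by the third block.
(The strings 这是开始块 and 这是结束块 printed by BEGIN and END mean "this is the begin block" and "this is the end block".)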
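
If this carry-over between blocks is not wanted, two common ways around it are sketched below. Neither is part of the original example, and the shortened input and the names input, once, both, and copy are hypothetical. The first uses awk's next statement to stop processing the current line; the second edits a copy so that $0 stays unchanged for later blocks.

```shell
input="2 aaa
3 aaa
5 aaa"

# Variant 1: next skips the remaining blocks for this line,
# so line 3 is printed only once.
once=$(printf '%s\n' "$input" |
awk '/[135]/{gsub("aaa","b&b",$0); print $0; next} /[234]/{print $0}')

# Variant 2: edit a copy of the line, so the second block still sees
# the original $0 and line 3 is printed in both forms.
both=$(printf '%s\n' "$input" |
awk '/[135]/{copy = $0; gsub("aaa","b&b",copy); print copy} /[234]/{print $0}')

echo "$once"
echo "---"
echo "$both"
```

Variant 1 prints 2 aaa, 3 baaab, 5 baaab. Variant 2 prints 2 aaa, 3 baaab, 3 aaa, 5 baaab.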

Example 3
echo "0 aaa aaa aaa
1 aaa aaa aaa
2 aaa aaa aaa
3 aaa aaa aaa
4 aaa aaa aaa
5 aaa aaa aaa
6 aaa aaa aaa
7 aaa aaa aaa
8 aaa aaa aaa
9 aaa aaa aaa"  | 
awk   'BEGIN{print "这是开始块"               } 
/[135]/{gsub("aaa","b&b",$0);  print $0       } 
/[234]/{print $0                              } 
/[1579]/{sub("aaa","1579&1579",$0);  print $0 } 
END{print "这是结束块"                        }'

Result

这是开始块
1 baaab baaab baaab
1 b1579aaa1579b baaab baaab
2 aaa aaa aaa
3 baaab baaab baaab
3 baaab baaab baaab
4 aaa aaa aaa
5 baaab baaab baaab
5 b1579aaa1579b baaab baaab
7 1579aaa1579 aaa aaa
9 1579aaa1579 aaa aaa
这是结束块

Verbatim console output

[z@fedora root]$ echo "0 aaa aaa aaa
1 aaa aaa aaa
2 aaa aaa aaa
3 aaa aaa aaa
4 aaa aaa aaa
5 aaa aaa aaa
6 aaa aaa aaa
7 aaa aaa aaa
8 aaa aaa aaa
9 aaa aaa aaa"  |
awk   'BEGIN{print "这是开始块"               }
/[135]/{gsub("aaa","b&b",$0);  print $0       }
/[234]/{print $0                              }
/[1579]/{sub("aaa","1579&1579",$0);  print $0 }
END{print "这是结束块"                        }'
这是开始块
1 baaab baaab baaab
1 b1579aaa1579b baaab baaab
2 aaa aaa aaa
3 baaab baaab baaab
3 baaab baaab baaab
4 aaa aaa aaa
5 baaab baaab baaab
5 b1579aaa1579b baaab baaab
7 1579aaa1579 aaa aaa
9 1579aaa1579 aaa aaa
这是结束块
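
To see which block produced each output line in Example 3, one option is to prefix every print with a label. This is only a tracing sketch on two of the input lines; the labels rule1, rule2, rule3 and the variable name traced are made up for illustration.

```shell
# Same three pattern blocks as Example 3, each print tagged with its block.
traced=$(printf '%s\n' "1 aaa aaa aaa" "7 aaa aaa aaa" |
awk '/[135]/{gsub("aaa","b&b",$0); print "rule1:", $0}
/[234]/{print "rule2:", $0}
/[1579]/{sub("aaa","1579&1579",$0); print "rule3:", $0}')

echo "$traced"
```

Line 1 is printed by rule1 and then again by rule3, which sees the text that rule1 already changed (so the first aaa it finds sits inside baaab). Line 7 is matched by rule3 only and still has its original text.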




I found that the explanation on 菜鸟教程 (Runoob) is quite good and matches my understanding; see its section AWK 工作原理 (how AWK works)
