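The s/// operator replaces the part of a variable that matches a pattern with another string. Syntax: s/regular expression/replacement string/modifiers;
Replacement string: may use the capture variables $1, $2, and so on;
#(1) s///: substitution with a regular expression (almost any non-whitespace character can serve as the delimiter: s{}{}, s<><>, s###, etc.)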
$_ = "He's out bowling with Barney tonight.";
s#Barney#Fred#; # using # as the delimiter: s###
print "$_\n"; # prints: He's out bowling with Fred tonight.
s{with (\w+)}{against $1's team}; # using paired delimiters: s{}{}
print "$_\n"; # prints: He's out bowling against Fred's team tonight.
#(2) Using capture variables such as $1 and $2 in the replacement string
$_ = "one two three";
s/(\w+) (\w+)/$2 $1/;
print "$_\n"; # prints: two one three
s/^/huge,/; # insert text at the start of the string
print "$_\n"; # prints: huge,two one three
#(3) Modifiers: /g for global replacement
$_ = "home,sweet home\n";
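s/home/cave/g; # by default only the first match is replaced; /g replaces every match
print "$_\n"; # prints: cave,sweet cave
$_ = "Input data\t may have extra whitespace\n";
s/\s+/ /g; # collapse each run of whitespace into a single space
print "$_\n"; # prints: Input data may have extra whitespace
#(4) Return value: s/// returns true when a substitution was made, so it can be used in an if statement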
$_ = " one two ";
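s/^\s+//; # remove leading whitespace
s/\s+$//; # remove trailing whitespace
s/^\s+|\s+$//g; # combined: remove both leading and trailing whitespace (/g is required so that both ends are matched)
print "$_\n"; # prints: one two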
$_ = "fred flintstone";
if(s/fred/wilma/){
print "Successfully replaced fred with wilma!\n"; # prints: Successfully replaced fred with wilma!
print "$_\n"; # prints: wilma flintstone
}
#(5) The binding operator: =~
$content = "one another one"; # to operate on a variable other than the default $_, use the binding operator =~
$content =~ s/one/two/g;
print "$content\n"; # prints: two another two
#(6) Modifiers and case shifting: /ig
$_ = "I saw Barney with Fred.";
s/(fred|barney)/\U$1/ig; # \U: convert everything that follows to uppercase
print "$_\n"; # prints: I saw BARNEY with FRED.
s/(fred|barney)/\L$1/ig; # \L: convert everything that follows to lowercase
print "$_\n"; # prints: I saw barney with fred.
$_ = "this is one and two.";
s/(\w+) and (\w+)/\U$2\E and $1/; # with capture variables; \E ends the case conversion
print "$_\n"; # prints: this is TWO and one. Without \E it prints: this is TWO AND ONE.
$_ = "one and two";
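s/(one|two)/\u$1/gi; # \u: uppercase the first letter only
print "$_\n"; # prints: One and Two
s/(one|two)/\U$1/gi; # \U: uppercase everything
print "$_\n"; # prints: ONE and TWO
s/(one|two)/\l$1/gi; # \l: lowercase the first letter only
print "$_\n"; # prints: oNE and tWO
s/(one|two)/\u\L$1/gi; # \u\L: first letter uppercase, the rest lowercase
print "$_\n"; # prints: One and Two
#(7) Case shifting in interpolated strings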
$name = "bill";
print "Hello, \L\u$name\E, would you like to play a game?\n"; # interpolation; prints: Hello, Bill, would you like to play a game?
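The two points from section (4) that are easiest to get wrong are the combined trim and the return value of s///. The following is a minimal sketch, using illustrative variable names that are not part of the notes above, showing that the combined trim only works with /g and that s///g returns the number of substitutions made.

```perl
use strict;
use warnings;

# Trim both ends in one statement; /g lets both alternatives match.
my $padded = "  one two  ";
(my $trimmed = $padded) =~ s/^\s+|\s+$//g;
print "[$trimmed]\n";    # prints: [one two]

# Without /g only the first match (the leading whitespace) is removed.
(my $half = $padded) =~ s/^\s+|\s+$//;
print "[$half]\n";       # prints: [one two  ]

# With /g, s/// returns the number of substitutions (a false value when none).
my $text  = "home, sweet home";
my $count = ($text =~ s/home/cave/g);
print "$count replacement(s): $text\n";    # prints: 2 replacement(s): cave, sweet cave
```

Copying into a new variable before binding with =~ keeps the original string intact, which is handy when both versions are needed.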
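The split operator: breaks a string apart according to a pattern and returns the list of fields. Common separators: tab, colon, whitespace, or any other character.
Splitting rules: with no arguments, split breaks $_ on whitespace (like /\s+/, except that leading whitespace is skipped); split keeps leading empty fields but discards trailing empty fields;
@fields = split /:/, "::abc:def::g:h:::"; # the leading empty fields are kept, the trailing ones are dropped
foreach(@fields){
#@fields is ("","","abc","def","","g","h")
print "$_\n";
}
#my $some_input = "This is a \t test.\n";
$_ = "This is a \t test.\n";
#my @args = split /\s+/, $some_input; # splitting on /\s+/ treats each run of whitespace as a single separator
@args = split; # equivalent to: @args = split ' ', $_; (like /\s+/, but leading whitespace is skipped)
foreach(@args){
#@args is ("This","is","a","test.")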
print "$_\n";
}
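The rules about empty fields and leading whitespace can be checked with a short script. This is only a sketch with illustrative variable names; it relies on split's optional third argument (LIMIT), where a negative value keeps trailing empty fields.

```perl
use strict;
use warnings;

my $record = "::abc:def::g:h:::";

# Default behavior: leading empty fields kept, trailing empty fields dropped.
my @default = split /:/, $record;
print scalar(@default), "\n";    # prints: 7

# A negative LIMIT keeps the trailing empty fields as well.
my @all = split /:/, $record, -1;
print scalar(@all), "\n";        # prints: 10

# split ' ' (what a bare split uses) skips leading whitespace,
# while /\s+/ produces an empty first field.
my $line  = "  This is a \t test.\n";
my @awk   = split ' ', $line;
my @regex = split /\s+/, $line;
print scalar(@awk), " ", scalar(@regex), "\n";    # prints: 4 5
```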
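The join function: the opposite of split; join glues a list of string pieces into a single string. The glue can be any string.
Difference between join and split: the first argument of join is a string, while the first argument of split is a pattern.
#join: gluing pieces together
@y = (4, 6, 8, 10, 2); # a list
$x = join ":", @y; # not limited to a colon; any other string works as glue
print "$x\n"; # prints: 4:6:8:10:2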
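Because join and split are inverses, a round trip is a quick way to see the string-versus-pattern difference. A small sketch, with variable names chosen only for illustration:

```perl
use strict;
use warnings;

my @values = (4, 6, 8, 10, 2);

# join takes a plain string as glue.
my $joined = join "-", @values;
print "$joined\n";    # prints: 4-6-8-10-2

# split takes a pattern and recovers the original pieces.
my @back = split /-/, $joined;
print scalar(@back), "\n";    # prints: 5

# With a single element there is nothing to glue, so no separator appears.
my $single = join ":", "only";
print "$single\n";    # prints: only
```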
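Using the pattern match operator (m//) in list context: a successful match returns the list of captured values, and a failed match returns an empty list.
Note: the /g modifier of s/// also works with m//, letting the pattern match at multiple places in the string.
# m// in list context
$_ = "Hello there, neighbor!";
my($first, $second, $third) = /(\S+) (\S+), (\S+)/; # m// in list context
print "$first $second $third\n"; # prints: Hello there neighbor!
my $text = "Fred dropped a 5 ton granite block on Mr.Slate";
my @word = ($text =~ /[a-z]+/ig); # each match is stored in the array, much like a loop; the digit 5 is skipped
print "@word\n"; # prints: Fred dropped a ton granite block on Mr Slate
my $data = "Barney Rubble Fred Flintstone Wilma Flintstone Bill Gates";
my %last_name = ($data =~ /(\w+)\s(\w+)/g); # with several capture groups, each match returns several values; the resulting pairs become a hash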
while(($key,$value) = each %last_name){
print "$key => $value\n";
}
Output (hash order is not guaranteed, so the lines may appear in a different order):
Wilma => Flintstone
Barney => Rubble
Bill => Gates
Fred => Flintstone
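Since each over a hash yields no fixed order, sorting the keys gives repeatable output. The sketch below reuses the data string from the example; the loop variable and the word counter are illustrative additions.

```perl
use strict;
use warnings;

my $data = "Barney Rubble Fred Flintstone Wilma Flintstone Bill Gates";

# Each match contributes a (first name, last name) pair to the hash.
my %last_name = ($data =~ /(\w+)\s+(\w+)/g);

# Sort the keys so the output order is stable.
for my $first (sort keys %last_name) {
    print "$first => $last_name{$first}\n";
}

# Assigning the match list to () and then to a scalar counts the matches.
my $words = () = $data =~ /\w+/g;
print "$words\n";    # prints: 8
```

With sorted keys the pairs come out as Barney, Bill, Fred, Wilma.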
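Greedy versus non-greedy quantifiers: see the list of greedy quantifiers in the previous chapter for details.
# a non-greedy quantifier is written by appending a question mark ?
$_ = "I thought you said Fred and <BOLD>Velma</BOLD>, not <BOLD>Wilma</BOLD>.";
s#<BOLD>(.*?)</BOLD>#$1#g; # with ?, the capture stops at the nearest </BOLD> (before the comma); without it, the capture runs on to the last </BOLD> (before the period); /g replaces globally
print "$_\n"; # prints: I thought you said Fred and Velma, not Wilma.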
$_ = "hellooooooooooo";
if(/(hello+)/){
# greedy: + captures every o
print "$1\n"; # prints: hellooooooooooo
}
if(/(hello+?)/){
# non-greedy: +? captures a single o
print "$1\n"; # prints: hello
}
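To see what the question mark changes, it helps to run the greedy and non-greedy versions side by side on the same input. A minimal sketch, assuming a string with two <BOLD> pairs; the variable names are illustrative:

```perl
use strict;
use warnings;

my $html = "Fred and <BOLD>Velma</BOLD>, not <BOLD>Wilma</BOLD>.";

# Non-greedy: each match stops at the nearest closing tag.
(my $lazy = $html) =~ s#<BOLD>(.*?)</BOLD>#$1#g;
print "$lazy\n";      # prints: Fred and Velma, not Wilma.

# Greedy: a single match runs from the first <BOLD> to the last </BOLD>,
# so the inner tags are left in place.
(my $greedy = $html) =~ s#<BOLD>(.*)</BOLD>#$1#g;
print "$greedy\n";    # prints: Fred and Velma</BOLD>, not <BOLD>Wilma.
```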
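By default, ^ and $ anchor to the start and end of the whole string, even when it contains several lines. To process multi-line text line by line, use the /m modifier.
Note on the /m modifier:
/m changes the meaning of ^ and $ in the regular expression so that they match at the start and end of each line.
# pattern matching across lines
$_ = "this is the first line\nthis is the second line\nthis is the third line\n";
s/^this/that/mg; # /m: multi-line mode (^ now matches at the start of each of the three lines)
print;
Output: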
that is the first line
that is the second line
that is the third line
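The effect of /m is clearest when the same substitution is run with and without it. A short sketch with illustrative variable names:

```perl
use strict;
use warnings;

my $text = "this is the first line\nthis is the second line\n";

# Without /m, ^ matches only at the very start of the string.
(my $without_m = $text) =~ s/^this/that/g;
print $without_m;
# prints:
# that is the first line
# this is the second line

# With /m, ^ also matches right after each embedded newline.
(my $with_m = $text) =~ s/^this/that/mg;
print $with_m;
# prints:
# that is the first line
# that is the second line
```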
my $date = localtime;
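$^I = "*.bak"; # enables in-place editing: each file is rewritten and the original is kept as a backup with a .bak extension; without this line the files are left untouched and the modified text is only printed to the terminal
while(<>){
s/^Author:.*/Author: Randal L. Schwartz/; # update the Author line in each file named on the command line
s/^Date:.*/Date: $date/; # update the Date line in each file named on the command line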
print;
}
Run: perl file_name.pl a.txt (updates a single file)
Run: perl file_name.pl a.txt b.txt c.txt (several file names may be given to update multiple files in one pass)
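The same in-place update can be tried without touching real files. The sketch below is an illustration under stated assumptions: it writes a hypothetical demo.txt into a temporary directory (via the core File::Temp module), sets @ARGV by hand instead of taking names from the command line, and uses the plain ".bak" suffix form of $^I.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Create a throwaway file to edit (demo.txt is a made-up name).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/demo.txt";
open my $out, '>', $file or die "Cannot write $file: $!";
print {$out} "Author: unknown\nDate: unknown\nBody text\n";
close $out;

{
    # Localize @ARGV and $^I so the in-place mode ends with this block.
    local @ARGV = ($file);
    local $^I   = '.bak';    # keep the original as demo.txt.bak
    while (<>) {
        s/^Author:.*/Author: Randal L. Schwartz/;
        print;               # goes into the rewritten file, not the terminal
    }
}

# Read the file back to confirm the change.
open my $in, '<', $file or die "Cannot read $file: $!";
my @lines = <$in>;
close $in;
print @lines;
print "backup kept\n" if -e "$file.bak";
```

Localizing $^I keeps later print statements in a larger script from being redirected into a file by accident.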