Learning Perl Study Notes (8): Chapter 9, Processing Text with Regular Expressions

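This chapter covers text processing with regular expressions. All examples in these notes are taken from the seventh edition of Learning Perl (original English edition).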

(I) Substitutions with s///

The general form is:

s/PATTERN/REPLACEMENT/

For example:

$_ = "He's out bowling with Barney tonight.";
s/Barney/Fred/; # replace Barney with Fred in the string held in $_
print "$_\n";
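
The example above works on $_. As a minimal sketch of the same substitution on a named variable, the snippet below uses the =~ binding operator and checks the return value of s///, which is true only when something was replaced. The variable names and the Wilma/Betty pattern are illustrative assumptions, not taken from the notes.

```perl
use strict;
use warnings;

my $text = "He's out bowling with Barney tonight.";

# s/// returns a true value when it replaced something, false otherwise.
if ($text =~ s/Barney/Fred/) {
    print "replaced: $text\n";
}

# There is no "Wilma" in the string, so nothing changes and the result is false.
my $changed = ($text =~ s/Wilma/Betty/) ? "yes" : "no";
print "second substitution changed the string: $changed\n";
```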

(1) Global replacement with /g

By default, s/// replaces only the first match. The /g modifier tells s/// to replace every non-overlapping match:

$_ = "home, sweet home!";
s/home/cave/g;
print "$_\n";

Result:

cave, sweet cave! # every home was replaced with cave
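
To make the difference concrete, here is a small side-by-side sketch of the same substitution with and without /g. It also relies on s///g returning the number of replacements it made; the variable names are illustrative.

```perl
use strict;
use warnings;

my $once = "home, sweet home!";
my $all  = "home, sweet home!";

$once =~ s/home/cave/;                 # replaces only the first match
my $count = ($all =~ s/home/cave/g);   # replaces every match and returns how many

print "$once\n";                       # cave, sweet home!
print "$all ($count replacements)\n";  # cave, sweet cave! (2 replacements)
```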

The pattern does not have to be a literal string. For example, you can collapse whitespace:

$_ = "Input data\t may have extra whitespace.";
s/\s+/ /g; # collapse every run of whitespace (here, the tab and the space after it) into a single space

The result should be:

Input data may have extra whitespace.
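
The same idea can be wrapped in a small helper. This is only a sketch: collapse_ws is a hypothetical name, and the input string is made up to include spaces, a tab, and a newline.

```perl
use strict;
use warnings;

# collapse_ws is a hypothetical helper name, not something from the book.
sub collapse_ws {
    my ($s) = @_;
    $s =~ s/\s+/ /g;   # any run of spaces, tabs, or newlines becomes one space
    return $s;
}

my $clean = collapse_ws("Input  data\t may   have extra\n whitespace.");
print "$clean\n";      # Input data may have extra whitespace.
```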
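
(2) Different delimiters

Besides /, you can use other delimiters such as #, or paired delimiters such as braces and brackets. With paired delimiters, the pattern and the replacement each get their own pair, and the two pairs may differ: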

s#\Ahttps://#http://#;
s{fred}{barney};
s[fred](barney);
s<fred>#barney#;
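
A short sketch of why alternative delimiters are useful: with / as the delimiter, every slash in a URL has to be escaped, while # or braces avoid that. The URL below is a placeholder chosen for illustration.

```perl
use strict;
use warnings;

my $url_a = "https://www.example.com/";   # placeholder URL, an assumption
my $url_b = $url_a;

$url_a =~ s/\Ahttps:\/\//http:\/\//;      # default delimiter: slashes must be escaped
$url_b =~ s#\Ahttps://#http://#;          # same substitution, easier to read

my $name = "fred";
$name =~ s{fred}{barney};                 # paired delimiters around each part

print "$url_a\n$url_b\n$name\n";
```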

(3) Case shifting

Use \U to force everything that follows to uppercase:

$_ = "I saw Barney with Fred.";
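s/(fred|barney)/\U$1/gi; # $_ is now "I saw BARNEY with FRED."; the i modifier makes the match case-insensitive

Use \L to force lowercase. Each of the following examples continues from the result left in $_ by the previous one:

s/(fred|barney)/\L$1/gi; # $_ is now "I saw barney with fred."

To change the case of only part of the replacement, end the shift with \E:

s/(\w+) with (\w+)/\U$2\E with $1/i; # $_ is now "I saw FRED with barney."
# Note that only the second capture (fred) is uppercased, because \E turns \U off

To lowercase everything except the first letter, which is uppercased, combine \u with \L:

s/(fred|barney)/\u\L$1/ig; # $_ is now "I saw Fred with Barney."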
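
The four substitutions above can be run as one script to see how each builds on the previous result. This is a sketch with a named variable instead of $_; the @steps array exists only to record each intermediate string.

```perl
use strict;
use warnings;

my @steps;
my $s = "I saw Barney with Fred.";

$s =~ s/(fred|barney)/\U$1/gi;              push @steps, $s;  # uppercase both names
$s =~ s/(fred|barney)/\L$1/gi;              push @steps, $s;  # lowercase both names
$s =~ s/(\w+) with (\w+)/\U$2\E with $1/i;  push @steps, $s;  # swap, uppercase only $2
$s =~ s/(fred|barney)/\u\L$1/gi;            push @steps, $s;  # capitalize each name

print "$_\n" for @steps;
# I saw BARNEY with FRED.
# I saw barney with fred.
# I saw FRED with barney.
# I saw Fred with Barney.
```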

(II) The split operator

The general form is:

my @fields = split /separator/, $string;

Splitting on colons:

my @fields = split /:/, "abc:def:g:h"; # gives ("abc", "def", "g", "h")
my @fields = split /:/, "abc:def::g:h"; # gives ("abc", "def", "", "g", "h")

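One rule to remember: leading empty fields are always kept, but trailing empty fields are discarded:

my @fields = split /:/, ":::a:b:c:::"; # gives ("", "", "", "a", "b", "c"); the empty fields after c are dropped

To keep the trailing empty fields, pass -1 as a third argument:

my @fields = split /:/, ":::a:b:c:::", -1; # gives ("", "", "", "a", "b", "c", "", "", "")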
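
Counting the fields makes the trailing-field rule easy to verify. A minimal sketch, with illustrative variable names:

```perl
use strict;
use warnings;

my $line = ":::a:b:c:::";

my @default  = split /:/, $line;       # trailing empty fields are dropped
my @keep_all = split /:/, $line, -1;   # a negative limit keeps them

printf "default:  %d fields\n", scalar @default;    # 6
printf "limit -1: %d fields\n", scalar @keep_all;   # 9
```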

Splitting on whitespace, where any run of whitespace counts as a single separator:

my $some_input = "This    is a \t                    test.\n";
my @args = split /\s+/, $some_input; # ("This", "is", "a", "test.")
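
One detail worth checking when splitting on /\s+/: if the input starts with whitespace, the leading-empty-field rule applies and the first field is an empty string. The sketch below compares that with a single-space string as the separator, a special case of split that skips leading whitespace. The input string here is a variation on the one above, with leading spaces added for illustration.

```perl
use strict;
use warnings;

my $some_input = "  This    is a \t test.\n";

my @by_regex = split /\s+/, $some_input;   # ("", "This", "is", "a", "test.")
my @by_space = split ' ',   $some_input;   # ("This", "is", "a", "test.")

printf "%d fields with /\\s+/, %d fields with ' '\n",
    scalar @by_regex, scalar @by_space;
```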

(III) The join function

The general form is:

my $result = join $glue, @pieces;

For example:

my $x = join ":", 4, 6, 8, 10, 12; # $x is "4:6:8:10:12"
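
Since join is the counterpart of split, the two are often used together. A small sketch that splits on one separator and rejoins with another; it also shows that a single piece is returned unchanged, because the glue only goes between pieces. The variable names are illustrative.

```perl
use strict;
use warnings;

my @pieces = split /:/, "4:6:8:10:12";
my $dashed = join "-", @pieces;    # "4-6-8-10-12"
my $single = join ":", "solo";     # "solo": no glue is added for one piece

print "$dashed\n$single\n";
```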

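(IV) Word boundaries

The previous two chapters introduced \b as the word-boundary anchor. However, \b only looks for a switch between \w and non-\w characters, so it treats the apostrophe inside a word such as doesn't as a boundary. Using \b here produces the following: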

my $string = "This doesn't capitalize correctly.";
$string =~ s/\b(\w)/\U$1/g;
print "$string\n";

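This gives:

This Doesn'T Capitalize Correctly. # the t after the apostrophe was treated as the start of a separate word

How can this be avoided? Use \b{wb} (available since Perl 5.22), which follows the Unicode rules for word boundaries: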

my $string = "this doesn't capitalize correctly.";
$string =~ s/\b{wb}(\w)/\U$1/g;
print "$string\n";

Result:

This Doesn't Capitalize Correctly.
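
The two anchors can be compared directly on the same input. This sketch assumes Perl 5.22 or later, since \b{wb} is not available in older versions; the variable names are illustrative.

```perl
use v5.22;      # \b{wb} needs Perl 5.22 or later
use warnings;

my $plain = "this doesn't capitalize correctly.";
my $smart = $plain;

$plain =~ s/\b(\w)/\U$1/g;       # \b sees a boundary on both sides of the apostrophe
$smart =~ s/\b{wb}(\w)/\U$1/g;   # \b{wb} keeps doesn't together as one word

print "$plain\n";   # This Doesn'T Capitalize Correctly.
print "$smart\n";   # This Doesn't Capitalize Correctly.
```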
