Regular Expressions

 

For example, the following regular expression matches any string containing the text ``Perl'' or the text ``Python''.

/Perl|Python/

 

The forward slashes delimit the pattern, which consists of the two things we're matching, separated by a pipe character (``|''). You can use parentheses within patterns, just as you can in arithmetic expressions, so you could also have written this pattern as

/P(erl|ython)/
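As a quick check that the two spellings are interchangeable, the sketch below runs both against the same list. The sample words and variable names are made up for illustration.

```ruby
flat    = /Perl|Python/
grouped = /P(erl|ython)/   # parentheses factor out the shared "P"

samples = ["Perl", "Python", "Pascal"]   # hypothetical sample data

# Array#grep keeps the elements that match the pattern.
flat_hits    = samples.grep(flat)
grouped_hits = samples.grep(grouped)

puts flat_hits.inspect      # ["Perl", "Python"]
puts grouped_hits.inspect   # ["Perl", "Python"]
```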

 

You can also specify repetition within patterns. /ab+c/ matches a string containing an ``a'' followed by one or more ``b''s, followed by a ``c''. Change the plus to an asterisk, and /ab*c/ creates a regular expression that matches an ``a'', zero or more ``b''s, and a ``c''.
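The difference between the two repetition operators shows up on a string with no ``b'' at all. A minimal sketch, using invented sample strings:

```ruby
words = ["ac", "abc", "abbbc", "xyz"]   # hypothetical sample data

plus_hits = words.grep(/ab+c/)   # needs at least one "b"
star_hits = words.grep(/ab*c/)   # "b" is optional, so "ac" qualifies too

puts plus_hits.inspect   # ["abc", "abbbc"]
puts star_hits.inspect   # ["ac", "abc", "abbbc"]
```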

 

You can also match one of a group of characters within a pattern. Some common examples are character classes such as ``\s'', which matches a whitespace character (space, tab, newline, and so on), ``\d'', which matches any digit, and ``\w'', which matches any character that may appear in a typical word (a letter, a digit, or an underscore). The single character ``.'' (a period) matches any character except a newline, unless the pattern is given the m option.
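One way to see what each class picks up is to collect the matches with String#scan. This is only a sketch: the sample text is invented, and the last two lines assume Ruby's default behavior for ``.'' at a newline.

```ruby
text = "Room 42,\tfloor 7"   # sample string, made up for this sketch

digits     = text.scan(/\d/)    # every single digit
words      = text.scan(/\w+/)   # runs of word characters
whitespace = text.scan(/\s/)    # each whitespace character

# "." does not cross a newline by default; the m option changes that.
dot_default   = "a\nb" =~ /a.b/    # nil
dot_multiline = "a\nb" =~ /a.b/m   # 0

puts digits.inspect   # ["4", "2", "7"]
puts words.inspect    # ["Room", "42", "floor", "7"]
```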

 

We can put all this together to produce some useful regular expressions.

/\d\d:\d\d:\d\d/     # a time such as 12:34:56
/Perl.*Python/       # Perl, zero or more other chars, then Python
/Perl\s+Python/      # Perl, one or more whitespace chars, then Python
/Ruby (Perl|Python)/ # Ruby, a space, and either Perl or Python
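To try these patterns out, String#[] with a regular expression returns the matched text, or nil when nothing matches. The sample strings below are invented for illustration. Note that the time pattern checks only the shape of the text, not whether it is a valid time.

```ruby
log_line = "Backup finished at 12:34:56 on host alpha"   # hypothetical input
sentence = "Perl is older than Python"                    # hypothetical input

time      = log_line[/\d\d:\d\d:\d\d/]          # "12:34:56"
bogus     = "99:99:99"[/\d\d:\d\d:\d\d/]        # still matches: digits only
far_apart = sentence[/Perl.*Python/]            # the whole sentence
adjacent  = sentence[/Perl\s+Python/]           # nil: words are in between
pair      = "Ruby Python"[/Ruby (Perl|Python)/] # "Ruby Python"

puts time
puts adjacent.inspect
```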

 

Once you have created a pattern, it seems a shame not to use it. The match operator ``=~'' can be used to match a string against a regular expression. If the pattern is found in the string, =~ returns the position at which the match starts; otherwise it returns nil. Because nil counts as false and every integer (including 0) counts as true, you can use regular expressions as the condition in if and while statements. For example, the following code fragment writes a message if a string contains the text 'Perl' or 'Python'.

if line =~ /Perl|Python/
  puts "Scripting language mentioned: #{line}"
end
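The return values of =~ can be inspected directly. In this sketch the sample line is invented, and Regexp.last_match is used to read back the text of the most recent successful match.

```ruby
line = "I prefer Python to Perl"   # hypothetical input

position = (line =~ /Perl|Python/)   # index of the leftmost match
matched  = Regexp.last_match(0)      # text of that match
missing  = (line =~ /Ruby/)          # nil, because "Ruby" does not occur

puts position          # 9
puts matched           # Python
puts missing.inspect   # nil
```

The leftmost match wins, so the position reported is that of ``Python'' even though ``Perl'' comes first in the pattern.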

 

The part of a string matched by a regular expression can also be replaced with different text using one of Ruby's substitution methods.

line.sub(/Perl/, 'Ruby')    # replace first 'Perl' with 'Ruby'
line.gsub(/Python/, 'Ruby') # replace every 'Python' with 'Ruby'
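Both methods return a new string and leave the receiver untouched, so the result has to be captured to be useful. The sketch below uses an invented sample line and also shows gsub!, the variant that modifies the string in place.

```ruby
line = "Perl and Python, Perl and Python"   # hypothetical input

first_only = line.sub(/Perl/, 'Ruby')       # only the first 'Perl' changes
every      = line.gsub(/Python/, 'Ruby')    # every 'Python' changes

copy = line.dup                      # work on a copy to keep line intact
copy.gsub!(/Perl|Python/, 'Ruby')    # the bang variant edits copy itself

puts first_only   # Ruby and Python, Perl and Python
puts every        # Perl and Ruby, Perl and Ruby
puts copy         # Ruby and Ruby, Ruby and Ruby
puts line         # Perl and Python, Perl and Python
```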

 

 

 

 
