Regular expressions are created implicitly by the literals /pattern/ and %r{pattern}. The %r construct is one form of general delimited input.
/pattern/
/pattern/options
%r{pattern}
%r{pattern}options
Regexp.new('pattern'[, options])

Regular expression options:
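The option letters written after a literal correspond to constants that can be passed as the second argument of Regexp.new. The sketch below is only an illustration of that correspondence for i, m, and x, not a complete reference; the helper name same_pattern? is made up for this example.

```ruby
# Hypothetical helper: true when two regexps share source and options.
def same_pattern?(a, b)
  a.source == b.source && a.options == b.options
end

# i: ignore case
puts same_pattern?(/ruby/i, Regexp.new('ruby', Regexp::IGNORECASE))  # true
# m: let . match a newline as well
puts same_pattern?(/a.b/m, Regexp.new('a.b', Regexp::MULTILINE))     # true
# x: extended syntax, whitespace inside the pattern is ignored
puts same_pattern?(/a b/x, Regexp.new('a b', Regexp::EXTENDED))      # true

# Options can be combined with |.
both = Regexp.new('a.b', Regexp::IGNORECASE | Regexp::MULTILINE)
puts both.match?("A\nB")                                              # true
```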
# Returns the string with the matched part wrapped in << >>, or "no match".
def show_regexp(a, re)
  if a =~ re
    "#{$`}<<#{$&}>>#{$'}"
  else
    "no match"
  end
end

show_regexp('very interesting', /t/) -> very in<<t>>eresting
show_regexp("The ruby program", /the/) -> no match
show_regexp("The ruby program", /the/i) -> <<The>> ruby program
show_regexp("abc\n123", /.+/) -> <<abc>>\n123
show_regexp("abc\n123", /.+/m) -> <<abc\n123>>

Regular expression patterns:
show_regexp("this is\nthe time", /^the/) -> this is\n<<the>> time
show_regexp("this is\nthe time", /is$/) -> this <<is>>\nthe time
show_regexp("this is\nthe time", /\Athis/) -> <<this>> is\nthe time
show_regexp("this is\nthe time", /\Athe/) -> no match
show_regexp("this is\nthe time", /\bis/) -> this <<is>>\nthe time
show_regexp("this is\nthe time", /\Bis/) -> th<<is>> is\nthe time
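Because ^ and $ match at line boundaries while \A matches only at the start of the string, the choice matters when a whole string has to match. The following is a minimal sketch of that difference. The method names are invented for illustration, and \z (end of string), which the examples above do not use, is the counterpart of \A.

```ruby
# Line anchors: true if ANY line consists only of digits.
def some_line_all_digits?(s)
  s.match?(/^\d+$/)
end

# String anchors: true only if the WHOLE string consists of digits.
def whole_string_all_digits?(s)
  s.match?(/\A\d+\z/)
end

input = "123\nabc"
puts some_line_all_digits?(input)     # true
puts whole_string_all_digits?(input)  # false
puts whole_string_all_digits?("123")  # true
```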
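POSIX character classes | |
[:alnum:] | Letters and digits |
[:alpha:] | Uppercase or lowercase letters |
[:blank:] | Space and tab |
[:cntrl:] | Control characters (at least 0x00-0x1f, 0x7f) |
[:digit:] | Digits |
[:graph:] | Printable characters except space |
[:lower:] | Lowercase letters |
[:print:] | Any printable character (including space) |
[:punct:] | Printable characters except space and alphanumerics |
[:space:] | Whitespace (same as \s) |
[:upper:] | Uppercase letters |
[:xdigit:] | Hexadecimal digits (0-9, a-f, A-F) |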
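A POSIX class is only valid inside a bracket expression, which is why it appears with double brackets, as in [[:digit:]]. As a rough sketch of how the classes in the table behave, the hypothetical helper count_classes below counts how many characters of a string fall into a few of them.

```ruby
# Hypothetical helper: count the characters that belong to some POSIX classes.
def count_classes(text)
  {
    digit:  text.scan(/[[:digit:]]/).size,
    upper:  text.scan(/[[:upper:]]/).size,
    space:  text.scan(/[[:space:]]/).size,
    xdigit: text.scan(/[[:xdigit:]]/).size
  }
end

# "Id 7F, x9" has 2 digits (7, 9), 2 uppercase letters (I, F),
# 2 spaces, and 4 hexadecimal digits (d, 7, F, 9).
p count_classes("Id 7F, x9")
```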
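show_regexp('Price $12.', /[[:digit:]]/) -> Price $<<1>>2.
show_regexp('Price $12.', /[[:punct:]aeiou]/) -> Pr<<i>>ce $12.

\d, \s, and \w are shorthand character classes that match digits, whitespace, and word characters respectively. \w is equivalent to [A-Za-z0-9_].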
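As a small illustration of the three shorthands, the sketch below tokenizes a string with each of them. The method names are hypothetical. Note that the underscore counts as a word character, so user_id stays in one piece.

```ruby
# Runs of word characters (letters, digits, underscore).
def word_tokens(text)
  text.scan(/\w+/)
end

# Runs of digits.
def digit_runs(text)
  text.scan(/\d+/)
end

# Fields separated by any amount of whitespace.
def fields(text)
  text.split(/\s+/)
end

line = "user_id = 42"
p word_tokens(line)  # ["user_id", "42"]
p digit_runs(line)   # ["42"]
p fields(line)       # ["user_id", "=", "42"]
```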
a = "The moon is made of cheese"
show_regexp(a, /\w+/) -> <<The>> moon is made of cheese
show_regexp(a, /\s.*\s/) -> The<< moon is made of >>cheese
show_regexp(a, /\s.*?\s/) -> The<< moon >>is made of cheese
show_regexp(a, /[aeiou]{2,99}/) -> The m<<oo>>n is made of cheese
show_regexp(a, /mo?o/) -> The <<moo>>n is made of cheese

re1|re2 matches either re1 or re2. The | operator has very low precedence.
a = "red ball blue sky"
show_regexp(a, /blue|red/) -> <<red>> ball blue sky
show_regexp(a, /red ball|angry sky/) -> <<red ball>> blue sky

Because | has such low precedence, the last pattern matches "red ball" or "angry sky", not "red ball sky" or "red angry sky". Parentheses can be used to change this.
show_regexp('banana', /an*/) -> b<<an>>ana
show_regexp('banana', /(an)*/) -> <<>>banana
show_regexp('banana', /(an)+/) -> b<<anan>>a
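The effect of parentheses on the precedence of | can be seen by comparing an ungrouped and a grouped alternation. This is only a sketch; the constant names LOOSE and GROUPED are invented for the example.

```ruby
LOOSE   = /red ball|angry sky/    # "red ball" OR "angry sky"
GROUPED = /red (ball|angry) sky/  # "red ball sky" OR "red angry sky"

p "red ball blue sky"[LOOSE]    # "red ball"
p "red ball blue sky"[GROUPED]  # nil
p "red angry sky"[GROUPED]      # "red angry sky"
p "red angry sky"[LOOSE]        # "angry sky"
```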
Using () for grouping
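# match a repeated group (doubled letters or substrings)
show_regexp('He said "Hello"', /(\w)\1/) -> He said "He<<ll>>o"
show_regexp('Mississippi', /(\w+)\1/) -> M<<ississ>>ippi
# match delimiters
show_regexp('He said "Hello"', /(["']).*?\1/) -> He said <<"Hello">>

\G is used with the global matching methods String#gsub, String#gsub!, String#index, and String#scan. In a repeated match, it represents the position in the string where the last match of the iteration ended. Initially, \G points to the start of the string, or to the character referenced by the second argument of String#index.

"a01b23c45 d56".scan(/[a-z]\d+/) -> ["a01", "b23", "c45", "d56"]
"a01b23c45 d56".scan(/\G[a-z]\d+/) -> ["a01", "b23", "c45"]

#{...} performs expression substitution, as in a string. By default, the substitution is performed each time the regular expression literal is evaluated. If the /o option is set, it is performed only the first time.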
date = "12/25/01"
date =~ %r{(\d+)(/|:)(\d+)(/|:)(\d+)}
[$1, $2, $3, $4, $5] -> ["12", "/", "25", "/", "01"]
# (?:...) groups without capturing, so the separators get no $n variable
date =~ %r{(\d+)(?:/|:)(\d+)(?:/|:)(\d+)}
[$1, $2, $3] -> ["12", "25", "01"]

(?=re)
str = "red, white, and blue"
str.scan(/[a-z]+(?=,)/) -> ["red", "white"]

(?!re)
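Negative lookahead is the mirror image of (?=re): it succeeds only when re does not match at the current position, and it consumes no characters. The sketch below reuses the string from the lookahead example; the method names are invented. It also shows a common pitfall: without word boundaries, the engine backtracks inside a word until the next character is no longer a comma.

```ruby
# Naive version: [a-z]+ may give back characters to satisfy (?!,).
def not_before_comma_naive(str)
  str.scan(/[a-z]+(?!,)/)
end

# \b forces whole words, so words followed by a comma are skipped.
def words_not_before_comma(str)
  str.scan(/\b[a-z]+\b(?!,)/)
end

str = "red, white, and blue"
p not_before_comma_naive(str)  # ["re", "whit", "and", "blue"]
p words_not_before_comma(str)  # ["and", "blue"]
```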
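(?>re) nests an independent regular expression inside the enclosing one, anchored at the current match position. The characters it consumes are no longer available to the enclosing regular expression, so this construct suppresses backtracking, which can improve performance.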
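That the consumed characters are really withheld from the enclosing expression shows up in what matches, not only in speed. A minimal sketch with hypothetical method names: the ordinary pattern can give back a "b" to satisfy the final b, the atomic one cannot.

```ruby
# Ordinary group: .* backtracks so the trailing b can still match.
def backtracking_match(s)
  s[/a.*bb/]
end

# Atomic group: once (?>.*b) has matched, nothing is given back.
def atomic_match(s)
  s[/a(?>.*b)b/]
end

p backtracking_match("abbb")  # "abbb"
p atomic_match("abbb")        # nil, the group has already taken the last b

# The atomic group still matches when no backtracking is needed.
p "abbc"[/a(?>.*b)c/]         # "abbc"
```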
str = "a" + ("b" * 5000)
str =~ /^a(?>.*b).*a/   # this pattern performs better than the one below
str =~ /^a.*b.*a/

(?imx)