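Python regular expressions
In Python 3.5, str patterns are matched in Unicode mode by default, so \w matches Unicode word characters (for example Chinese characters) directly.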
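As a minimal sketch of what Unicode mode means in practice, the snippet below compares the default behavior with the re.ASCII flag, which restricts \w to [a-zA-Z0-9_]. The variable names are illustrative only; the sample string is the one used throughout this note.

```python
import re

# Default for str patterns in Python 3: \w matches Unicode word characters.
unicode_word = re.compile(r'\w+')
# With re.ASCII, \w only matches [a-zA-Z0-9_].
ascii_word = re.compile(r'\w+', re.ASCII)

sample = "测试201710"

unicode_result = unicode_word.findall(sample)
ascii_result = ascii_word.findall(sample)

print(unicode_result)  # ['测试201710']
print(ascii_result)    # ['201710']
```

If a pattern should only accept ASCII letters and digits, passing re.ASCII (or writing an explicit character class) is probably the safer choice.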
match
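match tries to match from the beginning of the string. It succeeds as long as the start of the string matches the pattern; the pattern does not have to match the entire string.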
>>> import re
>>> pattern = re.compile(r'\w+\d{6}')
>>> pattern.match("测试201710")  # returns a Match object on success, None otherwise
<_sre.SRE_Match object; span=(0, 8), match='测试201710'>
>>> pattern.match("#测试201710")  # returns None because "#" is not matched by \w, so nothing is printed
search
search scans through the string and succeeds if a match is found anywhere in it.
>>> pattern.search("#测试201710")
<_sre.SRE_Match object; span=(1, 9), match='测试201710'>
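To make the difference concrete, here is a small sketch that runs match, search, and fullmatch with the same pattern. fullmatch is not covered above; it is included only to contrast with match, which does not require the whole string to match. The input with the "-extra" suffix is a made-up example.

```python
import re

pattern = re.compile(r'\w+\d{6}')

text = "测试201710-extra"  # hypothetical input with trailing text

prefix_match = pattern.match(text)     # anchored at the start, may stop early
anywhere = pattern.search("#" + text)  # may begin at any position
whole = pattern.fullmatch(text)        # must consume the entire string

print(prefix_match.group(0))  # 测试201710
print(anywhere.span())        # (1, 9)
print(whole)                  # None, because "-extra" is left over
```

Since all three return None on failure, it is usually worth checking the result before calling group() on it.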
find
findall
>>> pattern.findall("测试201710-测试201711-测试201712")
['测试201710', '测试201711', '测试201712']
finditer
>>> text = "He was carefully disguised but captured quickly by police."
>>> for m in re.finditer(r"\w+ly", text):
...     print('%02d-%02d: %s' % (m.start(), m.end(), m.group(0)))
07-16: carefully
40-47: quickly
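The two functions can be compared side by side. In this sketch, findall returns only the matched strings, while finditer yields match objects, so the positions are available as well. The names words and positions are illustrative.

```python
import re

text = "He was carefully disguised but captured quickly by police."

# findall: a list of matched strings, no position information.
words = re.findall(r"\w+ly", text)

# finditer: an iterator of match objects, evaluated lazily.
positions = [(m.group(0), m.start(), m.end())
             for m in re.finditer(r"\w+ly", text)]

print(words)      # ['carefully', 'quickly']
print(positions)  # [('carefully', 7, 16), ('quickly', 40, 47)]
```

For long inputs, finditer may also be preferable because it does not build the full result list up front.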
group
group
>>> m = re.match(r"(\d+)\.(\d+)", "24.1632")
>>> m.groups()
('24', '1632')
>>>
>>> m.group(0)
'24.1632'
>>> m.group(1)
'24'
>>> m.group(2)
'1632'
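A short sketch of two related conveniences, using the same pattern and input: group accepts several indices at once and returns a tuple, and span takes a group index to report where that group matched. The variable names are illustrative.

```python
import re

m = re.match(r"(\d+)\.(\d+)", "24.1632")

# Passing several indices returns the groups as a tuple.
integer_part, fraction_part = m.group(1, 2)

# span(n) gives the (start, end) position of group n in the input.
fraction_span = m.span(2)

print(integer_part, fraction_part)  # 24 1632
print(fraction_span)                # (3, 7)
```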
groupdict
>>> m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Malcolm Reynolds")
>>> m.groupdict()
{'first_name': 'Malcolm', 'last_name': 'Reynolds'}
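Named groups can also be read one at a time. The sketch below assumes the group names first_name and last_name shown in the groupdict output, and adds a None check for input that does not match. The second input string is a made-up example.

```python
import re

name_pattern = re.compile(r"(?P<first_name>\w+) (?P<last_name>\w+)")

m = name_pattern.match("Malcolm Reynolds")
by_name = m.group("first_name")  # look up a group by its name
by_index = m.group(2)            # named groups are still numbered

# Hypothetical input without a space: the pattern does not match.
unmatched = name_pattern.match("Malcolm")
fallback = unmatched.groupdict() if unmatched is not None else {}

print(by_name)   # Malcolm
print(by_index)  # Reynolds
print(fallback)  # {}
```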