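Regular expressions are one of the most important parts of any scripting language, so the exercises in this chapter deserve careful attention.
Here are my solutions to the exercises.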
>>> import re
>>> pattern='[bh][aiu]t'
>>> word='batsasasasa'
>>> m=re.search(pattern,word)
>>> if m is not None:
...     m.group()
...
'bat'
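As a quick check, the same pattern can be run against all six target words plus a few near misses. This is only a small sketch, written for Python 3; the word lists and the helper name `matches_word` are made up for illustration. It uses `re.fullmatch` so that a longer string such as `batsasasasa` is not accepted as a whole word, whereas `re.search` still finds `bat` inside it.

```python
import re

# Same character classes as in the session above, compiled once.
PATTERN = re.compile(r'[bh][aiu]t')

def matches_word(word):
    """Return True only if the entire word is bat, bit, but, hat, hit, or hut."""
    return PATTERN.fullmatch(word) is not None

# Illustrative inputs (not from the original exercise text).
targets = ['bat', 'bit', 'but', 'hat', 'hit', 'hut']
near_misses = ['bet', 'cat', 'hot', 'batsasasasa']

for w in targets + near_misses:
    print(w, matches_word(w))
```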
>>> pattern='[A-Za-z][a-z]+ [A-Za-z][a-z]+'
>>> import re
>>> pattern='([A-Z]\.)+ ?[A-Z][a-z]+'
>>> s1='J.R. Smith'
>>> s2='J.R.Smith'
>>> s3='T. Ford'
>>> re.match(pattern,s1).group()
'J.R. Smith'
>>> re.match(pattern,s2).group()
'J.R.Smith'
>>> re.match(pattern,s3).group()
'T. Ford'
A valid Python identifier must start with an underscore or a letter; the characters after that may be letters, digits, or underscores.
>>> pattern='[a-zA-Z_]\w*'
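The identifier rule can be tried out with a short helper. The sketch below is written for Python 3 and makes two assumptions that go beyond the exercise: it passes `re.ASCII` so that `\w` only covers ASCII letters, digits, and the underscore, and it additionally rejects reserved words through the standard `keyword` module. The name `is_identifier` is hypothetical.

```python
import re
import keyword

# A letter or underscore first, then zero or more word characters.
# re.ASCII keeps \w limited to [A-Za-z0-9_] (an assumption for this sketch).
IDENTIFIER = re.compile(r'[A-Za-z_]\w*', re.ASCII)

def is_identifier(name):
    """Check the lexical rule, then reject reserved words such as 'for'."""
    if IDENTIFIER.fullmatch(name) is None:
        return False
    return not keyword.iskeyword(name)

for candidate in ['x', '_private1', '1abc', 'a-b', 'for']:
    print(candidate, is_identifier(candidate))
```

Note that a one-character name such as `x` is accepted, which is why the tail of the pattern uses `*` rather than `+`.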
>>> pattern='\d+ [A-Za-z ]+'
>>> pattern='w{3}[.\w]+\.com'
Extra credit: support other top-level domains:
>>> pattern='w{3}[.\w]+'
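A stricter variant of the extra-credit pattern could name the accepted suffixes explicitly. The following is a rough sketch in Python 3, not the book's answer: the suffix list is borrowed from the `doms` tuple in the data generator further down, and `is_web_domain` is a made-up helper name.

```python
import re

# Suffixes to accept; extend the tuple to support more top-level domains.
TLDS = ('com', 'edu', 'net', 'org', 'gov')

# 'www.', then one or more dot-terminated labels, then one of the suffixes.
DOMAIN = re.compile(r'www\.(?:[\w-]+\.)+(?:' + '|'.join(TLDS) + r')')

def is_web_domain(text):
    """Return True if the whole string looks like www.<labels>.<known suffix>."""
    return DOMAIN.fullmatch(text) is not None

for name in ['www.yahoo.com', 'www.ucsc.edu', 'www.example.co.uk', 'www.com']:
    print(name, is_web_domain(name))
```

Escaping the dots matters here: an unescaped `.` would also accept strings like `wwwxyahoo.com`.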
>>> pattern='\d+[Ll]?'
>>> pattern='\d+[Ll]'
>>> pattern='\d+\.\d+'
>>> pattern='\d+\.?\d*[+-]\d+\.?\d*j'
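To see how a complex-number pattern behaves on short inputs such as `1+2j`, it helps to capture the parts and rebuild the value. This is a minimal sketch in Python 3 under my own assumptions: it accepts either `+` or `-` between the parts, allows the fractional digits to be absent, and the helper name `parse_complex` is hypothetical.

```python
import re

# real part, sign, imaginary part, trailing 'j'.
# \d+\.?\d* allows '1', '1.' and '1.5', so single-digit parts are accepted.
COMPLEX = re.compile(r'(\d+\.?\d*)([+-])(\d+\.?\d*)j')

def parse_complex(text):
    """Return a complex value for strings like '1+2j', or None if no match."""
    m = COMPLEX.fullmatch(text)
    if m is None:
        return None
    real, sign, imag = m.groups()
    value = complex(float(real), float(imag))
    # A '-' sign flips the imaginary part.
    return value if sign == '+' else value.conjugate()

for text in ['1+2j', '3.5-4.25j', '12.34+56.78j', '1+2']:
    print(text, parse_complex(text))
```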
>>> pattern='\w+@[\w.]+'
>>> pattern="<type '(\w+)'>"
>>> re.match(pattern,"<type 'int'>").group()
"<type 'int'>"
>>> re.match(pattern,"<type 'int'>").group(1)
'int'
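The same idea can be applied to live objects instead of a hard-coded string. The sketch below is written for Python 3, where `str(type(1))` reads `<class 'int'>` rather than `<type 'int'>`, so the pattern accepts both spellings; that choice, the dotted-name support, and the helper name `type_name` are my own assumptions.

```python
import re

# Accept the Python 2 form <type 'int'> and the Python 3 form <class 'int'>.
# [\w.]+ also covers dotted names such as collections.OrderedDict.
TYPE_REPR = re.compile(r"<(?:type|class) '([\w.]+)'>")

def type_name(obj):
    """Extract the type name from the string form of type(obj)."""
    m = TYPE_REPR.match(str(type(obj)))
    return m.group(1) if m else None

for value in [1, 1.5, 'text', [1, 2]]:
    print(repr(value), type_name(value))
```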
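
#!/usr/bin/env python
# Generate 5-10 lines of random test data and append them to a file.
# Each line looks like: <timestamp>::<login>@<domain>.<tld>::<seconds>-<len>-<len>
from random import randint, choice
from string import lowercase
from time import ctime

doms = ('com', 'edu', 'net', 'org', 'gov')
g = open('/home/dzhwen/456.txt', 'a+')
for i in range(randint(5, 10)):
    # Capped at 2**31 - 1: on 64-bit builds sys.maxint is too large for ctime().
    dtint = randint(0, 2**31 - 1)
    dtstr = ctime(dtint)
    shorter = randint(4, 7)         # length of the login name
    em = ''
    for j in range(shorter):
        em += choice(lowercase)
    longer = randint(shorter, 12)   # length of the domain name
    dn = ''
    for j in range(longer):
        dn += choice(lowercase)
    word = dtstr+'::'+em+'@'+dn+'.'+choice(doms)+'::'+str(dtint)+'-'+str(shorter)+'-'+str(longer)+'\n'
    g.write(word)
g.close()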

#!/usr/bin/env python
# Extract the timestamp (everything before the first '::') from each line.
import re

f = open('/home/dzhwen/456.txt', 'r')
pattern = '(.+?)::.+'
for eachLine in f:
    m = re.match(pattern, eachLine)
    print m.group(1)
f.close()
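
#!/usr/bin/env python
# Extract the complete e-mail address from each line.
import re

f = open('/home/dzhwen/456.txt', 'r')
# The dot before the top-level domain is escaped so it matches literally.
pattern = '.+::(\w+@\w+\.\w+)::.+'
for eachLine in f:
    m = re.match(pattern, eachLine)
    print m.group(1)
f.close()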

#!/usr/bin/env python
# Extract the month (the second three-letter word of the timestamp).
import re

f = open('/home/dzhwen/456.txt', 'r')
pattern = '\w{3} (\w{3}).+'
for eachLine in f:
    m = re.match(pattern, eachLine)
    print m.group(1)
f.close()

#!/usr/bin/env python
# Extract the year (the first run of four digits on the line).
import re

f = open('/home/dzhwen/456.txt', 'r')
pattern = '.+?(\d{4}).+'
for eachLine in f:
    m = re.match(pattern, eachLine)
    print m.group(1)
f.close()

#!/usr/bin/env python
# Extract the time of day (HH:MM:SS) from each line.
import re

f = open('/home/dzhwen/456.txt', 'r')
pattern = '.+(\d{2}:\d{2}:\d{2}).+'
for eachLine in f:
    m = re.match(pattern, eachLine)
    print m.group(1)
f.close()
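
#!/usr/bin/env python
# Extract the login name and the domain name from the e-mail address.
import re

f = open('/home/dzhwen/456.txt', 'r')
# Non-greedy matches stop at the nearest '::' instead of relying on backtracking.
pattern = '.+?::(\w+)@(.+?)::.+'
for eachLine in f:
    m = re.match(pattern, eachLine)
    print m.group(1), m.group(2)
f.close()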
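
Instead of one script per field, a single pattern with named groups can pull every field out of a line in one pass. This is a sketch in Python 3, not part of the original solutions: the group names, the helper `parse_line`, and the sample line are all made up for illustration, and the layout assumed is the `ctime()` timestamp, `::`, the address, `::`, and the three dash-separated numbers written by the generator script.

```python
import re

# One named group per field of a generated line.
# ' +' before the day allows for ctime() padding single-digit days with a space.
LINE = re.compile(
    r'(?P<weekday>\w{3}) (?P<month>\w{3}) +(?P<day>\d{1,2}) '
    r'(?P<time>\d{2}:\d{2}:\d{2}) (?P<year>\d{4})'
    r'::(?P<login>\w+)@(?P<domain>[\w.]+)'
    r'::(?P<stamp>\d+)-(?P<login_len>\d+)-(?P<domain_len>\d+)'
)

def parse_line(line):
    """Return a dict of field name -> text, or None if the line does not match."""
    m = LINE.match(line)
    return m.groupdict() if m else None

# Illustrative sample line in the generator's format.
sample = 'Thu Jul 22 19:21:19 2004::izsp@dicqdhytvhv.edu::1090549279-4-11'
fields = parse_line(sample)
print(fields['month'], fields['year'], fields['time'])
print(fields['login'], fields['domain'])
```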
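
#!/usr/bin/env python
#-*-coding:utf-8-*-
# Replace the e-mail address on each line with your own.
import re

f = open('/home/dzhwen/456.txt', 'r')
pattern = '.+?::(.+?)::.+'
# Ask for the address once, not once per line.
address = raw_input('Enter your e-mail address: ')
for eachLine in f:
    m = re.match(pattern, eachLine)
    # Escape the old address so that '.' matches literally;
    # subn returns a (new_line, number_of_replacements) tuple.
    print re.subn(re.escape(m.group(1)), address, eachLine)
f.close()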
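
One detail worth spelling out in the replacement step: the old address is used as a pattern, so its `.` would act as a wildcard unless it is escaped. Here is a small sketch in Python 3 of how that could be handled; the function name `replace_email`, the sample line, and the address `me@example.org` are hypothetical.

```python
import re

def replace_email(line, new_address):
    """Swap the middle '::'-delimited field (the e-mail address) for new_address."""
    m = re.match(r'.+?::(.+?)::.+', line)
    if m is None:
        return line  # leave lines in an unexpected format untouched
    # re.escape stops '.' in the old address from acting as a wildcard.
    # Passing a function as the replacement avoids backslash processing
    # in new_address.
    return re.sub(re.escape(m.group(1)), lambda _: new_address, line, count=1)

# Illustrative sample line in the generator's format.
sample = 'Thu Jul 22 19:21:19 2004::izsp@dicqdhytvhv.edu::1090549279-4-11'
print(replace_email(sample, 'me@example.org'))
```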