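Lately I keep needing regular expressions when working with strings, so it is time to learn them properly; they are a powerful tool for string handling. My first exercise was a very simple one: stripping the comments out of JSON. Whenever I copy and paste a snippet it comes with comments, which makes it unusable as JSON, and deleting them one by one is ridiculous. I am not doing that by hand; a few lines of Python take care of it.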
The JSON text before processing:
{
    "Result":true, //indicates whether the request succeeded
    /* I am shuai */
    "Detail": {
        "Title":"图书馆召开2014年度电子文献资源议标(询价)会", //title
        "Publisher":"图书馆", //publisher
        "Date":"2014-6-25", //publication date
        "Passage":"<p><font> 为了N……" //body text
        "yuan":"http://de/maxiaya" //wewewwewew
    }
}
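#coding:utf-8
import re

def dealAnotation():
    # First alternative: a // comment running to the end of the line.
    # The leading [^:] keeps the "//" in URLs such as "http://..." intact.
    # Second alternative: a /* ... */ comment contained in a single line.
    pattern = re.compile(r'([^:]//.*"?$)|(/\*(.*?)\*/)')
    with open('jsonAnnotation.txt', 'r') as fp1, \
         open('dealjsonAnnotation.txt', 'w') as fp2:
        for line in fp1:
            fp2.write(pattern.sub('', line))

if __name__ == '__main__':
    dealAnotation()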
{
    "Result":true,
    "Detail": {
        "Title":"图书馆召开2014年度电子文献资源议标(询价)会",
        "Publisher":"图书馆",
        "Date":"2014-6-25",
        "Passage":"<p><font> 为了N……"
        "yuan":"http://de/maxiaya"
    }
}
Pretty convenient, isn't it?
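For snippets that are already in memory, the same idea can be sketched as a function that works on a string instead of on files. This is only a sketch: the names `COMMENT_RE`, `strip_json_comments`, and `sample` are made up for illustration, and the pattern is the one from the script, compiled with `re.MULTILINE` so that `$` matches at the end of every line. It assumes that block comments fit on one line and that no string value contains `//` after a character other than `:`; text that breaks either assumption would be handled incorrectly.

```python
import json
import re

# Same pattern as the file-based script; re.MULTILINE makes "$" match
# at the end of each line of a multi-line string.
COMMENT_RE = re.compile(r'([^:]//.*"?$)|(/\*(.*?)\*/)', re.MULTILINE)

def strip_json_comments(text):
    """Remove // line comments and single-line /* ... */ comments."""
    return COMMENT_RE.sub('', text)

# Hypothetical sample, shortened from the example data.
sample = '''{
    "Result":true, //request status
    /* I am shuai */
    "yuan":"http://de/maxiaya" //wewewwewew
}'''

cleaned = strip_json_comments(sample)
print(cleaned)

# With the comments gone, the standard json module can parse the text.
parsed = json.loads(cleaned)
print(parsed["yuan"])
```

The URL survives because the `//` in it follows a colon, which `[^:]` refuses to match.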
Here is a bit more Python regular expression code, for practice:
#coding:utf-8
import re

# search() finds the first match anywhere in the string
strA = 'yuan520'
print(re.search(r'\d+', strA).group())

strB = 'abatksjdabut'
print(re.search(r'b[aeiu]t', strB).group())

# match() only matches at the beginning of the string
strC = 'I love you'
print(re.match(r'I|love|bit', strC).group())

strD = 'nobody@xxx.com'
strE = 'nobody@www.xxx.com'
print(re.match(r'\w+@(\w+\.)?\w+\.com', strD).group())
print(re.match(r'\w+@(\w+\.)?\w+\.com', strE).group())

strF = 'the end.'
m = re.search(r'^the', strF)
if m is not None:
    print(m.group())

m = re.search(r'\bthe', 'bite the dog')
if m is not None:
    print(m.group())

data = 'The Feb 15 17:46:04 2007::uzifzf@dpyiviinhw.gov::11171590364-6-8'
m = re.match(r'.*(\d-\d-\d)', data)
print(m.group())
print(m.group(1))
print(m.groups())

data = 'The Feb 15 17:46:04 2007::uzifzf@dpyiviinhw.gov::11171590364-6-8'
m = re.search(r'\d+-\d-\d', data)
print(m.group())
The output:
520
bat
I
nobody@xxx.com
nobody@www.xxx.com
the
the
The Feb 15 17:46:04 2007::uzifzf@dpyiviinhw.gov::11171590364-6-8
4-6-8
('4-6-8',)
11171590364-6-8
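The `4-6-8` result is worth a second look: the group in `.*(\d-\d-\d)` captures only the tail of the number because the greedy `.*` swallows as much as it can first. A small sketch of the difference, reusing the same `data` string (the variable names `greedy` and `lazy` are just for illustration):

```python
import re

data = 'The Feb 15 17:46:04 2007::uzifzf@dpyiviinhw.gov::11171590364-6-8'

# Greedy ".*" takes as much as possible and gives back only what the
# group strictly needs, so the group ends up with the last "d-d-d".
greedy = re.match(r'.*(\d-\d-\d)', data)
print(greedy.group(1))

# Lazy ".*?" takes as little as possible; combined with "\d+" the
# group can start at the first digit of the long number.
lazy = re.match(r'.*?(\d+-\d-\d)', data)
print(lazy.group(1))
```

The first print gives `4-6-8` and the second gives `11171590364-6-8`, the same value that `re.search` with `\d+-\d-\d` finds in the practice code.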