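1. re.findall() vs. re.search()
The test code is as follows:
findall returns a list, while search returns a match object (shown as _sre.SRE_Match in Python 3.6 and earlier, re.Match from Python 3.7 on), or None if nothing matches.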
print(urls):
['{\\"count\\":18,\\"sub_images\\":[{\\"url\\":\\"http:\\\\/\\\\/p1.pstatp.com\\\\/origin\\\\/2ebe000042c272cd8ca4\\",\\"width\\":650,\\"url_list\\":[{\\"url\\":\\"http:\\\\/\\\\/p1.pstatp.com\\\\/origin\\\\/2ebe000042c272cd8ca4\\"},{\\"url\\":\\"http:\\\\/\\\\/pb3.pstatp.com\\\\/origin\\\\/2ebe000042c272cd8ca4\\"},{\\"url\\":\\"http:\\\\/\\\\/pb9.pstatp.com\\\\/origin\\\\/2ebe000042c272cd8ca4\\"}],\\"uri\\":\\"origin\\\\/2ebe000042c272cd8ca4\\",\\"height\\":975},{...},...'] (a list)
print('urls1:',urls1):
urls1: <_sre.SRE_Match object; span=(4884, 12806), match='gallery: JSON.parse("{\\"count\\":18,\\"sub_image>
To extract the captured text from the search result, call group(1): urls1 = re.search(pattern, response.text).group(1)
print(urls1):
{\"count\":18,\"sub_images\":[{\"url\":\"http:\\/\\/p1.pstatp.com\\/origin\\/2ebe000042c272cd8ca4\",\"width\":650,\"url_list\":[{\"url\":\"http:\\/\\/p1.pstatp.com\\/origin\\/2ebe000042c272cd8ca4\"},{\"url\":\"http:\\/\\/pb3.pstatp.com\\/origin\\/2ebe000042c272cd8ca4\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/2ebe000042c272cd8ca4\"}],\"uri\":\"origin\\/2ebe000042c272cd8ca4\",\"height\":975},{...},...,}
print(type(urls1)):
<class 'str'>
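The difference between the two calls can be reproduced with a minimal sketch like the one below. The html string and the pattern are assumptions made up for illustration (a short stand-in for response.text and a guess at the kind of pattern that would produce the output above); they are not the original test code.

```python
import re

# Hypothetical stand-in for response.text; the real page is much longer.
html = 'var x = 1; gallery: JSON.parse("{\\"count\\":18}"), siblingList: []'

# Assumed pattern with one capture group around the JSON.parse argument.
pattern = re.compile(r'gallery: JSON\.parse\("(.*?)"\),', re.S)

urls = pattern.findall(html)   # list of the captured strings
match = pattern.search(html)   # match object, or None when nothing matches

# Guard against None before calling group(); group(1) is the first capture.
urls1 = match.group(1) if match is not None else None

print(type(urls))    # <class 'list'>
print(type(match))   # the match object type (name depends on Python version)
print(type(urls1))   # <class 'str'>
print(urls1)
```

Checking for None before calling group(1) avoids an AttributeError when the page layout changes and the pattern no longer matches.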
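2. Removing the '\' characters:
Use the replace method:
replace operates on a str. If findall was used, first join the list into a single comma-separated str with ",".join(), then call replace.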
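A minimal sketch of the join-then-replace step, assuming a findall result with a single shortened element (the sample data here is hypothetical and cut down from the output in section 1). The final json.loads call is an extra step not mentioned above; it only works when the joined text is a single valid JSON document, which holds here because the list has one element.

```python
import json

# Hypothetical findall result: one escaped JSON string, written as a raw
# string so the backslashes appear exactly as they do in the page source.
urls = [r'{\"count\":18,\"url\":\"http:\\/\\/p1.pstatp.com\\/origin\\/2ebe000042c272cd8ca4\"}']

# findall returns a list, so join it into one str before calling replace.
text = ",".join(urls)

# Remove every backslash; "\\" in Python source is a single backslash.
cleaned = text.replace("\\", "")
print(cleaned)

# Optional (assumption): parse the cleaned text as JSON.
data = json.loads(cleaned)
print(data["url"])
```

If re.search(...).group(1) is used instead of findall, the result is already a str and replace can be called on it directly, without the join.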
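3. Starting MongoDB
MongoDB was installed with Homebrew; the following configuration commands have to be run first before it will start normally.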