Scraping images from multiple web pages in sequence with Python and saving each page's images to a separate folder
Author: vpoet
mail:[email protected]
#coding:utf-8
# Python 2 script: urllib2 and the print statement do not exist in Python 3.
import os
import re
import urllib
import urllib2

# Regex pattern matching the URLs of the .jpg images embedded in the post
rex = r'src="(http://imgsrc.baidu.com/forum/w%3D580.*?\.jpg)"'
pages = ('1', '2')
# Root folder; one subfolder (pic1, pic2) is used per page
savedir = r'C:\Users\Administrator\Desktop'

for page in pages:
    pageurl = "http://tieba.baidu.com/p/3710495592?pn=" + page
    Response = urllib2.urlopen(pageurl)
    Html = Response.read()
    lists = re.findall(rex, Html)
    lensofpage = len(lists)
    print lensofpage

    picname = 'pic' + page
    print picname
    # Create this page's folder if it does not exist yet;
    # urlretrieve fails when the target folder is missing
    folder = os.path.join(savedir, picname)
    if not os.path.isdir(folder):
        os.makedirs(folder)
    x = 1
    for picurl in lists:
        urllib.urlretrieve(picurl, os.path.join(folder, '%s.jpg' % x))
        print page + picurl
        x = x + 1

print 'DownLoadPicOver'
# Image save paths: C:\Users\Administrator\Desktop\pic1
#                   C:\Users\Administrator\Desktop\pic2
# Test URLs: http://tieba.baidu.com/p/3710495592?pn=1
#            http://tieba.baidu.com/p/3710495592?pn=2
Screenshot of the run:
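The script above targets Python 2. For readers on Python 3, the same idea (find the image URLs with a regex, then save each page's images under its own picN folder) could be sketched roughly as follows. This is only a minimal sketch, not the author's code: the function names extract_image_urls, target_path and download_pages are hypothetical, and decoding the page as UTF-8 is an assumption about the site.

```python
import re
import urllib.request
from pathlib import Path

# Same pattern as the original script, compiled once up front.
IMG_PATTERN = re.compile(r'src="(http://imgsrc.baidu.com/forum/w%3D580.*?\.jpg)"')


def extract_image_urls(html):
    """Return every image URL in the page source that matches IMG_PATTERN."""
    return IMG_PATTERN.findall(html)


def target_path(root, page, index):
    """Build <root>/pic<page>/<index>.jpg, the layout used by the original script."""
    return Path(root) / ("pic" + str(page)) / (str(index) + ".jpg")


def download_pages(base_url, pages, root):
    """Fetch each page and save its images into a separate folder per page."""
    for page in pages:
        with urllib.request.urlopen(base_url + str(page)) as response:
            # Assumption: the page is UTF-8; undecodable bytes are dropped.
            html = response.read().decode("utf-8", errors="ignore")
        for index, url in enumerate(extract_image_urls(html), start=1):
            dest = target_path(root, page, index)
            # Create the per-page folder on first use.
            dest.parent.mkdir(parents=True, exist_ok=True)
            urllib.request.urlretrieve(url, str(dest))
            print(page, url)
```

Calling download_pages("http://tieba.baidu.com/p/3710495592?pn=", ("1", "2"), "downloads") would then try to fetch both test pages and write into downloads/pic1 and downloads/pic2, where "downloads" is just a placeholder root folder. Keeping the URL extraction and the path building in small separate functions makes them easy to check without touching the network.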