Batch-downloading images with a Python crawler

Using Python's urllib library and regular expressions, this script crawls images from http://pic.netbian.com/ and supports batch downloading.
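The core technique, fetching a page and pulling image links out of the HTML with a regular expression, can be sketched offline on a toy fragment. The HTML snippet and the pattern below are illustrative stand-ins, not the site's actual markup:

```python
import re

# A toy HTML fragment standing in for a downloaded listing page.
sample_html = '''
<ul class="clearfix">
  <li><a href="/tupian/1.html"><img src="/uploads/a.jpg" alt="photo one"></a></li>
  <li><a href="/tupian/2.html"><img src="/uploads/b.jpg" alt="photo two"></a></li>
</ul>
'''

# Non-greedy groups capture each (thumbnail path, title) pair.
matches = re.findall(r'<img src="(.+?)" alt="(.+?)">', sample_html)
for src, alt in matches:
    print(alt, '->', 'http://pic.netbian.com' + src)
```

The real script does the same thing, except the HTML comes from `urllib.request.urlopen` instead of a string literal.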

1. You can pick the image category and choose which pages to download.


2. Source code

"""
    功能:批量下载网站图片
    时间:2019-5-18 16:14:01
    作者:倚窗听雨
"""
import urllib.request
import re
import os

headers = {
    # The long string below is cookie data (session and login state), so it is
    # sent as the Cookie header; a browser User-Agent is supplied alongside it
    # so the site serves normal pages.
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Cookie": "yjs_id=0c19bb42793fd5c3c8c228f50173ca19; Hm_lvt_14b14198b6e26157b7eba06b390ab763=1529398363; __cfduid=de4421757fb00c0120063c1dbd0308e511558166501; ctrl_time=1; Hm_lvt_526caf4e20c21f06a4e9209712d6a20e=1558166930; zkhanecookieclassrecord=%2C66%2C; PHPSESSID=6bee044771973aba4aa5b989f1a3722d; zkhanmlusername=qq_%B7%E7%BE%ED%B2%D0%D4%C6186; zkhanmluserid=1380421; zkhanmlgroupid=1; zkhanmlrnd=4g6arf9678I6URIitCJ2; zkhanmlauth=1b4f3205834ec05575588adc84c0eb52; zkhandownid24031=1; Hm_lpvt_526caf4e20c21f06a4e9209712d6a20e=1558167143; security_session_verify=eb4cb5687c224e77089c64935fa16df8",
}

ur = "http://pic.netbian.com"
url_list = []

# NOTE: every regex literal in the original listing was swallowed when the blog
# rendered the code as HTML, so the patterns below are reconstructions of what
# the page markup requires, not the author's exact originals.

# Collect the category URLs from the site's navigation bar
def picture(url):
    res = urllib.request.Request(url, data=None, headers=headers)
    # the site is GBK-encoded ('gbk' is a superset of the original gb2312)
    html = urllib.request.urlopen(res).read().decode('gbk')
    # category links look like <a href="/4kfengjing/">4K风景</a>
    pic = re.findall(r'<a href="(/4k[a-z]+/)">(.+?)</a>', html)
    for link, name in pic:
        url_list.append((name, ur + link))
    for i, (name, _) in enumerate(url_list):
        print(i, name)
    choice = int(input('选择图片类型:'))
    return url_list[choice][1]

# Download every image from the chosen pages of one category
def download(url):
    res = urllib.request.Request(url, headers=headers)
    html = urllib.request.urlopen(res).read().decode('gbk')
    # last page number in the pagination bar, e.g. "index_158.html"
    page = re.findall(r'index_(\d+)\.html', html)[-1]
    print('共', page, '页')
    start_page = int(input('下载起始页:'))
    end_page = int(input('下载到哪页:'))
    total = 0
    for p in range(start_page, end_page + 1):
        if p == 1:
            url2 = url + 'index' + '.html'
        else:
            url2 = url + 'index_' + str(p) + '.html'
        print('\n', url2)
        res2 = urllib.request.Request(url2, headers=headers)
        html2 = urllib.request.urlopen(res2).read().decode('gbk')
        # (thumbnail path, title) for each picture on the listing page
        texts = re.findall(r'<img src="(/uploads/.+?)" alt="(.+?)"', html2)
        os.makedirs('img', exist_ok=True)
        for src, name in texts:
            req = urllib.request.Request(ur + src, headers=headers)
            data = urllib.request.urlopen(req).read()
            with open(os.path.join('img', name + '.jpg'), 'wb') as f:
                f.write(data)
            total += 1
            print('已下载:', name)
    print('共下载', total, '张图片')

download(picture(ur))

The save location can be changed as needed; right now images are saved to the img directory under the current working directory.
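Changing the save location only takes editing one variable. A minimal sketch, where the byte string is a stand-in for what `urllib.request.urlopen(...).read()` would return:

```python
import os

save_dir = 'img'  # change this line to store images somewhere else
os.makedirs(save_dir, exist_ok=True)  # no error if the directory already exists

# Stand-in bytes for real downloaded image data.
data = b'\xff\xd8\xff\xe0fake-jpeg-bytes'
path = os.path.join(save_dir, 'example.jpg')
with open(path, 'wb') as f:
    f.write(data)
print('saved to', path)
```

Using `os.path.join` instead of string concatenation keeps the path correct on both Windows and Linux.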


Disclaimer: this was my first attempt at writing a crawler, so it still has plenty of shortcomings; experts, please go easy on me.
