Batch-downloading images to local disk with a Python 3 crawler

While reading the news I stumbled on an image gallery site, so of course I had to scrape it.

Site: https://www.0xu.cn/
It is easy to see that the path qcmn corresponds to the 清纯美女 category.

Right-click an image and inspect it to see its address.
Opening that address directly returns the image.

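Getting started
Step 1: request the page and analyze the response to extract the image URLs.

Inspecting the page shows that the first image set under qcmn corresponds to path 3807.

In the browser's developer tools, open the Network tab and search for the matching request.
The response turns out to be JSON.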
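The request found this way is a list endpoint that takes page and category as query parameters (it is the URL that the code in this post requests). As a minimal sketch, the URL can be assembled with the standard library's urlencode instead of string concatenation. The names build_list_url and LIST_ENDPOINT are made up for this illustration.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the request URL used in this post
LIST_ENDPOINT = 'https://www.0xu.cn/gallery/list'


def build_list_url(page, category):
    """Return the list URL for one page of one category (hypothetical helper)."""
    query = urlencode({'page': page, 'category': category})
    return LIST_ENDPOINT + '?' + query


print(build_list_url(1, 'qcmn'))
```

Keeping the page number and the category as function arguments makes it easy to request other pages or categories later.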

1. First, write a function that fetches the URL

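import requests

page = 1
path = 'qcmn'
# Category page in the browser: https://www.0xu.cn/gallery/qcmn
# JSON list endpoint found in the Network tab:
url = 'https://www.0xu.cn/gallery/list?page=' + str(page) + '&category=' + path

def get_html(url):
    # Request the list endpoint and print the raw response body
    html = requests.get(url)
    html = html.text
    print(html)


get_html(url)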

Output:

{"response":{"status":1,"message":"请求成功","data":{"page":1,"page_size":10,"totalPage":56,"list":[{"id":3807,"uid":"385c1458d707bd373cc47048f7f5be22","title":"林若:又回到这座灯火阑珊的旧城,只是再也没有你陪我的黄昏。","summary":"情绪低落时,现实的你,网络上的你,一个假装快乐,一个真心难过。","state":0,"browse":17,"created":"2020-12-22T03:03:08+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create_time":1516856296,"flag":0,"description":""},"pictures":[{"id":9171,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/1a2ebb29df8634ea18fab0438aef067c.jpg"},{"id":9172,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/48260abe57dabd163a824eb0de8f540d.jpg"}],"tags":null},{"id":3806,"uid":"385c1458d707bd373cc47048f7f5be22","title":"林若:在某个时刻(存照) ​​​​","summary":"人总是这样,终于到了懂得珍惜的年纪,却偏偏什么都走散了。","state":0,"browse":9,"created":"2020-12-22T03:02:23+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create_time":1516856296,"flag":0,"description":""},"pictures":[{"id":9170,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/182ea3276e49c000284f914dfe5ee4de.jpg"}],"tags":null},{"id":3805,"uid":"385c1458d707bd373cc47048f7f5be22","title":"林若:他知道你舍不得离开才肆无忌惮的伤害。","summary":"这一生,这一世,因为不再有你,所以爱情轰然老去。","state":0,"browse":5,"created":"2020-12-22T03:01:50+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create
_time":1516856296,"flag":0,"description":""},"pictures":[{"id":9169,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/09b79f16080a773eb25b207c76c2eca6.jpg"}],"tags":null},{"id":3804,"uid":"385c1458d707bd373cc47048f7f5be22","title":"林若Abby:年头觉得自己还蛮好看的 年尾怎么就长残了","summary":"一个人躲在角落里,悄悄的落泪,因为没有人值得我倾诉。","state":0,"browse":6,"created":"2020-12-22T03:01:22+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create_time":1516856296,"flag":0,"description":""},"pictures":[{"id":9168,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/8e190cb74b4b0f0ba526935b17aee9d8.jpg"}],"tags":null},{"id":3803,"uid":"385c1458d707bd373cc47048f7f5be22","title":"林若:怀念半年前傻傻憨憨 快快乐乐的自己 ​​​​","summary":"有裂痕的愛怎麽重蓋、悲傷要怎麽平靜純白。","state":0,"browse":7,"created":"2020-12-22T03:00:07+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create_time":1516856296,"flag":0,"description":""},"pictures":[{"id":9167,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/91b162607c9d204916f3d0d803ffcb75.jpg"}],"tags":null},{"id":3800,"uid":"385c1458d707bd373cc47048f7f5be22","title":"杏子大人:情书也撕了,酒杯也碎了,别担心,你走吧,我不爱你了","summary":"很多事情没有来日方长,很多人都只会乍然离场。一组伤感的个性签名送给你们,希望伤心难过时能带给你一丝慰藉。","state":0,"browse":7,"created":"2020-12-22T02:57:09+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","cr
eate_time":1516856296,"flag":0,"description":""},"pictures":[{"id":9164,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/b7ef615e0c03c602095d0092d370ba5c.jpg"}],"tags":null},{"id":3799,"uid":"385c1458d707bd373cc47048f7f5be22","title":"杏子大人:微笑就像创可贴,掩饰了伤口,痛还在。","summary":"你的名字,写下来不过几厘米那么短,却贯穿了我那么长的时光。","state":0,"browse":9,"created":"2020-12-22T02:56:24+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create_time":1516856296,"flag":0,"description":""},"pictures":[{"id":9162,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7a04b64835cdb0e8a666caec25bb64d3.jpg"},{"id":9163,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/74af547d45e9e329497cd10272608686.jpg"}],"tags":null},{"id":3793,"uid":"385c1458d707bd373cc47048f7f5be22","title":"杏子大人:更可笑的是我瞒着所有人继续爱了你好久。","summary":"当身边结婚的朋友越来越多,当朋友圈晒娃的越来越多,有时候想想,大概是等不到你了。","state":0,"browse":2,"created":"2020-12-22T02:51:21+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create_time":1516856296,"flag":0,"description":""},"pictures":[{"id":9147,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7000fb6c54306bc2cd46bc593582fe70.jpg"},{"id":9148,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7e59bcd7d53ab6fe7dfc11cc58d7cb51.jpg"},{"id":9149,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/ffa75d59348399f5495fbb98eae8d12d.jpg"}],"tags":null},{"id":3792,"uid":"385c1458d707bd373cc47048f7f5be22","title":"杏子大人:不敢再炫耀身边有谁,害怕你突然间离开让我尴尬。","summary":"可不可以不要靠
近我,了解我,心疼我,然后再离开我。","state":0,"browse":3,"created":"2020-12-22T02:50:06+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create_time":1516856296,"flag":0,"description":""},"pictures":[{"id":9145,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7a0c2d7c44d85a7beb9d0e553c051d2d.jpg"},{"id":9146,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/6fd1dc7a272476d1dafba1d266e65c50.jpg"}],"tags":null},{"id":3791,"uid":"385c1458d707bd373cc47048f7f5be22","title":"就是LYN:时光让我们相聚,时光却也让我们分离。","summary":"我走不进你的心,写不出你的梦,不管我付出多少都是个外人。","state":0,"browse":5,"created":"2020-12-22T02:49:14+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create_time":1516856296,"flag":0,"description":""},"pictures":[{"id":9143,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/f863993340a222f2911d8c039ae1f58a.jpg"}],"tags":null}]}}}

This is JSON. To make it easier to read, paste it into an online JSON viewer.
Everything we need is inside data.
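As an alternative to an online viewer, the response can be re-indented locally with the json module. This is only a sketch: pretty_json is a hypothetical helper, and the sample string is a shortened stand-in with the same outer structure as the real response.

```python
import json


def pretty_json(text):
    """Re-indent a JSON string; ensure_ascii=False keeps Chinese text readable."""
    return json.dumps(json.loads(text), indent=2, ensure_ascii=False)


# Shortened stand-in for the real response (same outer keys, empty list)
sample = '{"response":{"status":1,"message":"请求成功","data":{"page":1,"list":[]}}}'
print(pretty_json(sample))
```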
Next, cut away the wrapper at the head and tail to get the part we need, and convert the JSON into a Python dictionary.

import requests
import json

page = 1
path = 'qcmn'
url = 'https://www.0xu.cn/gallery/list?page=' + str(page) + '&category=' + path

def get_html(url):
    html = requests.get(url)
    html = html.text
    # Cut off the head and tail around "data"
    # (this relies on the exact prefix below and on the closing braces)
    html = html.replace('{"response":{"status":1,"message":"请求成功","data":', '').replace('}}', '')
    # Convert the JSON text into a Python dictionary
    html = json.loads(html)
    print(type(html))
    print(html)


get_html(url)

Output:

{'page': 1, 'page_size': 10, 'totalPage': 56, 'list': [{'id': 3807, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若:又回到这座灯火阑珊的旧城,只是再也没有你陪我的黄昏。', 'summary': '情绪低落时,现实的你,网络上的你,一个假装快乐,一个真心难过。', 'state': 0, 'browse': 17, 'created': '2020-12-22T03:03:08+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9171, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/1a2ebb29df8634ea18fab0438aef067c.jpg'}, {'id': 9172, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/48260abe57dabd163a824eb0de8f540d.jpg'}], 'tags': None}, {'id': 3806, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若:在某个时刻(存照) \u200b\u200b\u200b\u200b', 'summary': '人总是这样,终于到了懂得珍惜的年纪,却偏偏什么都走散了。', 'state': 0, 'browse': 9, 'created': '2020-12-22T03:02:23+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9170, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/182ea3276e49c000284f914dfe5ee4de.jpg'}], 'tags': None}, {'id': 3805, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若:他知道你舍不得离开才肆无忌惮的伤害。', 'summary': '这一生,这一世,因为不再有你,所以爱情轰然老去。', 'state': 0, 'browse': 5, 'created': '2020-12-22T03:01:50+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 
'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9169, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/09b79f16080a773eb25b207c76c2eca6.jpg'}], 'tags': None}, {'id': 3804, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若Abby:年头觉得自己还蛮好看的 年尾怎么就长残了', 'summary': '一个人躲在角落里,悄悄的落泪,因为没有人值得我倾诉。', 'state': 0, 'browse': 6, 'created': '2020-12-22T03:01:22+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9168, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/8e190cb74b4b0f0ba526935b17aee9d8.jpg'}], 'tags': None}, {'id': 3803, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若:怀念半年前傻傻憨憨 快快乐乐的自己 \u200b\u200b\u200b\u200b', 'summary': '有裂痕的愛怎麽重蓋、悲傷要怎麽平靜純白。', 'state': 0, 'browse': 7, 'created': '2020-12-22T03:00:07+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9167, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/91b162607c9d204916f3d0d803ffcb75.jpg'}], 'tags': None}, {'id': 3800, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '杏子大人:情书也撕了,酒杯也碎了,别担心,你走吧,我不爱你了', 'summary': '很多事情没有来日方长,很多人都只会乍然离场。一组伤感的个性签名送给你们,希望伤心难过时能带给你一丝慰藉。', 'state': 0, 'browse': 7, 'created': 
'2020-12-22T02:57:09+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9164, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/b7ef615e0c03c602095d0092d370ba5c.jpg'}], 'tags': None}, {'id': 3799, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '杏子大人:微笑就像创可贴,掩饰了伤口,痛还在。', 'summary': '你的名字,写下来不过几厘米那么短,却贯穿了我那么长的时光。', 'state': 0, 'browse': 9, 'created': '2020-12-22T02:56:24+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9162, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7a04b64835cdb0e8a666caec25bb64d3.jpg'}, {'id': 9163, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/74af547d45e9e329497cd10272608686.jpg'}], 'tags': None}, {'id': 3793, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '杏子大人:更可笑的是我瞒着所有人继续爱了你好久。', 'summary': '当身边结婚的朋友越来越多,当朋友圈晒娃的越来越多,有时候想想,大概是等不到你了。', 'state': 0, 'browse': 2, 'created': '2020-12-22T02:51:21+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9147, 'img_url': 
'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7000fb6c54306bc2cd46bc593582fe70.jpg'}, {'id': 9148, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7e59bcd7d53ab6fe7dfc11cc58d7cb51.jpg'}, {'id': 9149, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/ffa75d59348399f5495fbb98eae8d12d.jpg'}], 'tags': None}, {'id': 3792, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '杏子大人:不敢再炫耀身边有谁,害怕你突然间离开让我尴尬。', 'summary': '可不可以不要靠近我,了解我,心疼我,然后再离开我。', 'state': 0, 'browse': 3, 'created': '2020-12-22T02:50:06+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9145, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7a0c2d7c44d85a7beb9d0e553c051d2d.jpg'}, {'id': 9146, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/6fd1dc7a272476d1dafba1d266e65c50.jpg'}], 'tags': None}, {'id': 3791, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '就是LYN:时光让我们相聚,时光却也让我们分离。', 'summary': '我走不进你的心,写不出你的梦,不管我付出多少都是个外人。', 'state': 0, 'browse': 5, 'created': '2020-12-22T02:49:14+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9143, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/f863993340a222f2911d8c039ae1f58a.jpg'}], 'tags': None}]}
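The string replacement used above depends on the exact status message and on how the response happens to end. A sketch of an alternative, which is not what the code in this post does: parse the whole response and index into response and then data. The helper name extract_data is hypothetical, and the sample string mimics the structure shown above with the list left empty.

```python
import json


def extract_data(text):
    """Parse the full response and return the object stored under response -> data."""
    payload = json.loads(text)
    response = payload['response']
    if response.get('status') != 1:
        # Assumption: status 1 means success, as in the response shown above
        raise ValueError('unexpected status: {0!r}'.format(response.get('status')))
    return response['data']


sample = ('{"response":{"status":1,"message":"请求成功",'
          '"data":{"page":1,"page_size":10,"totalPage":56,"list":[]}}}')
data = extract_data(sample)
print(type(data))
print(data['totalPage'])
```

With requests, calling .json() on the response object returns the same parsed payload without calling json.loads by hand.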

What we want is in list.
Get the value stored under list:

import requests
import json

page = 1
path = 'qcmn'
url = 'https://www.0xu.cn/gallery/list?page=' + str(page) + '&category=' + path

def get_html(url):
    html = requests.get(url)
    html = html.text
    # Cut off the head and tail around "data"
    html = html.replace('{"response":{"status":1,"message":"请求成功","data":', '').replace('}}', '')
    # Convert the JSON text into a Python dictionary
    html = json.loads(html)
    # Get the value stored under 'list'
    list1 = html['list']

    print(len(list1))
    print(list1)


get_html(url)

Output:

[{'id': 3807, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若:又回到这座灯火阑珊的旧城,只是再也没有你陪我的黄昏。', 'summary': '情绪低落时,现实的你,网络上的你,一个假装快乐,一个真心难过。', 'state': 0, 'browse': 17, 'created': '2020-12-22T03:03:08+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9171, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/1a2ebb29df8634ea18fab0438aef067c.jpg'}, {'id': 9172, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/48260abe57dabd163a824eb0de8f540d.jpg'}], 'tags': None}, {'id': 3806, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若:在某个时刻(存照) \u200b\u200b\u200b\u200b', 'summary': '人总是这样,终于到了懂得珍惜的年纪,却偏偏什么都走散了。', 'state': 0, 'browse': 9, 'created': '2020-12-22T03:02:23+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9170, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/182ea3276e49c000284f914dfe5ee4de.jpg'}], 'tags': None}, {'id': 3805, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若:他知道你舍不得离开才肆无忌惮的伤害。', 'summary': '这一生,这一世,因为不再有你,所以爱情轰然老去。', 'state': 0, 'browse': 5, 'created': '2020-12-22T03:01:50+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 
'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9169, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/09b79f16080a773eb25b207c76c2eca6.jpg'}], 'tags': None}, {'id': 3804, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若Abby:年头觉得自己还蛮好看的 年尾怎么就长残了', 'summary': '一个人躲在角落里,悄悄的落泪,因为没有人值得我倾诉。', 'state': 0, 'browse': 6, 'created': '2020-12-22T03:01:22+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9168, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/8e190cb74b4b0f0ba526935b17aee9d8.jpg'}], 'tags': None}, {'id': 3803, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若:怀念半年前傻傻憨憨 快快乐乐的自己 \u200b\u200b\u200b\u200b', 'summary': '有裂痕的愛怎麽重蓋、悲傷要怎麽平靜純白。', 'state': 0, 'browse': 7, 'created': '2020-12-22T03:00:07+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9167, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/91b162607c9d204916f3d0d803ffcb75.jpg'}], 'tags': None}, {'id': 3800, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '杏子大人:情书也撕了,酒杯也碎了,别担心,你走吧,我不爱你了', 'summary': '很多事情没有来日方长,很多人都只会乍然离场。一组伤感的个性签名送给你们,希望伤心难过时能带给你一丝慰藉。', 'state': 0, 'browse': 7, 'created': '2020-12-22T02:57:09+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': 
{'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9164, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/b7ef615e0c03c602095d0092d370ba5c.jpg'}], 'tags': None}, {'id': 3799, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '杏子大人:微笑就像创可贴,掩饰了伤口,痛还在。', 'summary': '你的名字,写下来不过几厘米那么短,却贯穿了我那么长的时光。', 'state': 0, 'browse': 9, 'created': '2020-12-22T02:56:24+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9162, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7a04b64835cdb0e8a666caec25bb64d3.jpg'}, {'id': 9163, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/74af547d45e9e329497cd10272608686.jpg'}], 'tags': None}, {'id': 3793, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '杏子大人:更可笑的是我瞒着所有人继续爱了你好久。', 'summary': '当身边结婚的朋友越来越多,当朋友圈晒娃的越来越多,有时候想想,大概是等不到你了。', 'state': 0, 'browse': 2, 'created': '2020-12-22T02:51:21+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9147, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7000fb6c54306bc2cd46bc593582fe70.jpg'}, {'id': 9148, 'img_url': 
'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7e59bcd7d53ab6fe7dfc11cc58d7cb51.jpg'}, {'id': 9149, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/ffa75d59348399f5495fbb98eae8d12d.jpg'}], 'tags': None}, {'id': 3792, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '杏子大人:不敢再炫耀身边有谁,害怕你突然间离开让我尴尬。', 'summary': '可不可以不要靠近我,了解我,心疼我,然后再离开我。', 'state': 0, 'browse': 3, 'created': '2020-12-22T02:50:06+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9145, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7a0c2d7c44d85a7beb9d0e553c051d2d.jpg'}, {'id': 9146, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/6fd1dc7a272476d1dafba1d266e65c50.jpg'}], 'tags': None}, {'id': 3791, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '就是LYN:时光让我们相聚,时光却也让我们分离。', 'summary': '我走不进你的心,写不出你的梦,不管我付出多少都是个外人。', 'state': 0, 'browse': 5, 'created': '2020-12-22T02:49:14+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9143, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/f863993340a222f2911d8c039ae1f58a.jpg'}], 'tags': None}]

The result is a list.

Next, iterate over the list. As you can see, each element of the list is a dictionary.
Use a for loop to go through every dictionary and take the value stored under pictures.

The corresponding code:

import requests
import json

page = 1
path = 'qcmn'
url = 'https://www.0xu.cn/gallery/list?page=' + str(page) + '&category=' + path

def get_html(url):
    html = requests.get(url)
    html = html.text
    # Cut off the head and tail around "data"
    html = html.replace('{"response":{"status":1,"message":"请求成功","data":', '').replace('}}', '')
    # Convert the JSON text into a Python dictionary
    html = json.loads(html)
    # Get the value stored under 'list' (a list, list1)
    list1 = html['list']
    # Loop over every element of list1 (each one is a dictionary)
    for dict1 in list1:
        # print(type(dict1))  # <class 'dict'>
        # Take the value stored under 'pictures' (img_url lives inside it)
        list2 = dict1['pictures']
        print(type(list2))


get_html(url)

Output:

<class 'list'>
<class 'list'>
<class 'list'>
<class 'list'>
<class 'list'>
<class 'list'>
<class 'list'>
<class 'list'>
<class 'list'>
<class 'list'>


list2 is a list.
Now iterate over every value in list2 and pull out img_url.

import requests
import json

page = 2
path = 'qcmn'
url = 'https://www.0xu.cn/gallery/list?page=' + str(page) + '&category=' + path

def get_html(url):
    html = requests.get(url)
    html = html.text
    # Cut off the head and tail around "data"
    html = html.replace('{"response":{"status":1,"message":"请求成功","data":', '').replace('}}', '')
    # Convert the JSON text into a Python dictionary
    html = json.loads(html)
    # Get the value stored under 'list' (a list, list1)
    list1 = html['list']
    # Loop over every element of list1 (each one is a dictionary)
    for dict1 in list1:
        # Take the value stored under 'pictures' (img_url lives inside it)
        list2 = dict1['pictures']
        # Blank line before each group of URLs
        print()
        # Each element of list2 is a dictionary with an 'img_url' key
        for img_urls in list2:
            img_url = img_urls['img_url']
            print(img_url)


get_html(url)

Running it prints the img_url values one per line, with a blank line before each group.
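The two nested loops can also be wrapped in a small generator, which keeps the extraction separate from printing or downloading. This is a sketch, not the post's code: iter_image_urls is a hypothetical name, the sample uses placeholder ids and example.com URLs, and the handling of a missing or null pictures value is an assumption (tags is null in the real response, so pictures might be as well).

```python
def iter_image_urls(data):
    """Yield every img_url found under data['list'][*]['pictures'][*]."""
    for gallery in data.get('list') or []:
        # Assumption: 'pictures' may be missing or null; treat that as empty
        for picture in gallery.get('pictures') or []:
            yield picture['img_url']


# Placeholder data with the same nesting as the parsed response
sample = {
    'list': [
        {'id': 1, 'pictures': [{'id': 10, 'img_url': 'https://example.com/a.jpg'},
                               {'id': 11, 'img_url': 'https://example.com/b.jpg'}]},
        {'id': 2, 'pictures': None},
    ]
}
print(list(iter_image_urls(sample)))
```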

2. Download the images from the extracted links and save them locally

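        # Iterate over every value (a dictionary) in list2 and take its img_url.
        # i is a counter that must be set to 0 before the loops start,
        # and the ./images/ directory must already exist.
        for img_urls in list2:
            img_url = img_urls['img_url']

            i = i + 1
            try:
                pic = requests.get(img_url, timeout=10)

                # The with block closes the file, so no explicit close() is needed
                with open('./images/{0}.jpg'.format(i), 'wb') as f:
                    print("Downloading image {0}:".format(i))
                    f.write(pic.content)

            except requests.exceptions.RequestException:
                # Covers connection errors as well as the 10-second timeout
                print('This image could not be downloaded')
                continue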

Running it prints a progress line for each image, and the downloaded files appear in ./images/.
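Because the counter starts again on every run, downloading a second page or category writes over 1.jpg, 2.jpg and so on. One possible alternative is to name each file after its URL. This sketch assumes that the last path segment of every img_url is unique (in the response it looks like a hash), and filename_from_url and target_path are hypothetical helpers.

```python
import os
import posixpath
from urllib.parse import urlparse


def filename_from_url(img_url):
    """Return the last path segment of the URL, for example '<hash>.jpg'."""
    return posixpath.basename(urlparse(img_url).path)


def target_path(img_url, directory='./images/'):
    """Join the download directory and the file name derived from the URL."""
    return os.path.join(directory, filename_from_url(img_url))


url = 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/1a2ebb29df8634ea18fab0438aef067c.jpg'
print(filename_from_url(url))
print(target_path(url))
```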

Switching the path to another category, such as 长腿美女 (path ctmn), works the same way.
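The code in this post requests a single page, while the response reports totalPage (56 for qcmn at the time). A sketch of how that field could drive a loop over all pages of a category follows. crawl_pages and fake_fetch are hypothetical names, and the fetcher is passed in as a function so that the loop can be tried without touching the network.

```python
def crawl_pages(fetch_page, category):
    """Call fetch_page(page, category) for pages 1..totalPage and collect all list entries.

    fetch_page is supplied by the caller and must return the parsed 'data'
    dictionary of one page (with 'totalPage' and 'list' keys).
    """
    entries = []
    page = 1
    total = 1
    while page <= total:
        data = fetch_page(page, category)
        # The first response tells us how many pages there are
        total = data.get('totalPage', total)
        entries.extend(data.get('list') or [])
        page += 1
    return entries


# Stand-in fetcher so the sketch runs offline
def fake_fetch(page, category):
    return {'page': page, 'totalPage': 3, 'list': [{'id': page, 'category': category}]}


print(len(crawl_pages(fake_fetch, 'ctmn')))
```

A real fetch_page would request the list URL for the given page and return its data object. Pausing briefly between pages keeps the load on the site down.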

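After finishing, I realized, somewhat embarrassingly, that all of this was unnecessary.
https://www.0xu.cn/gallery/qcmn/1
You can simply enumerate the trailing number directly (roughly 1 to 3500 for each category).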
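Enumerating that number amounts to generating one URL per candidate id. The sketch below only builds the URLs. The post does not show what these pages return, so fetching them and extracting images is left out, and ids that do not exist would have to be skipped. gallery_urls is a hypothetical helper, and the default upper bound is the rough figure mentioned above.

```python
def gallery_urls(category, first=1, last=3500):
    """Yield https://www.0xu.cn/gallery/<category>/<n> for n from first to last."""
    for n in range(first, last + 1):
        yield 'https://www.0xu.cn/gallery/{0}/{1}'.format(category, n)


for url in gallery_urls('qcmn', 1, 3):
    print(url)
```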

import os
import json
import requests

page = 1
path = 'ctmn'
url = 'https://www.0xu.cn/gallery/list?page=' + str(page) + '&category=' + path

def get_html(url):
    i = 0
    html = requests.get(url)
    html = html.text
    # Cut off the head and tail around "data"
    html = html.replace('{"response":{"status":1,"message":"请求成功","data":', '').replace('}}', '')
    # Convert the JSON text into a Python dictionary
    html = json.loads(html)
    # Get the value stored under 'list' (a list, list1)
    list1 = html['list']
    # Create the output directory once, before downloading anything
    lujin = './images/'
    if not os.path.isdir(lujin):
        os.makedirs(lujin)
    # Loop over every element of list1 (each one is a dictionary)
    for dict1 in list1:
        # Take the value stored under 'pictures' (img_url lives inside it)
        list2 = dict1['pictures']
        # Iterate over every value (a dictionary) in list2 and take its img_url
        for img_urls in list2:
            img_url = img_urls['img_url']
            i = i + 1
            try:
                pic = requests.get(img_url, timeout=10)
                # The with block closes the file, so no explicit close() is needed
                with open(lujin + '{0}.jpg'.format(i), 'wb') as f:
                    print("Downloading image {0}:".format(i))
                    f.write(pic.content)
            except requests.exceptions.RequestException:
                # Covers connection errors as well as the 10-second timeout
                print('This image could not be downloaded')
                continue


get_html(url)

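If you are into network security and Python, feel free to follow my WeChat official account.
The complete code is above. I will post a revised version on the official account later, probably this weekend when I have time.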
