嗨喽,大家好,这里是魔王
Python批量下载某视频
短视频平台相信大家在家时都刷过,那么你有没有碰到过你喜欢的作者得视频突然就没了,不能看了呢?
为了防止此类事情得发生,这次我来教大家把你喜欢得作者视频批量下载到本地慢慢看
动态数据抓包 动态页面分析 requests携带参数发送请求 json数据解析
requests >>> pip install requests
1. 确定需求 (要爬取的内容是什么?)
爬取用户下对应的视频 保存mp4
2. 通过开发者工具进行抓包分析 分析数据从哪里来的(找出真正的数据来源)?
动态加载页面 开发者工具抓数据包
https://www.kuaishou.com/graphql
做开发的时候 一般来说 开发人员 统一全部用谷歌
1. 找到目标网址
2. 发送请求
1.get post
3. 解析数据 (获取视频地址 视频标题)
4. 发送请求 请求每个视频地址
5. 保存视频
import requests # 发送网络请求
import json
https://www.kuaishou.com/profile/3xv78fxycm35nn4
url = 'https://www.kuaishou.com/graphql'
# 统一替换
# 1.选中要替换的内容
# 2.按住Ctrl+R 注: 点亮星号* / 2021版本一下 点亮Regex
# 3.在第一个框里面输入(.*?): (.*)
# 4.在第二个框里面输入'$1': '$2',
# 5.点击REPLACE ALL
headers = {
# content-type: 代表json类型
'content-type': 'application/json',
'Cookie': 'kpf=PC_WEB; kpn=KUAISHOU_VISION; clientid=3; did=web_9d70ec30050b94472cb5b0b56650bfef; client_key=65890b29; userId=270932146; kuaishou.server.web_st=ChZrdWFpc2hvdS5zZXJ2ZXIud2ViLnN0EqABw0yKIfhs_Wdd_A9LYQJSj8fYJgP3h5CfohTpJHme6amEhXivejDIgyNUt0axPa-FAqOns91249zR0VNE4HUyxoEcUv96ZI0hstJJ0rIbUTcIzrhZc4TIkQoUbvo8t-Jpi0JfzHomFMTzpkAWUguRFK4MWdpgsR2au4lRrLBS3fsaHdD-n1Q3U9K8MgB5NzggjhCgJkW-11DZiCN7XnmPXRoSTdCMiCqspRXB3AhuFugv61B-IiCZcAb_g7moUDKkymY8Y9V2ruI0Jvt3o3xfHE-xzrZtiSgFMAE; kuaishou.server.web_ph=9eab58a431c308cac5cfcc644147e03b0593',
'Host': 'www.kuaishou.com',
'Origin': 'https://www.kuaishou.com',
'Referer': 'https://www.kuaishou.com/profile/3xv78fxycm35nn4',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36',
}
# 请求参数 data
data = {
'operationName': "visionProfilePhotoList",
'query': "query visionProfilePhotoList($pcursor: String, $userId: String, $page: String, $webPageArea: String) {\n visionProfilePhotoList(pcursor: $pcursor, userId: $userId, page: $page, webPageArea: $webPageArea) {\n result\n llsid\n webPageArea\n feeds {\n type\n author {\n id\n name\n following\n headerUrl\n headerUrls {\n cdn\n url\n __typename\n }\n __typename\n }\n tags {\n type\n name\n __typename\n }\n photo {\n id\n duration\n caption\n likeCount\n realLikeCount\n coverUrl\n coverUrls {\n cdn\n url\n __typename\n }\n photoUrls {\n cdn\n url\n __typename\n }\n photoUrl\n liked\n timestamp\n expTag\n animatedCoverUrl\n stereoType\n videoRatio\n profileUserTopPhoto\n __typename\n }\n canAddComment\n currentPcursor\n llsid\n status\n __typename\n }\n hostName\n pcursor\n __typename\n }\n}\n",
'variables': {'userId': "3xv78fxycm35nn4", 'pcursor': "", 'page': "profile"}
}
data = json.dumps(data)
# : 发送请求成功
response = requests.post(url=url, headers=headers, data=data)
json_data = response.json()
feeds = json_data['data']['visionProfilePhotoList']['feeds']
for feed in feeds:
caption = feed['photo']['caption']
video_url = feed['photo']['photoUrl']
video_data = requests.get(video_url).content
with open(f'video\\{caption}.mp4', mode='wb') as f:
f.write(video_data)
print(caption, '下载成功!!!')
headers = {
'content-type': 'application/json',
'Cookie': '你自己的cookie',
'Host': 'www.kuaishou.com',
'Origin': 'https://www.kuaishou.com',
'Referer': 'https://www.kuaishou.com/profile/3xv78fxycm35nn4',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36' }
data = {
'operationName': "visionProfilePhotoList",
'query': "query visionProfilePhotoList($pcursor: String, $userId: String, $page:
String, $webPageArea: String) {\n visionProfilePhotoList(pcursor: $pcursor, userId:
$userId, page: $page, webPageArea: $webPageArea) {\n result\n llsid\n webPageArea\n feeds {\n type\n author {\n id\n name\nfollowing\n headerUrl\n headerUrls {\n cdn\n url\n __typename\n }\n __typename\n }\n tags {\n type\n name\n __typename\n }\n photo {\n id\n duration\n caption\n likeCount\n realLikeCount\n coverUrl\n coverUrls {\n cdn\n url\n __typename\n }\n photoUrls {\n cdn\n url\n __typename\n }\n photoUrl\n liked\n timestamp\n expTag\n animatedCoverUrl\n stereoType\n videoRatio\n profileUserTopPhoto\n __typename\n }\n canAddComment\n currentPcursor\n llsid\n status\n __typename\n }\n hostName\n pcursor\n __typename\n }\n}\n",
'variables': {'userId': "3x9dquvtb9n9fps", 'pcursor': "", 'page': "profile"}
}
好了,我的这篇文章写到这里就结束啦!
有更多建议或问题可以评论区或私信我哦!一起加油努力叭(ง •_•)ง
喜欢就关注一下博主,或点赞收藏评论一下我的文章叭!!!