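NetEase Cloud Music Python crawler: search API, comment API, lyric API

Song search, comment scraping, and lyric scraping for NetEase Cloud Music. For reference only.

NetEase Cloud Music API analysis

All NetEase Cloud endpoints share the same encryption code; different APIs simply take different request parameters. This article uses the song search API as the example and walks through the encryption code in detail (AES encryption in JS).

Song search API walkthrough

1. Search for the song 《在一起》

2. The song search API

2.1 Step 1: find the song search request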


2.2 Step 2: do a global search for the keywords params and encSecKey to find the source file that contains them


2.3 Step 3: set breakpoints and debug to locate the exact encryption code


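2.4 Analyzing the code block

var bVV0x = window.asrsea(JSON.stringify(i3x), bqM3x(["流泪", "强"]), bqM3x(TM7F.md), bqM3x(["爱心", "女孩", "惊恐", "大笑"]));
where:
asrsea(): unknown function
bqM3x(): unknown function
TM7F.md: unknown value
i3x: unknown value
2.4.1 Scrolling up in the same file, the bqM3x function and TM7F are defined around line 13059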
    var bqM3x = function(cyu6o) {
        var m3x = [];
        k3x.bf4j(cyu6o, function(cyt6n) {
            m3x.push(TM7F.emj[cyt6n])
        });
        return m3x.join("")
    };
    TM7F.md = ["色", "流感", "这边", "弱", "嘴唇", "亲", "开心", "呲牙", "憨笑", "猫", "皱眉", "幽灵", "蛋糕", "发怒", "大哭", "兔子", "星星", "钟情", "牵手", "公鸡", "爱意", "禁止", "狗", "亲亲", "叉", "礼物", "晕", "呆", "生病", "钻石", "拜", "怒", "示爱", "汗", "小鸡", "痛苦", "撇嘴", "惶恐", "口罩", "吐舌", "心碎", "生气", "可爱", "鬼脸", "跳舞", "男孩", "奸笑", "猪", "圈", "便便", "外星", "圣诞"]
    TM7F.emj = {
        "色": "00e0b",
        "流感": "509f6",
        "这边": "259df",
        "弱": "8642d",
        "嘴唇": "bc356",
        "亲": "62901",
        "开心": "477df",
        "呲牙": "22677",
        "憨笑": "ec152",
        "猫": "b5ff6",
        "皱眉": "8ace6",
        "幽灵": "15bb7",
        "蛋糕": "b7251",
        "发怒": "52b3a",
        "大哭": "b17a8",
        "兔子": "76aea",
        "星星": "8a5aa",
        "钟情": "76d2e",
        "牵手": "41762",
        "公鸡": "9ec4e",
        "爱意": "e341f",
        "禁止": "56135",
        "狗": "fccf6",
        "亲亲": "95280",
        "叉": "104e0",
        "礼物": "312ec",
        "晕": "bda92",
        "呆": "557c9",
        "生病": "38701",
        "钻石": "14af6",
        "拜": "c9d05",
        "怒": "c4f7f",
        "示爱": "0c368",
        "汗": "5b7a4",
        "小鸡": "6bee2",
        "痛苦": "55932",
        "撇嘴": "575cc",
        "惶恐": "e10b4",
        "口罩": "24d81",
        "吐舌": "3cfe4",
        "心碎": "875d3",
        "生气": "e8204",
        "可爱": "7b97d",
        "鬼脸": "def52",
        "跳舞": "741d5",
        "男孩": "46b8e",
        "奸笑": "289dc",
        "猪": "6935b",
        "圈": "3ece0",
        "便便": "462db",
        "外星": "0a22b",
        "圣诞": "8e7",
        "流泪": "01000",
        "强": "1",
        "爱心": "0CoJU",
        "女孩": "m6Qyw",
        "惊恐": "8W8ju",
        "大笑": "d"
    };

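This looks tedious to read, but all three arguments are constants. Evaluating the expressions in the current (paused) environment gives the results directly: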
bqM3x(["流泪", "强"])
"010001"
bqM3x(TM7F.md)
"00e0b509f6259df8642dbc35662901477df22677ec152b5ff68ace615bb7b725152b3ab17a876aea8a5aa76d2e417629ec4ee341f56135fccf695280104e0312ecbda92557c93870114af6c9d05c4f7f0c3685b7a46bee255932575cce10b424d813cfe4875d3e82047b97ddef52741d546b8e289dc6935b3ece0462db0a22b8e7"
bqM3x(["爱心", "女孩", "惊恐", "大笑"])
"0CoJUm6Qyw8W8jud"

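The lookup done by bqM3x is simple enough to reproduce in Python. The sketch below ports it using only the six TM7F.emj entries needed for the first and last arguments; the names EMJ and bq are my own and do not appear in the page's code.

```python
# Subset of TM7F.emj copied from the JS above (only the keys used here).
EMJ = {
    "流泪": "01000",
    "强": "1",
    "爱心": "0CoJU",
    "女孩": "m6Qyw",
    "惊恐": "8W8ju",
    "大笑": "d",
}


def bq(names):
    """Hypothetical port of bqM3x: map each name to its fragment and join them."""
    return "".join(EMJ[name] for name in names)


second_arg = bq(["流泪", "强"])                   # "010001"
fourth_arg = bq(["爱心", "女孩", "惊恐", "大笑"])   # "0CoJUm6Qyw8W8jud"
print(second_arg, fourth_arg)
```

The fourth argument is 16 characters long, which fits its later use as an AES-128 key.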
2.4.2 Analyzing the i3x parameter and the asrsea function

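The e, f, and g parameters are all known at this point. (You could set up a backend service that calls this JS directly; you would have to import CryptoJS yourself.)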

    function a(a) {
        var d, e, b = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789", c = "";
        for (d = 0; a > d; d += 1)
            e = Math.random() * b.length,
            e = Math.floor(e),
            c += b.charAt(e);
        return c
    }
    function b(a, b) {
        var c = CryptoJS.enc.Utf8.parse(b)
          , d = CryptoJS.enc.Utf8.parse("0102030405060708")
          , e = CryptoJS.enc.Utf8.parse(a)
          , f = CryptoJS.AES.encrypt(e, c, {
            iv: d,
            mode: CryptoJS.mode.CBC
        });
        return f.toString()
    }
    function c(a, b, c) {
        var d, e;
        return setMaxDigits(131),
        d = new RSAKeyPair(b,"",c),
        e = encryptedString(d, a)
    }
    function d(d, e, f, g) {
        var h = {}
          , i = a(16);
        return h.encText = b(d, g),
        h.encText = b(h.encText, i),
        h.encSecKey = c(i, e, f),
        h
    }
    function e(a, b, d, e) {
        var f = {};
        return f.encText = c(a + e, b, d),
        f
    }
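For readers who want to generate their own key instead of reusing a fixed one, here is a minimal sketch of a() and c() in Python. The port of a() is direct, except that it uses secrets.choice where the JS uses Math.random. The port of c() rests on an assumption that community ports commonly make: that the JS RSA library performs unpadded RSA on the key string read in reverse byte order and returns hex. I have not checked that it reproduces the hard-coded encSecKey used later in this article. The names random_key and rsa_enc_sec_key are hypothetical.

```python
import secrets
import string

# Same 62 characters, in the same order, as the string b inside a().
ALPHABET = string.ascii_letters + string.digits

# Results of bqM3x(["流泪", "强"]) and bqM3x(TM7F.md) shown earlier.
EXPONENT_HEX = "010001"
MODULUS_HEX = "00e0b509f6259df8642dbc35662901477df22677ec152b5ff68ace615bb7b725152b3ab17a876aea8a5aa76d2e417629ec4ee341f56135fccf695280104e0312ecbda92557c93870114af6c9d05c4f7f0c3685b7a46bee255932575cce10b424d813cfe4875d3e82047b97ddef52741d546b8e289dc6935b3ece0462db0a22b8e7"


def random_key(length=16):
    """Port of a(): a random string of ASCII letters and digits."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def rsa_enc_sec_key(key, exponent_hex=EXPONENT_HEX, modulus_hex=MODULUS_HEX):
    """Assumed port of c(): unpadded RSA over the reversed key, as 256 hex digits."""
    message = int(key[::-1].encode("utf-8").hex(), 16)
    cipher = pow(message, int(exponent_hex, 16), int(modulus_hex, 16))
    return format(cipher, "x").zfill(256)


key = random_key()
enc_sec_key = rsa_enc_sec_key(key)
print(key, len(enc_sec_key))
```

In d(), the random key i is used for the second AES pass and its RSA-encrypted form is sent as encSecKey, so the two values must always be generated together.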
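2.4.3 Porting the JS to Python (this is where it gets tedious: I did not want to run a backend, and calling the JS directly ran into problems loading the CryptoJS file, so the only option left was to reproduce the CryptoJS encryption in Python)

Python has a third-party package, Crypto, that can do AES encryption (with a catch).
Running pip install Crypto does not work (importing fails with an error that the module cannot be found).
The correct install is:

pip install pycryptodome

There are many pitfalls in the online advice about encrypting with Crypto (and my own understanding is shallow). The code below is a version that runs successfully. Note that the JS encrypts twice, with a different key each time. See the GitHub link at the end of the article for detailed usage.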

import base64
from Crypto.Cipher import AES

# Python port of the JS function b: AES-CBC encryption with Base64 output
def crypt_js_complex_base(text, key):
    BS = AES.block_size
    # PKCS#7 padding, which is what CryptoJS applies by default
    pad = lambda s: s + (BS - len(s) % BS) * bytes([BS - len(s) % BS])

    key = bytes(key, encoding="utf-8")
    text = text.encode("utf-8")
    IV = b'0102030405060708'  # fixed IV taken from the JS

    # A cipher object can only be used once, so create a new one per call
    cipher = AES.new(key, mode=AES.MODE_CBC, IV=IV)

    encrypted = cipher.encrypt(pad(text))
    return base64.b64encode(encrypted).decode("utf-8")
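The pad lambda is the part that most often goes wrong when porting CryptoJS code, so here it is in isolation. This is a minimal standard-library sketch of PKCS#7 padding (the CryptoJS default); pkcs7_pad and pkcs7_unpad are names I chose for illustration.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pkcs7_pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append n bytes of value n so the length becomes a multiple of block_size."""
    n = block_size - len(data) % block_size
    return data + bytes([n]) * n


def pkcs7_unpad(data: bytes) -> bytes:
    """Remove the padding by reading its length from the last byte."""
    return data[:-data[-1]]


padded = pkcs7_pad(b"hello")
print(len(padded), padded[-1])  # 16 11
```

Input that is already a multiple of 16 bytes still gets a full extra block of padding, which is what makes unpadding unambiguous.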

    
2.4.4 The complete runnable Python code follows, since I know everyone is lazy.

See the GitHub link at the end of the article for usage.

# -*- coding:utf-8 -*-
# Enjoy the thunder, feel the rain and dew
# author xyy,time:2020/6/28

from fake_useragent import UserAgent
from Crypto.Cipher import AES

import base64
import requests, pprint,json


class WangYiYun():

    def __init__(self):
        self.params = ""
        self._i = "l6Brr86UeZ6C3Bsw" # fixed 16-character key used by default (the JS generates this value randomly)
        # encSecKey that pairs with the default _i above
        self.encSecKey = "7ca9b5ba8b13044f47ed74c388df912ac84758122acbedc64111f2ac83232b01d3ce16f7195a39c7e064b4c0240b5c1d52624dc13c22ec820d76dfe32db43e496aeacced5be3ca9108c78a85bb389f1edf8d8c9fced02024ba9490401b4ce062cc50764d0a24294e07bb229271391b5a3640e924ee1ed15435dc6e288f1fa873"
        self.headers =  {
            'authority': 'music.163.com',
            'user-agent': UserAgent().random,
            'content-type': 'application/x-www-form-urlencoded',
            'accept': '*/*',
            'origin': 'https://music.163.com',
            'sec-fetch-site': 'same-origin',
            'sec-fetch-mode': 'cors',
            'sec-fetch-dest': 'empty',
            'referer': 'https://music.163.com/song?id=1426301364',
            'accept-language': 'zh-CN,zh;q=0.9',
            'cookie': '_iuqxldmzr_=32; _ntes_nnid=5f8ee04e745645d13d3f711c76769afe,1593048942478; _ntes_nuid=5f8ee04e745645d13d3f711c76769afe; WM_TID=XqvK2%2FtWaSBEUBRBEEN7XejGE%2FL0h6Vq; WM_NI=iN6dugAs39cIm2K2R9ox28GszTm5oRjcvJCcyIuaI1dccEVSjaHEwhc8FuERfkh3s%2FFP0zniMA5P4vqS4H3TJKdQofPqezDPP4IR5ApTjuqeNIJNZkCvHMSY6TtEkCZUS3k%3D; WM_NIKE=9ca17ae2e6ffcda170e2e6eeb2e57dbababf88b879a8b08fa2d84f869f9fbaaa50a3f599a5d650939b8dadd52af0fea7c3b92aab92fa85f86d83adfddae243afee85d3d133ada8fed9c679ba8ca3d6ee5aaabdbaabc269bb97bb82cc3ba8bdada6d559aabf88a6f664a1e88a96c85aa6b5a8d4f2258690009bed638f9ffbb1b77eb38dfca9b2608a95acb2ee6e94afab9bc75c94ec87b3b84bb48ca696f46f8e9786afd96181aa88aed253f68cbca6ea499a8b9dd4ea37e2a3; JSESSIONID-WYYY=tI8MIKMCRBuyCYnUJMCyUTlp%2Fufv5xIfCquvp7PJ4%2BuXod%5CXH%5CB0icDZw8TNlwHUHOW%2B2t%2BCuXyC4VZ%5C19OrzaDE%5Ck0F0dAZQh7KcVxUoHKpqUdiVzPu8NxCK9cJRG%5C%5CPTvtqxjFerd1%2BBa4%2F%5C8PESa4pvvRaQF6jljjsibX%5CrcPsH0I%3A1593347447142',
        }

    # Song search endpoint
    API_Serch_Songs = 'https://music.163.com/weapi/cloudsearch/get/web?csrf_token='
    # Song comments
    API_Comments_Song = 'https://music.163.com/weapi/v1/resource/comments/R_SO_4_{}?csrf_token=' # the song ID can be substituted
    # Song lyrics
    API_Lyric_Songs = 'https://music.163.com/weapi/song/lyric?csrf_token='

    # Shared helper: Python port of the CryptoJS call in the JS function b
    # (AES-CBC with PKCS#7 padding, Base64 output)
    @staticmethod
    def _aes_cbc_b64(text, key):
        BS = AES.block_size
        pad = lambda s: s + (BS - len(s) % BS) * bytes([BS - len(s) % BS])
        IV = b'0102030405060708'

        # A cipher object can only be used once, so create a new one per call
        cipher = AES.new(key, mode=AES.MODE_CBC, IV=IV)
        encrypted = cipher.encrypt(pad(text.encode("utf-8")))
        return base64.b64encode(encrypted).decode("utf-8")

    # Second pass: encrypt with the 16-character key self._i
    def crypt_js_complex(self,text):
        return self._aes_cbc_b64(text, bytes(self._i, encoding="utf-8"))

    # First pass: encrypt with the fixed key from the JS
    def crypt_js_complex_base(self,text):
        return self._aes_cbc_b64(text, b'0CoJUm6Qyw8W8jud')

    # Build the params value by applying both passes
    def get_params(self,text):
        return self.crypt_js_complex(
            self.crypt_js_complex_base(text))

    # Song search endpoint
    def serch_songs(self,name,offset=0):
        """
        :param name: str, search keyword
        :param offset: int, page index starting at 0; the offset sent in the request is offset*30 (0, 30, 60, 90, ...)
        :return: the API response data
        """
        text = '{"hlpretag":"","hlposttag":"","#/discover":"","s":"%s","type":"1","offset":"%s","total":"false","limit":"30","csrf_token":""}'%(name,offset*30)

        params = (
            ('csrf_token', ''),
        )

        data = {
            'params': self.get_params(text),
            'encSecKey': self.encSecKey
        }

        response = requests.post(self.API_Serch_Songs, headers=self.headers, params=params,
                                 data=data)
        return self._dispose(json.loads(response.text))

    # Song comment scraping
    def comment_song(self,songid:str,offset:int=0):
        """
        :param songid: str, song ID
        :param offset: int, page index starting at 0; the offset sent in the request is offset*20 (0, 20, 40, ...)
        :return: the API response data
        """
        text = '{"rid":"R_SO_4_%s","offset":"%s","total":"true","limit":"20","csrf_token":""}'%(songid,offset*20)

        params = (
            ('csrf_token', ''),
        )

        data = {
            'params': self.get_params(text),
            'encSecKey': self.encSecKey
        }
        response = requests.post(self.API_Comments_Song.format(songid), headers=self.headers,
                                 params=params, data=data)
        return self._dispose(json.loads(response.text))

    # Lyric scraping
    def lyric_song(self,songid:str):
        """
        :param songid: str, song ID
        :return: the API response data
        """
        # Plaintext form of the encrypted parameters for the lyric endpoint
        text = '{"id":"%s","lv":-1,"tv":-1,"csrf_token":""}'%(songid)

        params = (
            ('csrf_token', ''),
        )

        data = {
            'params': self.get_params(text),
            'encSecKey': self.encSecKey
        }

        response = requests.post(self.API_Lyric_Songs, headers=self.headers, params=params, data=data)
        return self._dispose(json.loads(response.text))

    # Handle the data returned by the crawler; here I just print it
    def _dispose(self, data):
        pprint.pprint(data)
        return data

    # Main function, for testing
    def wangyi_main(self):
        # Search endpoint
        # self.serch_songs("旧账",0)
        # Song comment endpoint
        self.comment_song("25639331",0)
        # Lyric endpoint
        # self.lyric_song("1351615757") # 旧账


if __name__ == '__main__':
    wangyi = WangYiYun()
    wangyi.wangyi_main()
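The methods above build the request JSON with % string formatting, which yields invalid JSON as soon as a search keyword contains a double quote or a backslash. A minimal sketch of an alternative is to build a dict and serialize it with json.dumps. It assumes the server only cares about the JSON content and not its exact byte layout, which I have not verified. build_search_text is a hypothetical helper, not part of the original class.

```python
import json


def build_search_text(name: str, page: int = 0, limit: int = 30) -> str:
    """Build the plaintext for the search endpoint with the same fields as serch_songs."""
    payload = {
        "hlpretag": "",
        "hlposttag": "",
        "#/discover": "",
        "s": name,
        "type": "1",
        "offset": str(page * limit),  # page index converted to a record offset
        "total": "false",
        "limit": str(limit),
        "csrf_token": "",
    }
    # Compact separators mirror the output of JSON.stringify in the browser.
    return json.dumps(payload, ensure_ascii=False, separators=(",", ":"))


text = build_search_text('say "hi"', page=2)
print(text)
```

The resulting string can be passed to get_params in place of the hand-formatted text.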

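Song comment API walkthrough

The song comment endpoint works the same way as above.
Parameters:
	'{"rid":"R_SO_4_25639331","offset":"0","total":"true","limit":"20","csrf_token":""}'
API:
'https://music.163.com/weapi/v1/resource/comments/R_SO_4_{}?csrf_token=' # the song ID can be substituted

Lyric API walkthrough

The lyric endpoint works the same way as above.
Parameters:
	'{"id":"25639331","lv":-1,"tv":-1,"csrf_token":""}'
API:
'https://music.163.com/weapi/song/lyric?csrf_token='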

NetEase Cloud crawler GitHub repository

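Addendum

Fixing No module named 'Crypto'

pip3 uninstall pycryptodome

pip3 uninstall crypto

pip3 install pycrypto

pycrypto is no longer maintained and may fail to build on recent Python versions. If that happens, running pip3 install pycryptodome again after removing the crypto package provides the same Crypto module.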
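Before reinstalling anything, it can help to check which interpreter you are running and whether Crypto.Cipher.AES is importable from it. A small sketch using only the standard library; has_module is a name I made up for this check.

```python
import importlib
import sys


def has_module(name: str) -> bool:
    """Return True if the named module can be imported in this interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


print("interpreter:", sys.executable)
print("Crypto.Cipher.AES importable:", has_module("Crypto.Cipher.AES"))
```

If the interpreter printed here is not the one your pip3 installs into, the package is being installed in the wrong place.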
