How to view all of a user's NetEase Cloud Music comments

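Sometimes you want to read the comments a user has written, only to find they are set to be visible to that user alone, so outsiders cannot see them. In that case we can write a Python program to collect the comments ourselves. A worked example follows: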

We can see that these comments are fetched by sending a POST request to
music.163.com/weapi/v1/resource/comments/R_SO_4_26075485?csrf_token=
The request also carries two parameters, params and encSecKey.


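In other words, we can obtain the comments simply by imitating the browser and sending a POST request to the NetEase Cloud Music server!
Note the URL of this POST request: the digits after R_SO_4_ are the id of the song. The parameters that have to be passed in also need careful analysis (covered later).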
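As a minimal sketch of the request described above, the snippet below builds the comment URL from a song id and prepares the POST request without sending it. The function name `build_comment_request` and the placeholder values are assumptions made for illustration; the real `params` and `encSecKey` values have to be generated as discussed later, and the standard library's `urllib` is used here only so the sketch runs without extra packages.

```python
from urllib.parse import urlencode
from urllib.request import Request

# URL template taken from the text: the digits after R_SO_4_ are the song id
COMMENTS_URL = 'https://music.163.com/weapi/v1/resource/comments/R_SO_4_{}?csrf_token='


def build_comment_request(song_id, params, enc_sec_key):
    """Prepare (but do not send) the POST request for one song's comments."""
    url = COMMENTS_URL.format(song_id)
    # The two form fields named in the text; the values passed in below are placeholders
    body = urlencode({'params': params, 'encSecKey': enc_sec_key}).encode('utf-8')
    headers = {
        'User-Agent': 'Mozilla/5.0',
        'Referer': 'https://music.163.com/song?id={}'.format(song_id),
    }
    return Request(url, data=body, headers=headers, method='POST')


# Hypothetical usage: the placeholder strings are not valid encrypted values
req = build_comment_request(26075485, 'PLACEHOLDER_PARAMS', 'PLACEHOLDER_KEY')
print(req.get_method())
print(req.full_url)
```

Keeping the request construction separate from the network call makes it easy to inspect the URL and form body before anything is sent to the server.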
Step 1

The code is as follows:
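import random
import re
import time

import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36'
}

baseUrl = 'https://music.163.com'


def getHtml(url):
    # Fetch a page with a browser-like User-Agent and return its HTML text
    r = requests.get(url, headers=headers)
    html = r.text
    return html


def getUrl():
    # Start from the newest playlists
    startUrl = 'https://music.163.com/discover/playlist/?order=new'
    html = getHtml(startUrl)
    # Assumed pattern: each playlist sits in an <li> element; check it against the live page markup.
    # Groups: playlist title, playlist link, author name, author link
    pattern = re.compile('<li>.*?<.*?title="(.*?)".*?href="(.*?)".*?>.*?span class="s-fc4".*?title="(.*?)".*?href="(.*?)".*?</li>', re.S)
    result = re.findall(pattern, html)
    # Assumed pattern: text of the first pagination link with class "zpgi" (pageNum is not used below)
    pageNum = re.findall(r'class="zpgi">(.*?)<', html, re.S)[0]

    info = []
    for i in result:
        data = {}
        data['title'] = i[0]
        url = baseUrl + i[1]
        print(url)
        data['url'] = url
        data['author'] = i[2]
        data['authorUrl'] = baseUrl + i[3]
        info.append(data)
        getSongSheet(url)
        time.sleep(random.randint(1, 10))
        # Stop after the first playlist
        break
    return info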
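This is another quirk of NetEase Cloud Music: when crawling, we have to remove the # from the URL, and only then does the page content become visible:
![](https://upload-images.jianshu.io/upload_images/7933544-ba9a4003bde734ac?imageMogr2/auto-orient/strip|imageView2/2/w/951/format/webp)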
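The # removal can be sketched as a small helper. This assumes the address shown in the browser has the form `https://music.163.com/#/...`, which is an assumption about the site's URL layout rather than something stated in the text; the function name `strip_hash_route` is hypothetical.

```python
def strip_hash_route(url):
    """Turn a browser address such as .../#/discover/playlist into a crawlable URL."""
    # Only the first '/#/' is removed; URLs without it are returned unchanged
    return url.replace('/#/', '/', 1)


print(strip_hash_route('https://music.163.com/#/discover/playlist/?order=new'))
```

Under that assumption, the result is the same address used as `startUrl` in Step 1.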
**Step 2**
def getSongSheet(url):
    # Get the id of every song in the playlist; the id is the key to the POST request that follows
    html = getHtml(url)
    # Assumed pattern: each song is an <li> wrapping a link to /song?id=<id>;
    # group 1 is the song id, group 2 the song title. Check it against the live page markup
    result = re.findall(r'<li><a href="/song\?id=(.*?)">(.*?)</a></li>', html, re.S)
    # Drop the final match
    result.pop()
    musicList = []
    for i in result:
        data = {}
        headers1 = {
            'Referer': 'https://music.163.com/song?id={}'.format(i[0]),
            'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36'
        }
        musicUrl = baseUrl + '/song?id=' + i[0]
        print(musicUrl)
        # Song URL
        data['musicUrl'] = musicUrl
        # Song title
        data['title'] = i[1]
        musicList.append(data)
        postUrl = 'https://music.163.com/weapi/v1/resource/comments/R_SO_4_{}?csrf_token='.format(i[0])
        param = {
            'params': get_params(1),
            'encSecKey': get_encSecKey()
        }
        r = requests.post(postUrl, data=param, headers=headers1)
        total = r.json()
        # Total number of comments
        total = int(total['total'])
        # Base page count, at 20 comments per page
        comment_TatalPage = total // 20
        print(comment_TatalPage)
        # A remainder means one extra page; an exact division needs none
        if total % 20 != 0:
            comment_TatalPage = comment_TatalPage + 1
        comment_data, hotComment_data = getMusicComments(comment_TatalPage, postUrl, headers1)
        # If a duplicate-ID error appears when saving to the database, check whether only one record was crawled
        saveToMongoDB(str(i[1]), comment_data, hotComment_data)
        print('End!')
        time.sleep(random.randint(1, 10))
        # Stop after the first song
        break

Here we build postUrl from the song id, then POST for the first page (how to POST and get the information we want is covered later) to obtain the total number of comments and the total number of pages, and finally call the method that fetches the song's comments.
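The page-count step can be sketched on its own as a ceiling division. The helper name `page_count` is hypothetical; the page size of 20 comes from the code above.

```python
def page_count(total, page_size=20):
    """Number of pages needed to hold `total` comments at `page_size` per page."""
    # Adding page_size - 1 before the floor division rounds any remainder up
    return (total + page_size - 1) // page_size


for total in (0, 19, 20, 21, 45):
    print(total, page_count(total))
```

This gives the same result as dividing by 20 and adding one page when there is a remainder, without the separate branch.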
Step 3
def getMusicComments(comment_TatalPage, postUrl, headers1):
    commentinfo = []
    hotcommentinfo = []
    # For each page of comments
    for j in range(1, comment_TatalPage + 1):
        # Fetch the comments on this page
        r = getPostApi(j, postUrl, headers1)
        comment_info = r.json()['comments']
        for i in comment_info:
            com_info = {}
            com_info['content'] = i['content']
            com_info['author'] = i['user']['nickname']
            com_info['likedCount'] = i['likedCount']
            commentinfo.append(com_info)
        # Hot comments can only be fetched from the first page
        if j == 1:
            hotcomment_info = r.json()['hotComments']
            for i in hotcomment_info:
                hot_info = {}
                hot_info['content'] = i['content']
                hot_info['author'] = i['user']['nickname']
                hot_info['likedCount'] = i['likedCount']
                hotcommentinfo.append(hot_info)
        print(u'Page ' + str(j) + u' crawled...')
        time.sleep(random.randint(1, 10))
    print(commentinfo)
    print('\n-----------------------------------------------------------\n')
    print(hotcommentinfo)
    return commentinfo, hotcommentinfo
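Once the comments have been collected, picking out one user's comments could look like the sketch below. It assumes the dictionaries have the `content`, `author` and `likedCount` keys built above and that matching on the nickname is good enough; the helper name `comments_by_author` and the sample data are made up for illustration.

```python
def comments_by_author(comments, nickname):
    """Return only the comments whose 'author' field equals `nickname`."""
    return [c for c in comments if c.get('author') == nickname]


# Made-up sample in the same shape as the collected comment dictionaries
sample = [
    {'content': 'Great song', 'author': 'alice', 'likedCount': 3},
    {'content': 'On repeat', 'author': 'bob', 'likedCount': 0},
    {'content': 'Still great', 'author': 'alice', 'likedCount': 1},
]

for c in comments_by_author(sample, 'alice'):
    print(c['content'])
```

Nicknames may not be unique, so matching on a user id from the response, if one is available, would likely be more reliable than matching on the nickname.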
    
       
