2019-11-16 QQ Music comments

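Target URL: https://y.qq.com/n/yqq/song/001qvvgF38HVc4.html#comment_box
QQ Music, Jay Chou's 说好不哭 (Won't Cry)
Open Charles, refresh the page, then copy the text of any comment and search for it in Charles. This makes the comment API request easy to find: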

image.png

https://c.y.qq.com/base/fcgi-bin/fcg_global_comment_h5.fcg?g_tk=160454710&loginUin=1808163167&hostUin=0&format=json&inCharset=utf8&outCharset=GB2312&notice=0&platform=yqq.json&needNewCode=0&cid=205360772&reqtype=2&biztype=1&topid=237773700&cmd=8&needmusiccrit=0&pagenum=0&pagesize=25&lasthotcommentid=&domain=qq.com&ct=24&cv=10101010

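Paging through the comments shows that only the pagenum and lasthotcommentid parameters change: pagenum is the page index (starting at 0), and lasthotcommentid is the ID of the last comment on the previous page (empty for the first page).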

https://c.y.qq.com/base/fcgi-bin/fcg_global_comment_h5.fcg?g_tk=160454710&loginUin=1808163167&hostUin=0&format=json&inCharset=utf8&outCharset=GB2312&notice=0&platform=yqq.json&needNewCode=0&cid=205360772&reqtype=2&biztype=1&topid=237773700&cmd=8&needmusiccrit=0&pagenum=1&pagesize=25&lasthotcommentid=song_237773700_3559701714_1573875409&domain=qq.com&ct=24&cv=10101010
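
The comparison between the two captured URLs can also be done mechanically. The sketch below parses both query strings with the standard library and reports the parameters whose values differ. The helper name changed_params is made up for illustration, and the URLs are shortened to a handful of the parameters from the captures above (assumption: the omitted parameters are identical in both requests, as they are in the two URLs shown).

```python
from urllib.parse import urlsplit, parse_qsl


def changed_params(url_a, url_b):
    """Return {name: (value_a, value_b)} for query parameters that differ."""
    # keep_blank_values=True so that an empty "lasthotcommentid=" is not dropped
    a = dict(parse_qsl(urlsplit(url_a).query, keep_blank_values=True))
    b = dict(parse_qsl(urlsplit(url_b).query, keep_blank_values=True))
    names = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in names if a.get(k) != b.get(k)}


BASE = 'https://c.y.qq.com/base/fcgi-bin/fcg_global_comment_h5.fcg?'
COMMON = 'cid=205360772&reqtype=2&biztype=1&topid=237773700&cmd=8&pagesize=25'
first = BASE + COMMON + '&pagenum=0&lasthotcommentid='
second = (BASE + COMMON +
          '&pagenum=1&lasthotcommentid=song_237773700_3559701714_1573875409')

diff = changed_params(first, second)
for name, (old, new) in diff.items():
    print(name, repr(old), '->', repr(new))
```

Only pagenum and lasthotcommentid show up in the result, which matches what was observed by eye.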

So the pages can simply be requested in a loop:

import json
import requests

page = 0
lasthotcommentid = ''
while True:
    url = 'https://c.y.qq.com/base/fcgi-bin/fcg_global_comment_h5.fcg?g_tk=160454710&loginUin=1808163167&hostUin=0&format=json&inCharset=utf8&outCharset=GB2312&notice=0&platform=yqq.json&needNewCode=0&cid=205360772&reqtype=2&biztype=1&topid=237773700&cmd=8&needmusiccrit=0&pagenum=%s&pagesize=25&lasthotcommentid=%s&domain=qq.com&ct=24&cv=10101010' % (page, lasthotcommentid)
    # verify=False disables TLS certificate verification for this request
    response = requests.get(url, verify=False)
    json_data = json.loads(response.text)
    print(json_data)
    commentsArr = json_data['comment']['commentlist']
    commenttotal = json_data['comment']['commenttotal']
    print('%s comments in total' % commenttotal)
    page += 1
    # Only the first page is fetched here. To keep paging, set
    # lasthotcommentid to the ID of the last comment in commentsArr
    # and remove this break.
    break
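
The loop above stops after one page. A fuller version would carry the last comment ID forward and stop once a page comes back empty. The following is a minimal sketch of that idea, not the author's code. It assumes each comment object exposes its ID under a commentid key (the post does not show this field, so check it against a real response), and it takes the HTTP call as a hypothetical fetch_page(pagenum, lasthotcommentid) function so that the paging logic can be tried without touching the network.

```python
def iter_comments(fetch_page, max_pages=None):
    """Yield comments page by page until an empty page or max_pages is hit.

    fetch_page(pagenum, lasthotcommentid) is a hypothetical callable that
    returns the decoded JSON dict for one page of comments.
    """
    pagenum = 0
    lasthotcommentid = ''
    while max_pages is None or pagenum < max_pages:
        data = fetch_page(pagenum, lasthotcommentid)
        comments = data.get('comment', {}).get('commentlist') or []
        if not comments:
            break  # an empty page means there is nothing left to fetch
        for comment in comments:
            yield comment
        # ASSUMPTION: the comment ID is stored under the 'commentid' key.
        lasthotcommentid = comments[-1].get('commentid', '')
        pagenum += 1


# Offline demonstration with invented data instead of a real request.
FAKE_PAGES = [
    {'comment': {'commentlist': [{'commentid': 'a1', 'nick': 'u1'},
                                 {'commentid': 'a2', 'nick': 'u2'}]}},
    {'comment': {'commentlist': [{'commentid': 'b1', 'nick': 'u3'}]}},
    {'comment': {'commentlist': []}},
]
calls = []


def fake_fetch(pagenum, lasthotcommentid):
    calls.append((pagenum, lasthotcommentid))
    return FAKE_PAGES[pagenum]


nicks = [c['nick'] for c in iter_comments(fake_fetch)]
print(nicks)
print(calls)
```

In real use, fetch_page would wrap the requests.get call from the loop above and return the parsed JSON. Adding a short delay between pages and passing max_pages while testing keeps the request volume modest.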

The comment format is shown in the screenshot. Each comment is cleaned up and saved:


image.png
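import re

# Matches emoji tags of the form [em]...[/em]. The brackets are escaped and
# the match is non-greedy, so text between two separate tags is kept.
EMOJI_TAG = re.compile(r'\[em\].*?\[/em\]', re.S)

def saveComments(commentsArr):
    # f is a text file opened for writing elsewhere in the script
    for comment in commentsArr:
        nick = comment['nick']
        rootcommentcontent = comment['rootcommentcontent']
        c = EMOJI_TAG.sub('', rootcommentcontent)
        f.write(nick + '----' + c + '\n')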
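
To see what the cleaning step does, here is a small self-contained sketch. It assumes emoji appear in the comment text as [em]...[/em] tags, which is what the regular expression in the post is aiming at. The sample comments and the helper name format_comment are invented for illustration, and the output goes to an in-memory buffer rather than the author's file f.

```python
import io
import re

# Escaped brackets plus a non-greedy body: each tag is removed on its own.
EMOJI_TAG = re.compile(r'\[em\].*?\[/em\]', re.S)


def format_comment(comment):
    """Return one output line: nickname, separator, text without emoji tags."""
    nick = comment['nick']
    text = EMOJI_TAG.sub('', comment['rootcommentcontent'])
    return nick + '----' + text + '\n'


# Invented sample data in the same shape as the fields used above.
samples = [
    {'nick': 'alice', 'rootcommentcontent': 'great song[em]e1[/em]!'},
    {'nick': 'bob', 'rootcommentcontent': '[em]e2[/em]first[em]e3[/em]'},
]

buf = io.StringIO()
for comment in samples:
    buf.write(format_comment(comment))
print(buf.getvalue(), end='')
```

The second sample shows why the non-greedy .*? matters: a greedy .* would match from the first [em] to the last [/em] and delete the word between the two tags as well.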

Result:

image.png

Full code: https://github.com/Liangjianghao/everyDay_spider.git (qqMusic_comments)