Scraping the pilipili Ranking with Python

1. Scraping the pilipili Ranking with Python

  1. Install requests and beautifulsoup4
  2. Create a Python file and import the libraries
  3. Use requests to fetch the HTML document
  4. Parse the HTML document with bs4
  5. Write the parsed results to a file

1.1 Install requests and beautifulsoup4

1.1.1 Install requests with PyCharm

[Figure 1: screenshot of installing requests in PyCharm]

1.1.2 Install beautifulsoup4

Install beautifulsoup4 the same way, using PyCharm.

[Figure 2: screenshot of installing beautifulsoup4 in PyCharm]
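After installing, it can be worth confirming that both packages are importable from the interpreter the project uses. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical and not part of either library. Note that beautifulsoup4 is imported under the name bs4.

```python
from importlib.util import find_spec


def missing_packages(import_names):
    """Return the names from import_names that cannot be imported."""
    return [name for name in import_names if find_spec(name) is None]


# beautifulsoup4 is installed under that name but imported as bs4
missing = missing_packages(["requests", "bs4"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("requests and bs4 are both available")
```

If a name is reported as missing, the package was probably installed into a different interpreter than the one PyCharm runs the script with.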

1.2 Create a Python file and import the libraries

Import requests and BeautifulSoup, fetch the document with requests.get(), and parse it with BeautifulSoup.

import requests
from bs4 import BeautifulSoup

url = "https://www.bilibili.com/v/popular/rank/all"
page = requests.get(url)
soup = BeautifulSoup(page.content, "html.parser")
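The snippet above does not check whether the request succeeded. As a hedged sketch, the fetch step could be wrapped in a small function that sets a timeout, sends a browser-like User-Agent, and raises on HTTP errors. The function name fetch_html, the User-Agent string, and the 10-second timeout are assumptions made for illustration; whether this site needs a custom User-Agent is not stated in the original.

```python
def fetch_html(session, url, timeout=10):
    """Fetch url and return the response body as text.

    session is expected to behave like requests.Session, i.e. to offer
    get(url, headers=..., timeout=...).
    """
    # Illustrative header value (an assumption, not taken from the original)
    headers = {"User-Agent": "Mozilla/5.0 (compatible; rank-scraper-example)"}
    response = session.get(url, headers=headers, timeout=timeout)
    # Raise an exception for 4xx/5xx responses instead of parsing an error page
    response.raise_for_status()
    return response.text
```

With the real library this would be called as fetch_html(requests.Session(), url), and the returned text passed to BeautifulSoup(html, "html.parser").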

1.3 Scrape the data and write it to a text file

import requests
from bs4 import BeautifulSoup

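# Ranking page to scrape
url = "https://www.bilibili.com/v/popular/rank/all"
page = requests.get(url)

# Parse the returned HTML with Python's built-in parser
soup = BeautifulSoup(page.content, "html.parser")

# Page title (not used further below)
title = soup.title.text

# One dict per ranked video
all_products = []

# Select every ranking entry (li elements with class rank-item)
products = soup.select("li.rank-item")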

for product in products:
    link_tag = product.select("div.info > a")[0]   # title link
    data_boxes = product.select("span.data-box")   # per-video stats

    rank = product.select("div.num")[0].text
    name = link_tag.text.strip()
    play = data_boxes[0].text.strip()
    comment = data_boxes[1].text.strip()
    up = data_boxes[2].text.strip()
    # Named link rather than url so the page URL above is not overwritten
    link = link_tag.attrs['href'].strip()

    all_products.append(
        {
            "视频排名": rank,     # video rank
            "视频名称": name,     # video title
            "播放量": play,       # play count
            "弹幕量": comment,    # danmaku (bullet comment) count
            "up主": up,           # uploader
            "视频链接": link      # video link
        }
    )

# utf-8-sig prepends a byte order mark (BOM) to the file
with open("bili.txt", "w", encoding="utf-8-sig") as f:
    for item in all_products:
        for k, v in item.items():
            f.write("{},{}\n".format(k, v))
        f.write("--------------------------\n")
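The script above writes one key,value pair per line. If the data is meant to be opened in a spreadsheet, a list of dicts such as all_products could instead be written as CSV with one row per video. This is a minimal sketch using the standard csv module; the function name write_rows_csv is hypothetical, and it assumes every dict has the same keys.

```python
import csv


def write_rows_csv(path, rows):
    """Write a list of dicts with identical keys to path as CSV.

    Returns the number of rows written. Nothing is written for an empty list.
    """
    if not rows:
        return 0
    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()    # column names taken from the first dict
        writer.writerows(rows)  # one line per dict
    return len(rows)
```

For the script above, the call would be write_rows_csv("bili.csv", all_products), where the file name bili.csv is only an example.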

1.4 Result

[Figure 3: screenshot of the run result]
