[Web Scraping] 01: Scraping Girl Pictures from Douyu

1. Preparation

  • URL: "https://www.douyu.com/g_yz"
  • Scraping target: the images shown on that page

2. Start scraping

  • Directory structure: images are saved under images/01.爬取斗鱼图片/ (relative to the working directory)
  • Code
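import os
import re

import requests

# Output directory used by the script; created on demand below
SAVE_DIR = 'images/01.爬取斗鱼图片/'


def get_stable_image(url):
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36"
    }
    # Request the page
    req = requests.get(url=url, headers=headers, timeout=10)
    # print(req.request.headers)  # debug: inspect the request headers
    # Response body
    html = req.content.decode()
    # print(html)  # debug: inspect the page content
    # Extract the image URLs with a regular expression
    reg = r'data-original="(.*?)" src='
    img_url_list = re.findall(reg, html)
    # print(img_url_list)  # debug: inspect the image URLs

    # Make sure the output directory exists before writing into it
    os.makedirs(SAVE_DIR, exist_ok=True)

    # Download the images
    count = 0
    for img_url in img_url_list:
        # Image name: last path segment of the URL plus ".jpg"
        img_name = img_url.split('/')[-1] + ".jpg"
        try:
            # The download is the step that can fail, so it is the one guarded
            img = requests.get(url=img_url, timeout=10)
            img.raise_for_status()
        except requests.RequestException as e:
            print(e)
            continue

        # Save the data
        with open(os.path.join(SAVE_DIR, img_name), 'wb') as f:
            f.write(img.content)
        count += 1
        print("Successfully scraped %d images" % count)


if __name__ == '__main__':
    get_stable_image("https://www.douyu.com/g_yz")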
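
The extraction step can be tried offline before hitting the site. Below is a minimal sketch that applies the same regular expression and the same file-naming rule to a small HTML string. The markup in SAMPLE_HTML and the helper names extract_image_urls and image_name are made up for illustration; the real page markup may look different.

```python
import re

# Same pattern as in the script: capture the value of data-original
# when it is immediately followed by a src attribute.
IMG_PATTERN = re.compile(r'data-original="(.*?)" src=')

# Hypothetical markup, invented for this sketch; the real page may differ.
SAMPLE_HTML = (
    '<img data-original="https://example.com/covers/a1b2.jpg" src="placeholder.png">'
    '<img data-original="https://example.com/covers/c3d4.png" src="placeholder.png">'
    '<img src="no-lazy-load.png">'
)


def extract_image_urls(html):
    """Return every data-original value matched by IMG_PATTERN, in page order."""
    return IMG_PATTERN.findall(html)


def image_name(img_url):
    """Build the file name the way the script does: last path segment plus '.jpg'."""
    return img_url.split('/')[-1] + ".jpg"


if __name__ == '__main__':
    for url in extract_image_urls(SAMPLE_HTML):
        print(url, '->', image_name(url))
```

Two things show up in this sketch: an img tag without data-original is skipped, and the naming rule always appends ".jpg", so a segment that already has an extension ends up as something like a1b2.jpg.jpg.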

Scraping succeeded
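
One way to confirm the result from code is to count the files in the output directory. This is only a sketch: count_saved_images is a hypothetical helper that is not part of the script, and the demo runs on a temporary directory so it works without the scraper.

```python
import os
import tempfile


def count_saved_images(directory, suffix=".jpg"):
    """Count the regular files in `directory` whose names end with `suffix`."""
    if not os.path.isdir(directory):
        return 0
    return sum(
        1
        for name in os.listdir(directory)
        if name.endswith(suffix) and os.path.isfile(os.path.join(directory, name))
    )


if __name__ == '__main__':
    # Demo on a throwaway directory: two .jpg files and one unrelated file.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a.jpg", "b.jpg", "notes.txt"):
            with open(os.path.join(tmp, name), "wb") as f:
                f.write(b"")
        print(count_saved_images(tmp))
```

After running the scraper, calling count_saved_images('images/01.爬取斗鱼图片/') should give the same number as the last progress message, as long as the directory was empty beforehand and no two images mapped to the same file name.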
