Scraping a Few Funny Memes

A small Python crawler

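Python, the "glue language", is very much on the rise and is used almost everywhere. Writing a crawler in it is far more convenient than in Java or Objective-C: a few lines of code are enough. In day-to-day development it is also handy for tasks such as code obfuscation.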

Target: 斗图网 (doutula.com)


The approach has just three steps:

  1. Determine the URL.
  2. Send the request.
  3. Extract the images and save them.
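import os
import re

import requests

# 1. Determine the URL
url = 'https://www.doutula.com/photo/list/'
header = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36'
}

# 2. Send the request and read the page HTML
page = requests.get(url, headers=header, timeout=10)
page.raise_for_status()  # stop early on an HTTP error status
response = page.text

# 3. Extract the image URLs and save each image
reg = r'data-original="(.*?)"'
image_urls = re.findall(reg, response)

# open() does not create missing directories, so make sure ./images exists
os.makedirs('images', exist_ok=True)

for image_url in image_urls:
    # Keep the original file name and extension; not every image is a .jpg
    image_name = image_url.split('/')[-1]
    print(image_name)

    image = requests.get(image_url, headers=header, timeout=10).content
    with open(os.path.join('images', image_name), 'wb') as file:
        file.write(image)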
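
The extraction and naming logic of step 3 can be tried without touching the network. The following is only a minimal sketch under stated assumptions: the HTML fragment is made up for illustration (the real page markup may differ), and the helper names `extract_image_urls` and `filename_from_url` are hypothetical and not part of the script above. It reuses the same `data-original` pattern and derives the file name from the URL path, so the real extension is kept and any query string is ignored.

```python
import posixpath
import re
from urllib.parse import urlparse

# Same pattern as in the script: capture the value of data-original
IMAGE_RE = re.compile(r'data-original="(.*?)"')


def extract_image_urls(html):
    """Return every data-original attribute value found in the HTML text."""
    return IMAGE_RE.findall(html)


def filename_from_url(image_url):
    """Return the last path segment of the URL, ignoring any query string."""
    path = urlparse(image_url).path
    return posixpath.basename(path)


# Made-up fragment for illustration only; example.com is a placeholder host
sample_html = (
    '<img class="lazy" data-original="https://example.com/a/cat.jpg" src="x.png">'
    '<img class="lazy" data-original="https://example.com/b/dog.gif?v=2" src="x.png">'
)

urls = extract_image_urls(sample_html)
names = [filename_from_url(u) for u in urls]
print(urls)
print(names)
```

Splitting the logic into small functions like this makes it possible to check the regular expression against saved HTML before sending any requests.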

The final result, the scraped images:

[Image: screenshot of the downloaded images]
