Python in Practice, Assignment 1-4: Fetching Data from a Dynamic Web Page

Task:

Collect the image links from the first 20 pages of https://knewone.com/discover?page= and download the images to the local disk.

Result:

Snip20170524_1.png

Code:

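from bs4 import BeautifulSoup
import requests
import urllib.request

# Local folder that receives the images; it must already exist.
folderPath = '/Users/FS/Desktop/test/'
# The task covers pages 1 to 20; range() excludes its upper bound, hence 21.
urls = ['https://knewone.com/discover?page={}'.format(i) for i in range(1, 21)]

imageUrls = []
for url in urls:
    print(url)
    wb_data = requests.get(url)
    soup = BeautifulSoup(wb_data.text, 'lxml')
    images = soup.select('#wrapper > div > section > div > div.hits_group-things.clearfix > article > header > a > img')
    for image in images:
        src = image.get('src')
        if not src:
            continue  # skip <img> tags that have no src attribute
        # Keep only the part of the address before '!' and append it in page order.
        imageUrls.append(src.split('!')[0])
        print(src)

for imageUrl in imageUrls:
    # The last 10 characters of the URL serve as the local file name.
    urllib.request.urlretrieve(imageUrl, folderPath + imageUrl[-10:])
    print('Done')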
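
One fragile spot in the script above is the file name: the last 10 characters of two different image URLs can be identical, in which case the later download silently overwrites the earlier one. Below is a minimal sketch of one possible alternative, not part of the original assignment. The helper names `strip_variant` and `local_path_for` are hypothetical, and the example URL is made up for illustration. It uses only the standard library and touches neither the network nor the disk.

```python
import os
from urllib.parse import urlsplit


def strip_variant(src):
    """Return the part of an image URL before the first '!' (same rule as the script)."""
    return src.split('!')[0]


def local_path_for(image_url, folder, index):
    """Build a local path from a zero-padded index plus the URL's file name."""
    # urlsplit separates the path from any query string, so '?x=1' never ends up in the name.
    name = os.path.basename(urlsplit(image_url).path) or 'image'
    # The index prefix keeps names distinct even when two URLs share a file name.
    return os.path.join(folder, '{:04d}_{}'.format(index, name))


if __name__ == '__main__':
    # Hypothetical URL, used only to illustrate the helpers.
    src = 'https://example.com/uploads/photo/abc123.jpg!middle'
    clean = strip_variant(src)
    print(clean)
    print(local_path_for(clean, 'downloads', 7))
```

In the download loop this could be used as `for i, imageUrl in enumerate(imageUrls): urllib.request.urlretrieve(imageUrl, local_path_for(imageUrl, folderPath, i))`, assuming the target folder has been created beforehand, for example with `os.makedirs(folderPath, exist_ok=True)`.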
