Python爬虫精进-第3/6关-爬取豆瓣TOP250电影v2.0保存到excel

本代码是在之前代码基础上补充的保存到excel部分代码,额外需要使用openpyxl模块
Pycharm运行通过

import requests, bs4, openpyxl
wb = openpyxl.Workbook()
sheet = wb.active
sheet.title='Movies'
sheet['A1']='序号'
sheet['B1']='电影名'
sheet['C1']='评分'
sheet['D1']='推荐语'
sheet['E1']='链接'

headers={'user-agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36'}
for x in range(10):
    url = 'https://movie.douban.com/top250?start=' + str(x*25) + '&filter='
    res = requests.get(url, headers=headers)
    bs = bs4.BeautifulSoup(res.text, 'html.parser')
    bs = bs.find('ol', class_="grid_view")
    for titles in bs.find_all('li'):
        num = titles.find('em',class_="").text
        title = titles.find('span', class_="title").text
        comment = titles.find('span',class_="rating_num").text
        url_movie = titles.find('a')['href']
        sheet.append([num, title, comment, url_movie])

        if titles.find('span',class_="inq") != None:
            tes = titles.find('span',class_="inq").text
            print(num + '.' + title + '——' + comment + '\n' + '推荐语:' + tes +'\n' + url_movie)
        else:
            print(num + '.' + title + '——' + comment + '\n' +'\n' + url_movie)
wb.save('Movies.xlsx')

你可能感兴趣的:(Python,#爬虫学习)