Introduction to Web Scraping

Reference blog: the "Introduction to Web Scraping" series


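Brief introduction:

1. Python libraries used:

requests: mainly used to send HTTP requests and fetch web pages

BeautifulSoup: mainly used to parse the content of the fetched pages

2. A simple example: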

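import requests
from bs4 import BeautifulSoup

# Fetch the page that lists upcoming movies in Chengdu
url = "https://movie.douban.com/cinema/later/chengdu/"
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early if the server returns an HTTP error status

# Decode the raw response bytes as UTF-8 and print the HTML
html = response.content.decode('utf-8')
print(html)

# Parse the HTML with the lxml parser (requires the lxml package to be installed)
soup = BeautifulSoup(html, 'lxml')

# Find the <div> whose id is "showing-soon"; find() returns None if it is absent
all_movie = soup.find('div', id="showing-soon")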
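The example above stops once the container div has been located. As a minimal sketch of the next step, the snippet below pulls a title and link out of each entry inside that container. The sample HTML, the "item" class name, and the helper extract_movies are assumptions made up for illustration; the real page layout may differ and should be checked in the browser's developer tools. It uses the built-in "html.parser" so that it runs without lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page structure may differ.
SAMPLE_HTML = """
<div id="showing-soon">
  <div class="item"><h3><a href="https://example.com/movie/1">Movie A</a></h3></div>
  <div class="item"><h3><a href="https://example.com/movie/2">Movie B</a></h3></div>
</div>
"""


def extract_movies(html):
    """Return a list of {'title', 'link'} dicts found under div#showing-soon."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", id="showing-soon")
    if container is None:
        # The container is missing, so there is nothing to extract
        return []
    movies = []
    for item in container.find_all("div", class_="item"):
        link = item.find("a")
        if link is None:
            # Skip entries that do not contain a link
            continue
        movies.append({
            "title": link.get_text(strip=True),
            "link": link.get("href"),
        })
    return movies


movies = extract_movies(SAMPLE_HTML)
for movie in movies:
    print(movie["title"], movie["link"])
```

Checking for None before calling find_all avoids an AttributeError when the page layout changes or the request returns an unexpected page.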

3. Data storage:

The scraped data can be saved to files such as CSV or TXT.
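One possible way to write scraped records to a CSV file is sketched below, using only the standard-library csv module. The rows, the column names "title" and "link", the file name movies.csv, and the helper save_to_csv are all placeholders chosen for illustration, not values from the original post.

```python
import csv

# Placeholder records standing in for scraped data
rows = [
    {"title": "Movie A", "link": "https://example.com/movie/1"},
    {"title": "Movie B", "link": "https://example.com/movie/2"},
]


def save_to_csv(rows, path):
    """Write a list of dicts to a CSV file with a header row."""
    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["title", "link"])
        writer.writeheader()
        writer.writerows(rows)


save_to_csv(rows, "movies.csv")
```

For a plain TXT file, the same open(..., "w", encoding="utf-8") call works with f.write() for each line instead of a csv writer.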
