Python Web Crawler

import urllib.request
from bs4 import BeautifulSoup

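url = 'https://baidu.com/'

# Download the page and decode the raw bytes as UTF-8
data = urllib.request.urlopen(url).read()
page_data = data.decode('utf-8')

# Parse the HTML with the lxml parser (requires the lxml package)
soup = BeautifulSoup(page_data, 'lxml')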

# Example 1: print the page's <title> element
title = soup.find('title')
print(title)

# Example 2: print the content of the <meta name="description"> tag
metas = soup.find_all('meta')
for meta in metas:
    if meta.get('name') == 'description':
        print(meta.get('content'))
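
For comparison, here is a minimal sketch of the same two extractions (the page title and the description meta tag) using only the standard library's html.parser, so it runs without installing BeautifulSoup or lxml and without network access. This is a swapped-in technique, not the BeautifulSoup approach used above. The names HeadInfoParser, extract_head_info and sample_html are hypothetical, and the sample HTML is invented for illustration.

```python
from html.parser import HTMLParser


class HeadInfoParser(HTMLParser):
    """Collects the <title> text and the <meta name="description"> content.

    Hypothetical helper class, written for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # attrs arrives as a list of (name, value) pairs
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") == "description":
            self.description = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it
        if self._in_title:
            self.title += data


def extract_head_info(html):
    """Return (title, description); description is None when absent."""
    parser = HeadInfoParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.description


# Invented sample document, standing in for a downloaded page
sample_html = """
<html><head>
<title>Example page</title>
<meta name="description" content="A short summary.">
<meta name="keywords" content="a, b">
</head><body><p>Hello</p></body></html>
"""

title, description = extract_head_info(sample_html)
print(title)
print(description)
```

To use it on a real page, pass the decoded page_data string from the script above to extract_head_info instead of sample_html.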
