BeautifulSoup does not work smoothly with Python 3 out of the box. After installing it, import it with: from bs4 import BeautifulSoup
1. First, download BeautifulSoup from:
https://www.crummy.com/software/BeautifulSoup/bs4/download/
Install it with: python setup.py install. If Python is installed on the C: drive, it is best to open the command prompt as administrator.
2. Start the interpreter by typing python,
then enter: from bs4 import BeautifulSoup
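An exception is raised. On Windows, the BeautifulSoup4 install reports:
'You are trying to run the Python 2 version of Beautiful Soup under Python 3. This will not work. You need to convert the code, either by installing it (`python setup.py install`) or by running 2to3 (`2to3 -w bs4`).'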
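To fix this, copy the bs4 folder from the extracted beautifulsoup4 directory (beautifulsoup4-4.6.0\bs4) and 2to3.py (D:\<Python install dir>\Tools\scripts\) into the Lib folder under the Python installation directory (D:\<Python install dir>\Lib).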
Then run the following command from that Lib folder:
python 2to3.py -w bs4
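After the conversion, it can help to confirm that the import really succeeds rather than only that the files exist. The following is a minimal sketch, not part of the original steps; the helper name try_import is hypothetical. It attempts the import and reports the exception type, which matters here because unconverted Python 2 source typically fails with a SyntaxError rather than an ImportError.

```python
import importlib


def try_import(module_name):
    """Try to import a module; return (ok, message).

    try_import is a hypothetical helper for illustration. It catches any
    Exception because unconverted Python 2 code raises SyntaxError on
    import, not ImportError.
    """
    try:
        importlib.import_module(module_name)
    except Exception as exc:
        return False, "{}: {}".format(type(exc).__name__, exc)
    return True, ""


if __name__ == "__main__":
    ok, message = try_import("bs4")
    if ok:
        print("bs4 imported successfully")
    else:
        print("bs4 import failed -> " + message)
```

If the check reports a SyntaxError, the bs4 folder on the import path is most likely still the unconverted Python 2 source.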
Usage example:
#!/usr/bin/env python
# coding: utf-8
# Fetch product price information from Yixun (yixun.com) by product ID.
# By Tsing
# Python 3 (urllib.request does not exist in Python 2.7)
import urllib.request as request

from bs4 import BeautifulSoup


def get_yixun(product_id):
    price_origin, price_sale = '0', '0'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6'
    }
    # url = 'http://item.yixun.com/item-' + product_id + '.html'
    url = 'http://baidu.com'  # placeholder page used for testing
    req = request.Request(url=url, headers=headers)
    html = request.urlopen(req).read().decode('utf-8')
    # print(html)
    soup = BeautifulSoup(html, 'lxml')  # the 'lxml' parser requires the lxml package
    print('soup')
    print(soup.prettify())
    print("class")
    print(soup.div)
    # title = request.unicode(soup.title.text.strip().strip(u'【价格_报价_图片_行情】-易迅网').replace(u'】','')).encode('utf-8').decode('utf-8')
    # print(title)
    try:
        soup_origin = soup.find("dl", {"class": "xbase_item xprice xprice_origin"})
        price_origin = soup_origin.find("span", {"class": "mod_price xprice_val"}).contents[1].text
        print('Original price: ' + price_origin)
    except (AttributeError, IndexError):
        # The expected tags are not on the page; skip the original price.
        pass
    try:
        soup_sale = soup.find('dl', {'class': 'xbase_item xprice'})
        # Take .text here as well, so a string (not a Tag) is concatenated below.
        price_sale = soup_sale.find("span", {"class": "mod_price xprice_val"}).contents[1].text
        print('Sale price: ' + price_sale)
    except (AttributeError, IndexError):
        # The expected tags are not on the page; skip the sale price.
        pass
    print(url)
    return None


if __name__ == '__main__':
    get_yixun('2189654')
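The find calls in the script can be tried without any network access by parsing a small inline HTML string. The sketch below is illustrative only: SAMPLE_HTML is hypothetical markup modeled on the class names used in the script, the real Yixun page structure may differ, and extract_price is a made-up helper name. It uses the built-in 'html.parser' so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modeled on the class names in the script above;
# the real page structure may differ.
SAMPLE_HTML = """
<dl class="xbase_item xprice xprice_origin">
  <dt>Price</dt>
  <dd><span class="mod_price xprice_val"><i>&yen;</i><span>199.00</span></span></dd>
</dl>
"""


def extract_price(html, dl_class):
    """Return the price text inside the <dl> with the given class, or None."""
    soup = BeautifulSoup(html, "html.parser")  # built-in parser, no lxml needed
    dl = soup.find("dl", {"class": dl_class})
    if dl is None:
        return None
    span = dl.find("span", {"class": "mod_price xprice_val"})
    if span is None or len(span.contents) < 2:
        return None
    # Assumes the second child is a tag holding the number, as in SAMPLE_HTML.
    return span.contents[1].get_text(strip=True)


if __name__ == "__main__":
    print(extract_price(SAMPLE_HTML, "xbase_item xprice xprice_origin"))
    print(extract_price(SAMPLE_HTML, "no_such_class"))
```

Checking each lookup for None explicitly, instead of relying on a broad try/except, makes it easier to see which tag was missing when the page layout changes.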