Python Crawler -- Setting Up a Scrapy Development Environment


Developing on Linux is strongly recommended; on Windows you are likely to run into many hard-to-explain problems.

Installing Scrapy on CentOS 7

First install libxml2, the C library that lxml depends on. libxml2 mainly provides the components used to parse XPath:

yum install libxml2 libxml2-devel
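As a quick sanity check after the yum step, the sketch below asks Python's standard `ctypes.util.find_library` whether a shared library named `xml2` is visible to the system loader. The helper name `find_shared_library` is made up for illustration, and a `None` result only means the loader could not locate the library, not necessarily that the package is missing.

```python
from ctypes.util import find_library


def find_shared_library(short_name):
    """Return the file name the system loader resolves for short_name, or None."""
    # find_library expects the name without the "lib" prefix or file extension,
    # so libxml2 is looked up as "xml2".
    return find_library(short_name)


if __name__ == "__main__":
    resolved = find_shared_library("xml2")
    if resolved is None:
        print("libxml2 not found; re-check the yum install step")
    else:
        print("libxml2 found as {0}".format(resolved))
```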

Then install Scrapy:

pip install scrapy

The output looks like this:

(scrapy_venv) [liuyuantao@localhost venv-repo]$ pip install scrapy
Collecting scrapy
  Downloading Scrapy-1.1.2-py2.py3-none-any.whl (295kB)
    100% |████████████████████████████████| 296kB 263kB/s 
Collecting cssselect>=0.9 (from scrapy)
  Downloading cssselect-0.9.2-py2.py3-none-any.whl
Collecting six>=1.5.2 (from scrapy)
  Downloading six-1.10.0-py2.py3-none-any.whl
Collecting Twisted>=10.0.0 (from scrapy)
  Downloading Twisted-16.4.1.tar.bz2 (3.0MB)
    100% |████████████████████████████████| 3.0MB 318kB/s 
Collecting queuelib (from scrapy)
  Downloading queuelib-1.4.2-py2.py3-none-any.whl
Collecting service-identity (from scrapy)
  Downloading service_identity-16.0.0-py2.py3-none-any.whl
Collecting parsel>=0.9.3 (from scrapy)
  Downloading parsel-1.0.3-py2.py3-none-any.whl
Collecting PyDispatcher>=2.0.5 (from scrapy)
  Downloading PyDispatcher-2.0.5.tar.gz
Collecting lxml (from scrapy)
  Downloading lxml-3.6.4-cp27-cp27mu-manylinux1_x86_64.whl (4.2MB)
    100% |████████████████████████████████| 4.2MB 148kB/s 
Collecting pyOpenSSL (from scrapy)
  Downloading pyOpenSSL-16.1.0-py2.py3-none-any.whl (43kB)
    100% |████████████████████████████████| 51kB 6.8MB/s 
Collecting w3lib>=1.14.2 (from scrapy)
  Downloading w3lib-1.15.0-py2.py3-none-any.whl
Collecting zope.interface>=3.6.0 (from Twisted>=10.0.0->scrapy)
  Downloading zope.interface-4.3.2.tar.gz (143kB)
    100% |████████████████████████████████| 143kB 99kB/s 
Collecting attrs (from service-identity->scrapy)
  Downloading attrs-16.2.0-py2.py3-none-any.whl
Collecting pyasn1-modules (from service-identity->scrapy)
  Downloading pyasn1_modules-0.0.8-py2.py3-none-any.whl
Collecting pyasn1 (from service-identity->scrapy)
  Downloading pyasn1-0.1.9-py2.py3-none-any.whl
Collecting cryptography>=1.3.4 (from pyOpenSSL->scrapy)
  Downloading cryptography-1.5.tar.gz (400kB)
    100% |████████████████████████████████| 409kB 294kB/s 
Requirement already satisfied (use --upgrade to upgrade): setuptools in ./scrapy_venv/lib/python2.7/site-packages (from zope.interface>=3.6.0->Twisted>=10.0.0->scrapy)
Collecting idna>=2.0 (from cryptography>=1.3.4->pyOpenSSL->scrapy)
  Downloading idna-2.1-py2.py3-none-any.whl (54kB)
    100% |████████████████████████████████| 61kB 2.4MB/s 
Collecting enum34 (from cryptography>=1.3.4->pyOpenSSL->scrapy)
  Downloading enum34-1.1.6-py2-none-any.whl
Collecting ipaddress (from cryptography>=1.3.4->pyOpenSSL->scrapy)
  Downloading ipaddress-1.0.17-py2-none-any.whl
Collecting cffi>=1.4.1 (from cryptography>=1.3.4->pyOpenSSL->scrapy)
  Downloading cffi-1.8.3-cp27-cp27mu-manylinux1_x86_64.whl (386kB)
    100% |████████████████████████████████| 389kB 255kB/s 
Collecting pycparser (from cffi>=1.4.1->cryptography>=1.3.4->pyOpenSSL->scrapy)
  Downloading pycparser-2.14.tar.gz (223kB)
    100% |████████████████████████████████| 225kB 1.8MB/s 
Building wheels for collected packages: Twisted, PyDispatcher, zope.interface, cryptography, pycparser
  Running setup.py bdist_wheel for Twisted ... done
  Stored in directory: /home/liuyuantao/.cache/pip/wheels/0e/53/62/e7b4cea7df9113fb2818b224eb5d143be981568d9c43057a0a
  Running setup.py bdist_wheel for PyDispatcher ... done
  Stored in directory: /home/liuyuantao/.cache/pip/wheels/86/02/a1/5857c77600a28813aaf0f66d4e4568f50c9f133277a4122411
  Running setup.py bdist_wheel for zope.interface ... done
  Stored in directory: /home/liuyuantao/.cache/pip/wheels/8c/57/fc/dd66620d3ad2b0e587710faee345ebfd6b75329ebb780df703
  Running setup.py bdist_wheel for cryptography ... done
  Stored in directory: /home/liuyuantao/.cache/pip/wheels/d4/98/43/a428a8aed7285f934d18efd787647455d7ef9a9dda81f22839
  Running setup.py bdist_wheel for pycparser ... done
  Stored in directory: /home/liuyuantao/.cache/pip/wheels/9b/f4/2e/d03e949a551719a1ffcb659f2c63d8444f4df12e994ce52112
Successfully built Twisted PyDispatcher zope.interface cryptography pycparser
Installing collected packages: cssselect, six, zope.interface, Twisted, queuelib, attrs, pyasn1, pyasn1-modules, idna, enum34, ipaddress, pycparser, cffi, cryptography, pyOpenSSL, service-identity, lxml, w3lib, parsel, PyDispatcher, scrapy
Successfully installed PyDispatcher-2.0.5 Twisted-16.4.1 attrs-16.2.0 cffi-1.8.3 cryptography-1.5 cssselect-0.9.2 enum34-1.1.6 idna-2.1 ipaddress-1.0.17 lxml-3.6.4 parsel-1.0.3 pyOpenSSL-16.1.0 pyasn1-0.1.9 pyasn1-modules-0.0.8 pycparser-2.14 queuelib-1.4.2 scrapy-1.1.2 service-identity-16.0.0 six-1.10.0 w3lib-1.15.0 zope.interface-4.3.2

Scrapy is now installed on Linux.
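To confirm that the packages pip reported are importable from the active virtualenv, something like the following sketch can be run. The function name `installed_versions` is hypothetical, and the script assumes each package exposes a `__version__` attribute, falling back to "unknown" when it does not.

```python
import importlib


def installed_versions(module_names):
    """Map each module name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        # Not every package exposes __version__, so fall back to "unknown".
        versions[name] = str(getattr(module, "__version__", "unknown"))
    return versions


if __name__ == "__main__":
    # These are import names, which can differ from the pip package names
    # (pyOpenSSL is imported as "OpenSSL", for example).
    report = installed_versions(["scrapy", "twisted", "lxml", "OpenSSL"])
    for name, version in sorted(report.items()):
        print("{0}: {1}".format(name, version or "NOT INSTALLED"))
```

Any entry printed as NOT INSTALLED points at the dependency to reinstall before going further.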

Installing Scrapy on Windows

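Installation on Windows usually fails because of lxml. lxml can be installed with pip, but it depends on several libraries that other systems ship by default and Windows does not, so it is safer to install lxml from the prebuilt package provided specifically for Windows.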

First install pywin32: download the installer from its download page and run it.

Next, install pyOpenSSL (the Python bindings for OpenSSL):

pip install pyOpenSSL

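Important: the lxml project provides prebuilt .whl packages, which you can find on the official lxml site or through the download link. After downloading, run the following, preferably from a cmd window opened as administrator: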

pip install lxml-3.6.4-cp27-cp27m-win_amd64.whl
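The wheel file name encodes which interpreter and platform it was built for, so picking the wrong file is a common cause of pip refusing to install it. The sketch below splits a name into the fields defined by the wheel naming convention (PEP 427); `parse_wheel_filename` and `current_cpython_tag` are illustrative helpers, not part of pip, and the second one assumes a CPython interpreter.

```python
import sys


def parse_wheel_filename(filename):
    """Split a wheel file name into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel file name: " + filename)
    # An optional build tag may sit between the version and the python tag,
    # so the last three fields are read from the end.
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python_tag": parts[-3],
        "abi_tag": parts[-2],
        "platform_tag": parts[-1],
    }


def current_cpython_tag():
    """Return the python tag of the running interpreter, assuming CPython."""
    return "cp{0}{1}".format(sys.version_info[0], sys.version_info[1])


if __name__ == "__main__":
    tags = parse_wheel_filename("lxml-3.6.4-cp27-cp27m-win_amd64.whl")
    print(tags)
    if tags["python_tag"] != current_cpython_tag():
        print("this wheel targets {0}, but the interpreter is {1}".format(
            tags["python_tag"], current_cpython_tag()))
```

For the file used here, `cp27` means CPython 2.7 and `win_amd64` means 64-bit Windows; a 32-bit Python would need the `win32` build instead.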

Finally, run:

pip install Scrapy

Python Crawler -- ImportError: no module named win32api

One way to fix this is to install the missing module:

pip install pypiwin32
