SpiderKeeper 2019 (1) Installation and Introduction
> python -V
Python 2.7.5
Since we want to migrate everything to Python 3, install and prepare Python 3 first.
Install pyenv from the latest source on GitHub
> git clone https://github.com/pyenv/pyenv.git ~/.pyenv
Add to the PATH
> vi ~/.bash_profile
PATH=$PATH:$HOME/.pyenv/bin
eval "$(pyenv init -)"
> . ~/.bash_profile
Check installation
> pyenv -v
pyenv 1.2.13-2-g0aeeb6f
Install a Python 3 and a Python 2 version
> pyenv install 3.6.0
> pyenv install 2.7.10
It takes quite some time, but once the installation finishes both versions are listed
> pyenv versions
* system (set by /home/carl/.pyenv/version)
2.7.10
3.6.0
Globally choose 3.6.0
> pyenv global 3.6.0
> python -V
Python 3.6.0
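If python -V still reports the system version after pyenv global, the usual cause is that the pyenv shims directory is not first on the PATH. A minimal sketch to check this, assuming the default pyenv root of ~/.pyenv (or PYENV_ROOT if set); the function name resolves_to_pyenv is my own, not part of pyenv:

```python
import os
import shutil


def resolves_to_pyenv(executable="python", pyenv_root=None, search_path=None):
    """Return True if `executable` is found inside the pyenv shims directory.

    pyenv_root defaults to $PYENV_ROOT, falling back to ~/.pyenv (assumption).
    search_path defaults to the current PATH.
    """
    root = pyenv_root or os.environ.get("PYENV_ROOT", os.path.expanduser("~/.pyenv"))
    found = shutil.which(executable, path=search_path)
    if found is None:
        # The command is not on the PATH at all.
        return False
    shims_dir = os.path.realpath(os.path.join(root, "shims"))
    return os.path.realpath(os.path.dirname(found)) == shims_dir


if __name__ == "__main__":
    print("python comes from pyenv:", resolves_to_pyenv("python"))
```

If this prints False, re-check the lines added to ~/.bash_profile and open a new shell.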
Check the available releases; the latest at the time of writing is 3.7.4, so install some newer versions
https://www.python.org/downloads/
> pyenv install 3.6.9
> pyenv install 3.7.4
The build fails with ModuleNotFoundError: No module named '_ctypes'
Solution:
https://github.com/pyenv/pyenv/issues/1183
> sudo yum install libffi-devel
> pyenv install 3.7.4
> pyenv global 3.7.4
> python -V
Python 3.7.4
> pip -V
pip 19.0.3 from /home/carl/.pyenv/versions/3.7.4/lib/python3.7/site-packages/pip (python 3.7)
Upgrade pip
> pip install --upgrade pip
> pip -V
pip 19.2.3 from /home/carl/.pyenv/versions/3.7.4/lib/python3.7/site-packages/pip (python 3.7)
Install SpiderKeeper
> pip install spiderkeeper
Then we are ready to start SpiderKeeper as follows:
> spiderkeeper --server=http://xxx.xx.xx.xx:6801 --server=http://xx.xx.xx.xx:7800 --port=5500 --username=username --password=password\!123
Error Message
ModuleNotFoundError: No module named 'pysqlite2'
Solution:
https://stackoverflow.com/questions/29770906/importerror-no-module-named-pysqlite2
https://github.com/jupyterhub/jupyterhub/issues/464
Install sqlite-devel, then rebuild Python 3.7.4 so that it is compiled with SQLite support, and try again
> sudo yum install sqlite-devel
> pyenv uninstall 3.7.4
> pyenv install 3.7.4
WARNING: The Python bz2 extension was not compiled. Missing the bzip2 lib?
WARNING: The Python readline extension was not compiled. Missing the GNU readline lib?
Then it works pretty well
> spiderkeeper --server=http://xxx.xx.xx.xx:6801 --server=http://xx.xx.xx.xx:7800 --port=5500 --username=username --password=password\!123
It is worth fixing the warnings as well.
https://stackoverflow.com/questions/12806122/missing-python-bz2-module
https://github.com/pyenv/pyenv/wiki/common-build-problems
> sudo yum install bzip2-devel
> sudo yum install readline-devel
Remove Python 3.7.4 from pyenv and reinstall it. The installed packages are removed with it, so pip and SpiderKeeper have to be installed again.
> pyenv uninstall 3.7.4
> pyenv install 3.7.4
> pip install --upgrade pip
> pip install spiderkeeper
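Every problem above came from a missing -devel package at build time, so after a rebuild it is quicker to test the affected modules directly than to wait for the next error. A small sketch, assuming the four modules below are the ones that matter here (they match the errors and warnings seen above); the helper name find_missing_modules is my own:

```python
import importlib

# Standard-library modules that are only built when the matching
# system -devel package is present (libffi, sqlite, bzip2, readline).
OPTIONAL_MODULES = ["ctypes", "sqlite3", "bz2", "readline"]


def find_missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError.
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_modules(OPTIONAL_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All optional modules are available.")
```

Save it to a file and run it with the pyenv-managed python; any module it reports means the matching -devel package was still missing when Python was compiled.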
Start the service
> spiderkeeper --server=http://xxx.xx.xx.xx:6801 --server=http://xx.xx.xx.xx:7800 --port=5500 --username=username --password=password\!123
Then I can visit the UI from here
http://centos-dev1:5500/project/manage
Then we can create a project, generate an egg file, and upload our spider to the platform.
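SpiderKeeper is only the UI and scheduler; the jobs run on the Scrapyd servers passed with --server. If the UI comes up but jobs do not start, it helps to check those servers first. A minimal sketch, assuming the servers expose Scrapyd's standard daemonstatus.json endpoint; the function names and the localhost URL are placeholders of mine:

```python
import json
from urllib.request import urlopen


def daemon_status_url(server):
    """Build the daemonstatus.json URL for a Scrapyd base URL."""
    return server.rstrip("/") + "/daemonstatus.json"


def scrapyd_is_up(server, timeout=5):
    """Return True if the Scrapyd server answers with status "ok"."""
    try:
        with urlopen(daemon_status_url(server), timeout=timeout) as response:
            payload = json.load(response)
    except (OSError, ValueError):
        # OSError covers connection and HTTP errors;
        # ValueError covers malformed URLs and invalid JSON.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


if __name__ == "__main__":
    # Placeholder address; use the same values given to --server.
    for server in ["http://localhost:6800"]:
        print(server, "up" if scrapyd_is_up(server) else "DOWN")
```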
Some other installation commands I may need
scrapy is needed for deploying spiders
> pip install scrapy
scrapyd is needed for running the final spiders
> pip install scrapyd
scrapyd-client is needed for deploying or generating the egg file
> pip install scrapyd-client
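scrapyd-client's scrapyd-deploy command reads its target from the [deploy] section of scrapy.cfg in the project root, and running scrapyd-deploy --build-egg output.egg there should produce an egg file to upload in the SpiderKeeper UI. scrapy startproject normally generates scrapy.cfg already, so the sketch below only illustrates the shape of the file. The project name myproject, the settings module, and the URL are placeholders, and render_scrapy_cfg is a helper of my own:

```python
import configparser
import io


def render_scrapy_cfg(project, settings_module, scrapyd_url):
    """Return the text of a minimal scrapy.cfg with a [deploy] target."""
    parser = configparser.ConfigParser()
    parser["settings"] = {"default": settings_module}
    parser["deploy"] = {"url": scrapyd_url, "project": project}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder values; replace them with the real project and server.
    print(render_scrapy_cfg("myproject", "myproject.settings", "http://localhost:6800/"))
```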
References:
https://github.com/pyenv/pyenv#installation
https://www.jianshu.com/p/88ddeac92a6d
https://stackoverflow.com/questions/33321312/cannot-switch-python-with-pyenv