Python-Scrapy problems encountered, error: FileNotFoundError: [Errno 2] No such file or directory: 'scrapy crawl xxx'

Problem 1: The project uses the Scrapy crawler framework together with a task-scheduling framework (APScheduler); during scheduling, the following error was reported:

Traceback (most recent call last):
 File "/usr/local/python3/lib/python3.6/site-packages/apscheduler/executors/base.py", line 125, in run_job
   retval = job.func(*job.args, **job.kwargs)
 File "/root/pyproject/douyin/douyin/main.py", line 20, in tick_challenge
   subprocess.Popen("scrapy crawl categoryVideoSpider")
 File "/usr/local/python3/lib/python3.6/subprocess.py", line 707, in __init__
   restore_signals, start_new_session)
 File "/usr/local/python3/lib/python3.6/subprocess.py", line 1326, in _execute_child
   raise child_exception_type(errno_num, err_msg)
FileNotFoundError: [Errno 2] No such file or directory: 'scrapy crawl xxxspider'


Cause: Scrapy was installed via pip, but no symlink to the scrapy executable was created, so the command cannot be found at execution time. (Note also that on POSIX, subprocess.Popen given a plain string without shell=True treats the entire string as the executable name, which is why the error message quotes the whole command; pass shell=True or split the command into an argument list.)

Solution: create a symlink:

ln -s /usr/local/python3/bin/scrapy /usr/bin/scrapy
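With the symlink in place, the scheduler can launch the spider via subprocess. Below is a minimal sketch of the two safe ways to invoke it, assuming scrapy is now on PATH and reusing the spider name categoryVideoSpider from the traceback above:

import shutil
import subprocess

# Optional sanity check: should resolve to /usr/bin/scrapy after the symlink is created
print(shutil.which("scrapy"))

# Option 1: pass an argument list (no shell involved)
subprocess.Popen(["scrapy", "crawl", "categoryVideoSpider"])

# Option 2: pass a single string, but let a shell parse it
subprocess.Popen("scrapy crawl categoryVideoSpider", shell=True)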


Problem 2: While running the spider, the following error appears:

ModuleNotFoundError: No module named '_sqlite3'
Cause: Python was compiled without the _sqlite3 extension module because SQLite and its headers were not installed when Python was built.

Solution: install SQLite and rebuild Python, following these steps:

Download sqlite3:

wget https://www.sqlite.org/2018/sqlite-autoconf-3230100.tar.gz
Extract the archive:

tar zxvf sqlite-autoconf-3230100.tar.gz 
In the extracted sqlite directory, run:

./configure --prefix=/usr/local/sqlite3

Build and install:

make && make install
Then rebuild Python. First, edit setup.py:

vim /root/Python-3.6.1/setup.py 


Find sqlite_inc_paths and add the entry shown below:

sqlite_inc_paths = [ '/usr/include',
                     '/usr/include/sqlite',
                     '/usr/include/sqlite3',
                     '/usr/local/include',
                     '/usr/local/include/sqlite',
                     '/usr/local/include/sqlite3',
                     '/usr/local/sqlite3/include', # add this entry
                     ]


Then go into the Python source directory and run:

make -j 4
make install 

Test with python3; if the import raises no error, the rebuild succeeded:

python3
>>> import sqlite3
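Beyond the bare import, a quick sanity check (a minimal sketch, nothing project-specific assumed) confirms the module can actually talk to the SQLite library:

import sqlite3

# Create an in-memory database and run a trivial query
conn = sqlite3.connect(":memory:")
cur = conn.execute("SELECT sqlite_version()")
print(cur.fetchone()[0])  # prints the linked SQLite library version
conn.close()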

Problem 3: Running scrapy crawl xxxSpider fails with an "Unknown command: crawl" error.

Cause: either (1) the project has no scrapy.cfg file, or (2) the file exists but the command was not run from the directory containing it. Scrapy identifies a project by looking for scrapy.cfg starting from the current working directory; without it, project commands such as crawl are unavailable.

Solutions:

For cause 1, create a file named scrapy.cfg in the project root directory with the following content:


# Automatically created by: scrapy startproject
#
# For more information about the [deploy] section see:
# https://scrapyd.readthedocs.io/en/latest/deploy.html

# xxx is your project name
[settings]
default = xxx.settings

[deploy]
#url = http://localhost:6800/
project = xxx


For cause 2, simply run scrapy crawl xxxSpider from the directory containing scrapy.cfg. However, when a task scheduler is used inside the project to kick off the crawl, you can set the working directory explicitly:

import os
import subprocess

# Use the directory of this file (where scrapy.cfg lives) as the working directory
app_path = os.path.dirname(os.path.realpath(__file__))
subprocess.Popen("scrapy crawl xxxSpider", shell=True, cwd=app_path)

This way the crawl is no longer restricted to being launched from the scrapy.cfg directory.
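Putting problems 1 and 3 together, the scheduled job from the traceback might look like the following. This is a minimal sketch assuming APScheduler's BlockingScheduler and an illustrative spider name xxxSpider; the interval is chosen purely for illustration:

import os
import subprocess

from apscheduler.schedulers.blocking import BlockingScheduler

app_path = os.path.dirname(os.path.realpath(__file__))

def tick_challenge():
    # shell=True lets the shell resolve the scrapy command;
    # cwd makes Scrapy find scrapy.cfg no matter where the scheduler was started
    subprocess.Popen("scrapy crawl xxxSpider", shell=True, cwd=app_path)

if __name__ == "__main__":
    scheduler = BlockingScheduler()
    scheduler.add_job(tick_challenge, "interval", hours=1)
    scheduler.start()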
 
