Running the scrapy bench command in the console produced the error below (the pywin32 library had already been installed):
G:\Workspaces\python_wrok\WorkMain>scrapy bench
2018-10-09 13:22:36 [scrapy.utils.log] INFO: Scrapy 1.5.0 started (bot: scrapybot)
2018-10-09 13:22:36 [scrapy.utils.log] INFO: Versions: lxml 4.1.1.0, libxml2 2.9.5, cssselect 1.0.3, parsel 1.3.1, w3lib 1.19.0, Twisted 17.9.0, Python 3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:57:36) [MSC v.1900 64 bit (AMD64)],
pyOpenSSL 17.5.0 (OpenSSL 1.1.0g 2 Nov 2017), cryptography 2.1.4, Platform Windows-10-10.0.17134-SP0
2018-10-09 13:22:38 [scrapy.crawler] INFO: Overridden settings: {'CLOSESPIDER_TIMEOUT': 10, 'LOGSTATS_INTERVAL': 1, 'LOG_LEVEL': 'INFO'}
2018-10-09 13:22:38 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.closespider.CloseSpider',
'scrapy.extensions.logstats.LogStats']
Unhandled error in Deferred:
2018-10-09 13:22:38 [twisted] CRITICAL: Unhandled error in Deferred:
2018-10-09 13:22:38 [twisted] CRITICAL:
Traceback (most recent call last):
File "d:\program files (x86)\python\python36\lib\site-packages\twisted\internet\defer.py", line 1386, in _inlineCallbacks
result = g.send(result)
File "d:\program files (x86)\python\python36\lib\site-packages\scrapy\crawler.py", line 80, in crawl
self.engine = self._create_engine()
File "d:\program files (x86)\python\python36\lib\site-packages\scrapy\crawler.py", line 105, in _create_engine
return ExecutionEngine(self, lambda _: self.stop())
File "d:\program files (x86)\python\python36\lib\site-packages\scrapy\core\engine.py", line 69, in __init__
self.downloader = downloader_cls(crawler)
File "d:\program files (x86)\python\python36\lib\site-packages\scrapy\core\downloader\__init__.py", line 88, in __init__
self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
File "d:\program files (x86)\python\python36\lib\site-packages\scrapy\middleware.py", line 58, in from_crawler
return cls.from_settings(crawler.settings, crawler)
File "d:\program files (x86)\python\python36\lib\site-packages\scrapy\middleware.py", line 34, in from_settings
mwcls = load_object(clspath)
File "d:\program files (x86)\python\python36\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
mod = import_module(module)
File "d:\program files (x86)\python\python36\lib\importlib\__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
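File "<frozen importlib._bootstrap>", line 978, in _gcd_import
File "<frozen importlib._bootstrap>", line 961, in _find_and_load
File "<frozen importlib._bootstrap>", line 950, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 678, in exec_module
File "<frozen importlib._bootstrap>", line 205, in _call_with_frames_removed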
File "d:\program files (x86)\python\python36\lib\site-packages\scrapy\downloadermiddlewares\retry.py", line 20, in <module>
from twisted.web.client import ResponseFailed
File "d:\program files (x86)\python\python36\lib\site-packages\twisted\web\client.py", line 42, in <module>
from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
File "d:\program files (x86)\python\python36\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
from twisted.internet.stdio import StandardIO, PipeAddress
File "d:\program files (x86)\python\python36\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
from twisted.internet import _win32stdio
File "d:\program files (x86)\python\python36\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
import win32api
ImportError: DLL load failed: The specified module could not be found.
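To check whether the failure lies in pywin32 itself rather than in Scrapy or Twisted, the import at the bottom of the traceback can be tried on its own. The following is a minimal sketch; the helper name check_import is hypothetical and not part of any of these libraries.

```python
import importlib


def check_import(module_name):
    """Try to import a module; return (ok, message) instead of raising."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # "DLL load failed" surfaces here as an ImportError.
        return False, f"{module_name}: {exc}"
    return True, f"{module_name}: OK"


if __name__ == "__main__":
    # win32api is the module that fails at the bottom of the traceback.
    ok, message = check_import("win32api")
    print(message)
```

If win32api fails here as well, the problem is independent of Scrapy and the fix belongs on the pywin32 side.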
References:
http://blog.csdn.net/mtt_sky/article/details/50445938
http://blog.sina.com.cn/s/blog_5a81b7990101l225.html
Find the folder where Python is installed and, under its Lib folder, locate site-packages\pywin32_system32:
D:\Program Files (x86)\Python\Python36\Lib\site-packages\pywin32_system32
Copy all of the files in that folder to C:\Windows\System32 (writing there requires administrator rights).
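The copy step can also be scripted. The sketch below is one possible way to do it, not an official pywin32 procedure. The function name copy_pywin32_dlls is hypothetical, and the script assumes that pywin32 lives in the site-packages of the interpreter running it and that it is started from an elevated (administrator) prompt.

```python
import shutil
import sysconfig
from pathlib import Path


def copy_pywin32_dlls(src_dir, dst_dir):
    """Copy every regular file in src_dir into dst_dir; return the copied names."""
    src, dst = Path(src_dir), Path(dst_dir)
    copied = []
    for item in sorted(src.iterdir()):
        if item.is_file():  # skip any subdirectories
            shutil.copy2(item, dst / item.name)
            copied.append(item.name)
    return copied


if __name__ == "__main__":
    # Assumption: pywin32 was installed into this interpreter's site-packages.
    src = Path(sysconfig.get_paths()["platlib"]) / "pywin32_system32"
    dst = Path(r"C:\Windows\System32")  # writing here needs an elevated prompt
    if src.is_dir() and dst.is_dir():
        print(copy_pywin32_dlls(src, dst))
    else:
        print("pywin32_system32 or System32 not found; nothing copied")
```

Keeping the copy logic in a function that takes both directories makes it easy to try against a scratch folder before touching System32.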
With that, the problem is solved. There is no need to reopen the command prompt window; just run scrapy bench again.
The output is now normal:
G:\Workspaces\python_wrok\WorkMain>scrapy bench
2018-10-09 13:29:25 [scrapy.utils.log] INFO: Scrapy 1.5.0 started (bot: scrapybot)
2018-10-09 13:29:25 [scrapy.utils.log] INFO: Versions: lxml 4.1.1.0, libxml2 2.9.5, cssselect 1.0.3, parsel 1.3.1, w3lib 1.19.0, Twisted 17.9.0, Python 3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:57:36) [MSC v.1900 64 bit (AMD64)],
pyOpenSSL 17.5.0 (OpenSSL 1.1.0g 2 Nov 2017), cryptography 2.1.4, Platform Windows-10-10.0.17134-SP0
2018-10-09 13:29:27 [scrapy.crawler] INFO: Overridden settings: {'CLOSESPIDER_TIMEOUT': 10, 'LOGSTATS_INTERVAL': 1, 'LOG_LEVEL': 'INFO'}
2018-10-09 13:29:27 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.closespider.CloseSpider',
'scrapy.extensions.logstats.LogStats']
2018-10-09 13:29:28 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2018-10-09 13:29:28 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2018-10-09 13:29:28 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2018-10-09 13:29:28 [scrapy.core.engine] INFO: Spider opened
2018-10-09 13:29:28 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2018-10-09 13:29:29 [scrapy.extensions.logstats] INFO: Crawled 45 pages (at 2700 pages/min), scraped 0 items (at 0 items/min)
2018-10-09 13:29:30 [scrapy.extensions.logstats] INFO: Crawled 101 pages (at 3360 pages/min), scraped 0 items (at 0 items/min)
2018-10-09 13:29:31 [scrapy.extensions.logstats] INFO: Crawled 157 pages (at 3360 pages/min), scraped 0 items (at 0 items/min)
2018-10-09 13:29:32 [scrapy.extensions.logstats] INFO: Crawled 205 pages (at 2880 pages/min), scraped 0 items (at 0 items/min)
2018-10-09 13:29:33 [scrapy.extensions.logstats] INFO: Crawled 245 pages (at 2400 pages/min), scraped 0 items (at 0 items/min)
2018-10-09 13:29:34 [scrapy.extensions.logstats] INFO: Crawled 293 pages (at 2880 pages/min), scraped 0 items (at 0 items/min)
2018-10-09 13:29:35 [scrapy.extensions.logstats] INFO: Crawled 333 pages (at 2400 pages/min), scraped 0 items (at 0 items/min)
2018-10-09 13:29:36 [scrapy.extensions.logstats] INFO: Crawled 365 pages (at 1920 pages/min), scraped 0 items (at 0 items/min)
2018-10-09 13:29:37 [scrapy.extensions.logstats] INFO: Crawled 397 pages (at 1920 pages/min), scraped 0 items (at 0 items/min)
2018-10-09 13:29:38 [scrapy.extensions.logstats] INFO: Crawled 429 pages (at 1920 pages/min), scraped 0 items (at 0 items/min)
2018-10-09 13:29:38 [scrapy.core.engine] INFO: Closing spider (closespider_timeout)
2018-10-09 13:29:39 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 182259,
'downloader/request_count': 445,
'downloader/request_method_count/GET': 445,
'downloader/response_bytes': 1206761,
'downloader/response_count': 445,
'downloader/response_status_count/200': 445,
'finish_reason': 'closespider_timeout',
'finish_time': datetime.datetime(2018, 10, 9, 5, 29, 39, 237614),
'log_count/INFO': 17,
'request_depth_max': 16,
'response_received_count': 445,
'scheduler/dequeued': 445,
'scheduler/dequeued/memory': 445,
'scheduler/enqueued': 8901,
'scheduler/enqueued/memory': 8901,
'start_time': datetime.datetime(2018, 10, 9, 5, 29, 28, 362686)}
2018-10-09 13:29:39 [scrapy.core.engine] INFO: Spider closed (closespider_timeout)
G:\Workspaces\python_wrok\WorkMain>
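As a rough sanity check on the stats dump above, the average download rate can be derived from response_received_count and the two timestamps. This is only an illustrative calculation; the helper name average_rate is made up for this example.

```python
from datetime import datetime

# Values copied from the "Dumping Scrapy stats" block above.
stats = {
    "response_received_count": 445,
    "start_time": datetime(2018, 10, 9, 5, 29, 28, 362686),
    "finish_time": datetime(2018, 10, 9, 5, 29, 39, 237614),
}


def average_rate(stats):
    """Return responses received per second over the whole run."""
    elapsed = (stats["finish_time"] - stats["start_time"]).total_seconds()
    return stats["response_received_count"] / elapsed


if __name__ == "__main__":
    rate = average_rate(stats)
    print(f"{rate:.1f} pages/s, about {rate * 60:.0f} pages/min")
```

The result should land in the same range as the per-second figures reported by the LogStats lines.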