A common headache with Python 3 is that its parse-time error messages are often imprecise. For someone used to Java, like me, this is really painful. Since I run into this kind of problem often, I'm recording one case here. The error was:
PS C:\Users\jiangcheng\Documents\Python Scripts> scrapy runspider mingyanSpider.py -o mingyan.json
2018-07-10 13:56:04 [scrapy.utils.log] INFO: Scrapy 1.5.0 started (bot: scrapybot)
2018-07-10 13:56:04 [scrapy.utils.log] INFO: Versions: lxml 4.2.1.0, libxml2 2.9.8, cssselect 1.0.3, parsel 1.4.0, w3lib 1.19.0, Twisted 17.5.0, Python 3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)], pyOpenSSL 18.0.0 (OpenSSL 1.0.2o 27 Mar 2018), cryptography 2.2.2, Platform Windows-10-10.0.17134-SP0
Traceback (most recent call last):
File "F:\Program\Anaconda3\Scripts\scrapy-script.py", line 10, in <module>
sys.exit(execute())
File "F:\Program\Anaconda3\lib\site-packages\scrapy\cmdline.py", line 150, in execute
_run_print_help(parser, _run_command, cmd, args, opts)
File "F:\Program\Anaconda3\lib\site-packages\scrapy\cmdline.py", line 90, in _run_print_help
func(*a, **kw)
File "F:\Program\Anaconda3\lib\site-packages\scrapy\cmdline.py", line 157, in _run_command
cmd.run(args, opts)
File "F:\Program\Anaconda3\lib\site-packages\scrapy\commands\runspider.py", line 80, in run
module = _import_file(filename)
File "F:\Program\Anaconda3\lib\site-packages\scrapy\commands\runspider.py", line 21, in _import_file
module = import_module(fname)
File "F:\Program\Anaconda3\lib\importlib\__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 994, in _gcd_import
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 674, in exec_module
File "<frozen importlib._bootstrap_external>", line 781, in get_code
File "<frozen importlib._bootstrap_external>", line 741, in source_to_code
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "C:\Users\jiangcheng\Documents\Python Scripts\mingyanSpider.py", line 11
'内容'：quote01.css('span.text::text').extract_first(),
^
SyntaxError: invalid character in identifier
Although the error complains about an invalid character in an identifier, the code looked perfectly fine no matter how many times I read it. Following the principle of narrowing the search down to a small area, I thought about what could go wrong near that line, and the full-width vs. half-width punctuation issue immediately came to mind: the Chinese full-width colon (：) and the ASCII colon (:) look practically identical in VS Code, which is why it's so easy to miss.
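Since full-width and half-width punctuation are nearly indistinguishable on screen, a small script can flag them automatically. The sketch below is my own addition, not part of the original fix; the character table is just a handful of common culprits, and the helper name find_fullwidth is made up for illustration:

```python
import sys

# Full-width / typographic characters that look like ASCII punctuation
# but make the Python parser fail. Extend this table as needed.
SUSPECTS = {
    '：': ':', '，': ',', '；': ';', '（': '(', '）': ')',
    '“': '"', '”': '"', '‘': "'", '’': "'",
}

def find_fullwidth(path):
    """Scan a UTF-8 source file and return a list of
    (line_no, col_no, char, ascii_equivalent) tuples."""
    hits = []
    with open(path, encoding='utf-8') as f:
        for line_no, line in enumerate(f, start=1):
            for col_no, ch in enumerate(line, start=1):
                if ch in SUSPECTS:
                    hits.append((line_no, col_no, ch, SUSPECTS[ch]))
    return hits

if __name__ == '__main__' and len(sys.argv) > 1:
    for line_no, col_no, ch, fix in find_fullwidth(sys.argv[1]):
        print(f'line {line_no}, col {col_no}: {ch!r} -> use {fix!r}')
```

Running it against mingyanSpider.py would have pointed straight at the full-width colon on line 11 instead of leaving me to stare at a cryptic SyntaxError.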
After the fix, the complete code looks like this:
import scrapy

class mingyanSpider(scrapy.Spider):
    name = "quotes"
    start_urls = [
        'http://lab.scrapyd.cn/',
    ]

    def parse(self, response):
        for quote01 in response.css('div.quote'):
            yield {
                '内容': quote01.css('span.text::text').extract_first(),
                '作者': quote01.xpath('span/small/text()').extract_first(),
            }
        next_page = response.css('li.next a::attr("href")').extract_first()
        if next_page is not None:
            yield scrapy.Request(next_page, self.parse)
Run it again right away: PS C:\Users\jiangcheng\Documents\Python Scripts> scrapy runspider mingyanSpider.py -o mingyan.json
This time it runs successfully.