scrapy shell options

1. --no-redirect

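Without this option, scrapy shell follows HTTP 3xx redirects automatically; with it, the redirect is not followed and you get the 3xx response itself. The option only affects the URL passed on the command line: once inside the shell, fetch(url) still follows redirects by default.
Run in a terminal: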

scrapy shell https://www.shiyanlou.com/user/310176

The result:

[s]   request    
[s]   response   <200 https://www.shiyanlou.com/teacher/310176>

Run in a terminal:

scrapy shell --no-redirect https://www.shiyanlou.com/user/310176

The result:

[s]   request    
[s]   response   <301 https://www.shiyanlou.com/user/310176>

2. -s

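This option overrides a Scrapy setting for the current run, in the form -s NAME=VALUE. The most commonly used setting is USER_AGENT: when the command comes back with a 403 response, the site is often rejecting Scrapy's default User-Agent, and passing a browser-like value may help.
Run in a terminal: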

scrapy shell https://movie.douban.com/subject/3011091/

The result:

[s]   request    
[s]   response   <403 https://movie.douban.com/subject/3011091/>

Run in a terminal:

scrapy shell -s USER_AGENT='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36' https://movie.douban.com/subject/3011091/

The result:

[s]   request    
[s]   response   <200 https://movie.douban.com/subject/3011091/>
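The User-Agent value contains spaces and parentheses, so it has to be quoted as a single shell argument, which is easy to get wrong when typing by hand. A minimal sketch, assuming you want to assemble the command from a Python script: build_scrapy_shell_args is a hypothetical helper (not part of Scrapy) that puts together the argument list for the two options covered here, and the standard library's shlex.join takes care of the quoting.

```python
import shlex


# Hypothetical helper, not part of Scrapy: it only assembles the
# command-line arguments for `scrapy shell`.
def build_scrapy_shell_args(url, no_redirect=False, settings=None):
    args = ["scrapy", "shell"]
    if no_redirect:
        # Do not follow HTTP 3xx redirects for the URL given on the command line.
        args.append("--no-redirect")
    for name, value in (settings or {}).items():
        # Each setting override is passed as: -s NAME=VALUE
        args.extend(["-s", f"{name}={value}"])
    args.append(url)
    return args


USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/71.0.3578.98 Safari/537.36"
)

args = build_scrapy_shell_args(
    "https://movie.douban.com/subject/3011091/",
    settings={"USER_AGENT": USER_AGENT},
)

# shlex.join quotes every argument so the printed line can be pasted
# into a POSIX shell as-is.
command_line = shlex.join(args)
print(command_line)
```

If the goal is to launch the shell from the script rather than print the command, the args list can be handed to subprocess.run(args) directly; no shell quoting is needed in that case.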

If you have already created a Scrapy project, you can instead set USER_AGENT in the project's settings.py file.
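A minimal sketch of what that settings.py entry could look like, assuming the same browser User-Agent string as in the command above; everything else in the file is omitted here.

```python
# settings.py (fragment of a Scrapy project's settings module)

# Default User-Agent header for requests made by this project.
# The string is an assumption for illustration; any browser-like value works.
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/71.0.3578.98 Safari/537.36"
)
```

With this in place, running scrapy shell from inside the project directory picks up the setting, so the -s USER_AGENT=... option is no longer needed on the command line.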
