urllib2.URLError: urlopen error [Errno 111] Connection refused

Notes on a problem I struggled with. The spider code below is runnable, and at first it worked fine inside my Ubuntu virtual machine. But after I changed something (I no longer remember what), neither this urllib2 spider nor my Scrapy spiders would run!

import urllib2
import re

class Spider:
    def __init__(self):
        self.page = 1
        self.switch = True

    def loadPage(self):
        print 'loadPage'
        url = "http://www.neihan8.com/article/list_5_" + str(self.page) + ".html"
        headers = {"User-Agent" : "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"}

        request = urllib2.Request(url, headers=headers)
        response = urllib2.urlopen(request)

        html = response.read()
        # The site serves GBK; re-encode to UTF-8 before matching
        gbk_html = html.decode('gbk').encode('utf-8')

        # Each joke is wrapped in a <div class="f18 mb20"> block
        pattern = re.compile('<div class="f18 mb20">(.*?)</div>', re.S)
        content_list = pattern.findall(gbk_html)
        self.dealPage(content_list)

    def dealPage(self, content_list):
        for item in content_list:
            # Strip the leftover HTML tags from each joke
            item = item.replace('<p>', '').replace('</p>', '').replace('<br />', '')
            self.writePage(item)

    def writePage(self, item):
        with open('duanzi.txt', 'a') as f:
            f.write(item)

    def startWork(self):
        while self.switch:
            self.loadPage()
            command = raw_input('please enter continue, q back')
            if command == 'q':
                self.switch = False
            self.page += 1
        print '3q use'

if __name__ == '__main__':
    s = Spider()
    s.startWork()

The spider's output:

[screenshot: spider output]

The error message in the terminal:

Traceback (most recent call last):
  File "01-neihan.py", line 44, in <module>
    s.startWork()
  File "01-neihan.py", line 34, in startWork
    self.loadPage()
  File "01-neihan.py", line 15, in loadPage
    response = urllib2.urlopen(request)
  File "/usr/lib/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 429, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 447, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 407, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1228, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1198, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 111] Connection refused>

Before this problem appeared, I had set up a proxy to get around the firewall; turning the proxy back off didn't help. At that point I couldn't tell whether it was an Ubuntu environment problem or a Python problem.
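Note that urllib2 picks up proxies from the http_proxy/https_proxy environment variables by default (unless you install an empty ProxyHandler in code), so a proxy that is "turned off" in the desktop settings can still survive in the shell environment or in system files. A quick way to check the common locations on Ubuntu (the IP and port in this post, 192.168.16.109:13128, are just this machine's values):

```shell
# Check every common place a proxy setting can hide on Ubuntu.
env | grep -i proxy || echo "no proxy variables in this shell"
grep -i proxy /etc/apt/apt.conf /etc/environment 2>/dev/null || true
```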

Diagnosis: the problem was located, and it was indeed the proxy.

Solution:

1. First check /etc/apt/apt.conf; mine contained:

http_proxy="http://192.168.16.109:13128/"
https_proxy="https://192.168.16.109:13128/"

Your contents may differ from mine. I deleted this file and rebooted, but that alone did not solve the problem.

2. Check the system-wide environment file with cat /etc/environment; it also contained:

http_proxy="http://192.168.16.109:13128/"
https_proxy="https://192.168.16.109:13128/"

Delete those proxy lines from the file. (Be careful NOT to delete the PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games" line, or Ubuntu will not be able to reach the desktop on the next boot.)
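Instead of hand-editing, sed can delete just the proxy lines while leaving PATH untouched. This is a sketch (GNU sed assumed) that works on a scratch copy mirroring the /etc/environment contents above; run the same sed with sudo against /etc/environment itself:

```shell
# Build a scratch file mirroring /etc/environment with the proxy lines present.
printf '%s\n' \
  'PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"' \
  'http_proxy="http://192.168.16.109:13128/"' \
  'https_proxy="https://192.168.16.109:13128/"' > /tmp/environment.test
sed -i '/_proxy=/Id' /tmp/environment.test   # delete only the *_proxy lines
cat /tmp/environment.test                    # only the PATH line remains
```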

3. Reboot, run the code again; problem solved.
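The reboot is needed because /etc/environment is only read at login. To test without a full reboot, you can also clear the variables in the current shell before re-running the spider:

```shell
# /etc/environment is read at login; clear the variables in this shell too.
unset http_proxy https_proxy HTTP_PROXY HTTPS_PROXY
env | grep -i '^http' || echo "proxy variables cleared"
```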

