Original URL: http://stackoverflow.com/questions/18478013/
Warning: this translation is provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow
Python: requests.exceptions.ConnectionError. Max retries exceeded with url
Asked by BigBoy1337
This is the script:
import requests
import json
import urlparse
from requests.adapters import HTTPAdapter

s = requests.Session()
s.mount('http://', HTTPAdapter(max_retries=1))

with open('proxies.txt') as proxies:
    for line in proxies:
        proxy = json.loads(line)
        with open('urls.txt') as urls:
            for line in urls:
                url = line.rstrip()
                data = requests.get(url, proxies=proxy)
                data1 = data.content
                print data1
                print {'http': line}
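Note that the script mounts HTTPAdapter(max_retries=1) on the session s but then issues the request with module-level requests.get, so the retry setting never takes effect. A minimal Python 3 sketch of routing the request through the session instead (same logic, wrapped in a function; the file names are the ones from the question):

```python
import json
import requests
from requests.adapters import HTTPAdapter

def fetch_all(proxies_path='proxies.txt', urls_path='urls.txt'):
    s = requests.Session()
    # Mounted on the session, so max_retries applies to s.get() calls below.
    s.mount('http://', HTTPAdapter(max_retries=1))
    results = []
    with open(proxies_path) as proxies:
        for line in proxies:
            proxy = json.loads(line)
            with open(urls_path) as urls:
                for url_line in urls:
                    url = url_line.rstrip()
                    # Use the session, not requests.get, or the adapter is ignored.
                    resp = s.get(url, proxies=proxy)
                    results.append(resp.content)
    return results
```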
As you can see, it's trying to access a list of URLs through a list of proxies. Here is the urls.txt file:
http://api.exip.org/?call=ip
Here is the proxies.txt file:
{"http":"http://107.17.92.18:8080"}
I got this proxy at www.hidemyass.com. Could it be a bad proxy? I have tried several, and this is the result. Note: if you are trying to replicate this, you may have to update the proxy to a recent one at hidemyass.com; they seem to stop working eventually. Here is the full error and traceback:
Traceback (most recent call last):
  File "test.py", line 17, in <module>
    data=requests.get(url, proxies=proxy)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 55, in get
    return request('get', url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 335, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 454, in send
    history = [resp for resp in gen] if allow_redirects else []
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 144, in resolve_redirects
    allow_redirects=False,
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 438, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 327, in send
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPConnectionPool(host=u'219.231.143.96', port=18186): Max retries exceeded with url: http://www.google.com/ (Caused by <class 'httplib.BadStatusLine'>: '')
Accepted answer by Eugene Loy
Looking at the stack trace you've provided, your error is caused by an httplib.BadStatusLine exception, which, according to the docs, is:
Raised if a server responds with an HTTP status code that we don't understand.
In other words, whatever the proxy server returns (if it returns anything at all) cannot be parsed by the httplib that performs the actual request.
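Since requests wraps such low-level failures in ConnectionError (a RequestException subclass), one way to cope with flaky proxies is to catch that and move on to the next one. A sketch under that assumption (fetch_via_proxies is a hypothetical helper, not part of the question's script):

```python
import requests

def fetch_via_proxies(url, proxy_list, timeout=10):
    """Try each proxy in turn and return the first successful body.

    Proxies whose replies can't be parsed (e.g. httplib.BadStatusLine,
    surfaced by requests as ConnectionError) are simply skipped.
    """
    for proxy in proxy_list:
        try:
            resp = requests.get(url, proxies=proxy, timeout=timeout)
            resp.raise_for_status()
            return resp.text
        except requests.exceptions.RequestException:
            continue  # dead or misbehaving proxy; try the next one
    return None  # every proxy failed
```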
From my experience (writing) HTTP proxies, I can say that some implementations may not follow the specs too strictly (the HTTP RFCs aren't easy reading, actually) or may use hacks to accommodate old browsers with flawed implementations.
So, answering this:
Could it be a bad proxy?
... I'd say that this is possible. The only real way to be sure is to see what the proxy server actually returns.
Try to debug it with a debugger, or grab a packet sniffer (something like Wireshark or Network Monitor) to analyze what happens on the network. Knowing what exactly the proxy server returns should give you the key to solving this issue.
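If installing a sniffer is inconvenient, the raw reply can also be captured from Python itself. A Python 3 sketch (raw_via_proxy is a hypothetical helper; it sends one bare GET through the proxy and returns whatever bytes come back, so a malformed status line can be inspected directly):

```python
import socket
from urllib.parse import urlsplit

def raw_via_proxy(proxy_host, proxy_port, url, timeout=5):
    """Send a single GET through an HTTP proxy and return the raw reply bytes."""
    host = urlsplit(url).hostname
    # Plain-HTTP proxies expect the absolute URL on the request line.
    request = (
        "GET {} HTTP/1.1\r\n"
        "Host: {}\r\n"
        "Connection: close\r\n\r\n"
    ).format(url, host)
    chunks = []
    with socket.create_connection((proxy_host, proxy_port), timeout=timeout) as sock:
        sock.sendall(request.encode("ascii"))
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

If the first line of the returned bytes isn't something like `HTTP/1.1 200 OK`, that is exactly what makes httplib raise BadStatusLine.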
Answered by Eugene Loy
Maybe you are overloading the proxy server by sending too many requests in a short period of time. You say that you got the proxy from a popular free proxy website, which means you're not the only one using that server, and it's often under heavy load.
If you add some delay between your requests, like this:
from time import sleep
[...]
data=requests.get(url, proxies=proxy)
data1=data.content
print data1
print {'http': line}
sleep(1)
(note the sleep(1), which pauses the execution of the code for one second)
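A transport-level alternative to sleeping by hand, sketched here under the assumption that your requests version accepts a urllib3 Retry object, is to let the adapter pace its own retries with exponential backoff:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry up to 3 times, with roughly doubling delays between attempts,
# and also retry on common "server overloaded" status codes.
retry = Retry(total=3, backoff_factor=1, status_forcelist=[502, 503, 504])
adapter = HTTPAdapter(max_retries=retry)

s = requests.Session()
s.mount('http://', adapter)
s.mount('https://', adapter)
```

This keeps the pacing logic out of the scraping loop, so the loop body stays a plain s.get(url, proxies=proxy).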
Does it work?
Answered by Ashu
def hello(self):
    self.s = requests.Session()
    self.s.headers.update({'User-Agent': self.user_agent})
    return True
Try this, it worked for me :)
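For context, self.user_agent in the snippet above is an attribute of the answerer's own class, not something requests provides. A standalone sketch of the same idea (the User-Agent string here is just an illustrative value):

```python
import requests

s = requests.Session()
# Some proxies and servers behave better when a browser-like
# User-Agent header is sent with every request on the session.
s.headers.update({'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64)'})
```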