Python requests: Max retries exceeded with URL

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/23013220/

Time: 2020-08-19 02:07:56  Source: igfitidea

Max retries exceeded with URL in requests

python python-requests

Asked by user3446000

I'm trying to get the content of App Store > Business:

import requests
from lxml import html

page = requests.get("https://itunes.apple.com/in/genre/ios-business/id6000?mt=8")
tree = html.fromstring(page.text)

flist = []
plist = []
for i in range(0, 100):
    app = tree.xpath("//div[@class='column first']/ul/li/a/@href")
    ap = app[0]
    page1 = requests.get(ap)

When I try range(0, 2) it works, but when I set the range in the 100s it shows this error:

Traceback (most recent call last):
  File "/home/preetham/Desktop/eg.py", line 17, in <module>
    page1 = requests.get(ap)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 55, in get
    return request('get', url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 383, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 486, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 378, in send
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='itunes.apple.com', port=443): Max retries exceeded with url: /in/app/adobe-reader/id469337564?mt=8 (Caused by <class 'socket.gaierror'>: [Errno -2] Name or service not known)

Answered by djra

What happened here is that the iTunes server refuses your connection (you're sending too many requests from the same IP address in a short period of time).

Max retries exceeded with url: /in/app/adobe-reader/id469337564?mt=8

The error trace is misleading; it should be something like "No connection could be made because the target machine actively refused it".

There is an issue about this in the python requests lib on GitHub, check it out here.

To overcome this issue (not so much an issue as a misleading debug trace), you should catch connection-related exceptions like so:

try:
    page1 = requests.get(ap)
except requests.exceptions.ConnectionError:
    # The request never completed, so there is no response object to inspect.
    page1 = None
    print("Connection refused")

Another way to overcome this problem is to leave a long enough time gap between the requests you send to the server. This can be achieved with the sleep(timeinsec) function in Python (don't forget to import sleep):

from time import sleep
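
As a minimal sketch of the pacing idea, the helper below waits a fixed interval between consecutive requests. The name fetch_politely, the delay_seconds default, and the injectable fetch and sleep_func parameters are assumptions made for illustration, not part of the original answer.

```python
from time import sleep


def fetch_politely(urls, fetch, delay_seconds=1.0, sleep_func=sleep):
    """Call fetch(url) for each URL, pausing between consecutive calls.

    fetch is any callable that takes a URL (for example requests.get).
    delay_seconds is an arbitrary illustrative default, not a known-safe rate.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first one.
            sleep_func(delay_seconds)
        results.append(fetch(url))
    return results
```

With requests installed, a call such as fetch_politely(app_links, requests.get, delay_seconds=2) would fetch each link two seconds apart, where app_links stands for whatever list of URLs you collected. The right delay depends on the server and has to be found by experiment.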

All in all, requests is an awesome Python lib; hope that solves your problem.

Answered by jatin

Just do this,

Paste the following code in place of page = requests.get(url):

import time

import requests

page = ''
while page == '':
    try:
        page = requests.get(url)
        break
    except requests.exceptions.ConnectionError:
        # Catch only connection failures; a bare except would also swallow Ctrl+C.
        print("Connection refused by the server..")
        print("Let me sleep for 5 seconds")
        print("ZZzzzz...")
        time.sleep(5)
        print("Was a nice sleep, now let me continue...")
        continue
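
The loop above keeps retrying forever if the server never comes back. A bounded variant might look like the sketch below; get_with_retries, max_attempts, wait_seconds, retry_on, and sleep_func are hypothetical names chosen for illustration, and the defaults are arbitrary.

```python
import time


def get_with_retries(fetch, url, max_attempts=5, wait_seconds=5,
                     retry_on=(Exception,), sleep_func=time.sleep):
    """Call fetch(url), retrying on the given errors up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(url)
        except retry_on:
            if attempt == max_attempts:
                # Out of attempts: let the caller see the last error.
                raise
            # Wait before the next attempt.
            sleep_func(wait_seconds)
```

With requests, this could be called as get_with_retries(requests.get, url, retry_on=(requests.exceptions.ConnectionError,)), so that only connection failures trigger a retry and other errors surface immediately.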

You're welcome :)

Answered by Akshar

pip install pyopenssl seemed to solve it for me.

https://github.com/requests/requests/issues/4246

Answered by Zulu

Just use requests' features:

import requests
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry


session = requests.Session()
retry = Retry(connect=3, backoff_factor=0.5)
adapter = HTTPAdapter(max_retries=retry)
session.mount('http://', adapter)
session.mount('https://', adapter)

session.get(url)

This will GET the URL and retry 3 times in case of requests.exceptions.ConnectionError. backoff_factor will help to apply delays between attempts, to avoid failing again in case of a periodic request quota.
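
To get a feel for what backoff_factor does, the sketch below computes an exponential delay schedule. It assumes the pause before retry number n is backoff_factor * 2 ** (n - 1); the real schedule is computed inside urllib3, may skip the pause before the first retry, is capped at a maximum, and differs between versions, so treat the numbers as illustrative only. The function name backoff_delays is hypothetical.

```python
def backoff_delays(backoff_factor, retries):
    """Return the assumed pause, in seconds, before each retry."""
    # Assumed formula: backoff_factor * 2 ** (retry_number - 1).
    return [backoff_factor * 2 ** (n - 1) for n in range(1, retries + 1)]


print(backoff_delays(0.5, 3))  # [0.5, 1.0, 2.0] under the assumed formula
```

The point is that each failed attempt roughly doubles the wait, which gives a rate-limited server more time to recover than a fixed sleep would.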

Take a look at requests.packages.urllib3.util.retry.Retry, it has many options to simplify retries.

Answered by Tanmoy Datta

It is always good to implement exception handling. It not only helps avoid an unexpected exit of the script but also helps with logging errors and info notifications. When using Python requests, I prefer to catch exceptions like this:

    try:
        res = requests.get(adress,timeout=30)
    except requests.ConnectionError as e:
        print("OOPS!! Connection Error. Make sure you are connected to Internet. Technical Details given below.\n")
        print(str(e))            
        renewIPadress()
        continue
    except requests.Timeout as e:
        print("OOPS!! Timeout Error")
        print(str(e))
        renewIPadress()
        continue
    except requests.RequestException as e:
        print("OOPS!! General Error")
        print(str(e))
        renewIPadress()
        continue
    except KeyboardInterrupt:
        print("Someone closed the program")

Here renewIPadress() is a user-defined function which can change the IP address if it gets blocked. You can go without this function.

Answered by Raj Stha

I got a similar problem, but the following code worked for me.

import requests

url = "<some REST url>"  # placeholder: substitute the endpoint you are calling
page = requests.get(url, verify=False)

"verify=False" disables SSL certificate verification. try/except can be added as usual.

Answered by Michael Yang

Add headers for this request.

headers = {
    'Referer': 'https://itunes.apple.com',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36'
}

requests.get(ap, headers=headers)

Answered by Saleh

When I was writing a Selenium browser test script, I encountered this error when calling driver.quit() before a JS API call was used. Remember that quitting the webdriver is the last thing to do!

Answered by Oded

Adding my own experience for those who are experiencing this in the future. My specific error was

Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'

It turns out that this was actually because I had reached the maximum number of open files on my system. It had nothing to do with failed connections, or even a DNS error as indicated.
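
One way to check whether a process is near that limit is to read the limit itself. The sketch below uses the standard-library resource module, which is available on Unix-like systems only (not on Windows); the helper name open_file_limits is an assumption for illustration, and this is not necessarily how the problem was diagnosed in the case above.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limits()
print("soft limit:", soft, "hard limit:", hard)
```

If a script opens many sessions or sockets without closing them, the soft limit is the number it will hit first. Closing responses and reusing a single requests.Session are the usual ways to stay under it.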

Answered by alex

I wasn't able to make it work on Windows even after installing pyopenssl and trying various Python versions (while it worked fine on Mac), so I switched to urllib, and it works on Python 3.6 (from python.org) and 3.7 (Anaconda).

import urllib 
from urllib.request import urlopen
html = urlopen("http://pythonscraping.com/pages/page1.html")
contents = html.read()
print(contents)