Original question: http://stackoverflow.com/questions/20475552/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
Python Requests library redirect new url
Asked by Daniel Pilch
I've been looking through the Python Requests documentation but I cannot see any functionality for what I am trying to achieve.
In my script I am setting allow_redirects=True.
If the page has been redirected to something else, I would like to know what the new URL is.
For example, if the start URL was: www.google.com/redirect
And the final URL is www.google.co.uk/redirected
How do I get that URL?
Answered by Back2Basics
The documentation has this blurb: https://requests.readthedocs.io/en/master/user/quickstart/#redirection-and-history
import requests
r = requests.get('http://www.github.com')
r.url
# returns https://www.github.com instead of the http page you asked for
Answered by Martijn Pieters
You are looking for the request history.
The response.history attribute is a list of the responses that led to the final URL, and the final URL itself can be found in response.url.
import requests

response = requests.get(someurl)  # someurl: the URL you want to check
if response.history:
    print("Request was redirected")
    # Each entry in history is an intermediate (redirect) response
    for resp in response.history:
        print(resp.status_code, resp.url)
    print("Final destination:")
    print(response.status_code, response.url)
else:
    print("Request was not redirected")
Demo:
>>> import requests
>>> response = requests.get('http://httpbin.org/redirect/3')
>>> response.history
[<Response [302]>, <Response [302]>, <Response [302]>]
>>> for resp in response.history:
...     print(resp.status_code, resp.url)
...
302 http://httpbin.org/redirect/3
302 http://httpbin.org/redirect/2
302 http://httpbin.org/redirect/1
>>> print(response.status_code, response.url)
200 http://httpbin.org/get
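As an illustration that is not part of the original answer, the same idea can be wrapped in a small helper. This is a minimal sketch: `redirect_chain` is a hypothetical name, and it only assumes the object it receives exposes `history`, `status_code` and `url` the way a Requests response does.

```python
def redirect_chain(response):
    """Return [(status_code, url), ...] for every hop, ending with the final response.

    `response` is assumed to look like a requests.Response: it has
    `.history` (the intermediate redirect responses), `.status_code` and `.url`.
    """
    hops = [(resp.status_code, resp.url) for resp in response.history]
    # The final response is not part of history, so append it last.
    hops.append((response.status_code, response.url))
    return hops


# Possible usage (needs the requests package and network access):
#   import requests
#   for status, url in redirect_chain(requests.get('http://httpbin.org/redirect/3')):
#       print(status, url)
```

If the returned list has a single entry, the request was not redirected.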
Answered by Geng Jiawen
Answered by hwjp
This is answering a slightly different question, but since I got stuck on this myself, I hope it might be useful for someone else.
If you want to use allow_redirects=False and get directly to the first redirect object, rather than following a chain of them, and you just want to read the redirect location out of the 302 response object, then r.url won't work. Instead, use the "Location" header:
r = requests.get('http://github.com/', allow_redirects=False)
r.status_code # 302
r.url # http://github.com, not https.
r.headers['Location'] # https://github.com/ -- the redirect destination
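One caveat worth sketching here (an addition, not part of the original answer): a Location header is allowed to be a relative reference such as /login, so it may need to be resolved against the URL that was requested. The helper below is a hypothetical illustration; `redirect_target` and `REDIRECT_STATUSES` are made-up names, and the response is only assumed to expose `status_code`, `url` and a dict-like `headers`.

```python
from urllib.parse import urljoin

# Status codes treated as redirects in this sketch (an assumption, adjust as needed).
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def redirect_target(response):
    """Return the absolute URL a redirect response points to, or None."""
    location = response.headers.get("Location")
    if response.status_code not in REDIRECT_STATUSES or not location:
        return None
    # Location may be relative (e.g. "/login"), so resolve it against the request URL.
    return urljoin(response.url, location)


# Possible usage (needs the requests package and network access):
#   import requests
#   r = requests.get('http://github.com/', allow_redirects=False)
#   print(redirect_target(r))
```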
Answered by Shuai.Z
For Python 3.5, you can use the following code:
import urllib.request
res = urllib.request.urlopen(starturl)
finalurl = res.geturl()
print(finalurl)
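A slightly more defensive variant of the same standard-library approach might look like the sketch below. The function name `final_url` and the default timeout of 10 seconds are assumptions made for illustration; the response is closed with a context manager once the URL has been read.

```python
import urllib.request


def final_url(start_url, timeout=10):
    """Follow redirects from start_url and return the URL that was finally served."""
    # urlopen follows HTTP redirects by default; geturl() reports where it ended up.
    with urllib.request.urlopen(start_url, timeout=timeout) as res:
        return res.geturl()
```

Calling `final_url(starturl)` would then return the same value as `finalurl` in the snippet above.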

