Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me), citing the original source and author information. Original StackOverflow question: http://stackoverflow.com/questions/20475552/

Date: 2020-08-18 20:33:17  Source: igfitidea

Python Requests library redirect new url

Tags: python, http, redirect, python-requests

Asked by Daniel Pilch

I've been looking through the Python Requests documentation but I cannot see any functionality for what I am trying to achieve.

In my script I am setting allow_redirects=True.

If the page has been redirected to something else, I would like to know what the new URL is.

For example, if the start URL was: www.google.com/redirect

And the final URL is www.google.co.uk/redirected

How do I get that URL?

Answered by Back2Basics

The documentation has this blurb: https://requests.readthedocs.io/en/master/user/quickstart/#redirection-and-history

import requests

r = requests.get('http://www.github.com')
r.url
# returns https://www.github.com instead of the http page you asked for

Answered by Martijn Pieters

You are looking for the request history.

The response.history attribute is a list of the responses that led to the final URL; the final URL itself can be found in response.url.

import requests

response = requests.get(someurl)  # someurl: the URL you want to check
if response.history:
    print("Request was redirected")
    for resp in response.history:
        print(resp.status_code, resp.url)
    print("Final destination:")
    print(response.status_code, response.url)
else:
    print("Request was not redirected")

Demo:

>>> import requests
>>> response = requests.get('http://httpbin.org/redirect/3')
>>> response.history
[<Response [302]>, <Response [302]>, <Response [302]>]
>>> for resp in response.history:
...     print(resp.status_code, resp.url)
...
302 http://httpbin.org/redirect/3
302 http://httpbin.org/redirect/2
302 http://httpbin.org/redirect/1
>>> print(response.status_code, response.url)
200 http://httpbin.org/get
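
The same loop can be wrapped in a small helper. This is only a sketch: the names `redirect_chain` and `was_redirected` are made up for illustration, and the code assumes nothing beyond the `history`, `status_code` and `url` attributes used in this answer, so any requests response object can be passed in.

```python
def redirect_chain(response):
    """Return (status_code, url) pairs for every hop, final response last.

    `response` is assumed to behave like a requests Response: it needs
    only the `history`, `status_code` and `url` attributes.
    """
    hops = [(resp.status_code, resp.url) for resp in response.history]
    hops.append((response.status_code, response.url))
    return hops


def was_redirected(response):
    """True if at least one redirect was followed to reach `response`."""
    return bool(response.history)
```

For example, `redirect_chain(requests.get('http://httpbin.org/redirect/3'))` should give the three 302 hops followed by the final 200 response, matching the demo.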

Answered by Geng Jiawen

I think requests.head is safer to call than requests.get when handling URL redirects (the original answer links to a GitHub issue about this):

r = requests.head(url, allow_redirects=True)
print(r.url)
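
A possible extension of this idea, sketched under assumptions: some servers reject HEAD requests (for instance with 405 Method Not Allowed), so the hypothetical helper `resolve_final_url` below falls back to a streamed GET in that case. The `session` argument is assumed to be a `requests.Session` (or anything with compatible `head` and `get` methods), and the fallback policy is an illustrative choice, not something this answer prescribes.

```python
def resolve_final_url(session, url, timeout=10):
    """Return the URL reached after following redirects.

    Tries HEAD first so no body is downloaded; if the server answers
    405 (Method Not Allowed), retries with a streamed GET instead.
    """
    response = session.head(url, allow_redirects=True, timeout=timeout)
    if response.status_code == 405:
        # stream=True defers the body download; close() releases the connection.
        response = session.get(url, allow_redirects=True, stream=True, timeout=timeout)
        response.close()
    return response.url
```

With requests installed, it could be called as `resolve_final_url(requests.Session(), url)`.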

Answered by hwjp

This is answering a slightly different question, but since I got stuck on this myself, I hope it might be useful for someone else.

If you use allow_redirects=False to stop at the first redirect response rather than following the whole chain, and you just want the redirect location from that 302 response object, then r.url won't work, because it still holds the URL you requested. Instead, read the "Location" header:

r = requests.get('http://github.com/', allow_redirects=False)
r.status_code  # 302
r.url  # http://github.com, not https.
r.headers['Location']  # https://github.com/ -- the redirect destination
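
One caveat worth sketching: HTTP allows the Location header to be a relative reference, so it may need to be resolved against the URL that was requested. The helper name `redirect_target` is hypothetical, and it works on a plain URL string plus a headers mapping rather than on a live response.

```python
from urllib.parse import urljoin


def redirect_target(request_url, headers):
    """Return the absolute redirect destination, or None if there is none.

    `headers` is any mapping holding the response headers, e.g. `r.headers`.
    """
    location = headers.get("Location")
    if location is None:
        return None
    # urljoin leaves absolute URLs untouched and resolves relative ones.
    return urljoin(request_url, location)
```

With the response above, `redirect_target(r.url, r.headers)` would return the same value as `r.headers['Location']` whenever the header is already an absolute URL. Note that a plain dict is case-sensitive, whereas the headers object in requests is not.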

Answered by Shuai.Z

For Python 3.5, you can use the following code:

import urllib.request
res = urllib.request.urlopen(starturl)
finalurl = res.geturl()
print(finalurl)
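
To try this without touching the network, here is a minimal sketch that starts a throwaway local server and resolves a redirect through it. Everything apart from `urlopen` and `geturl` is an assumption made for the demo: the names `RedirectHandler`, `final_url` and `demo`, and the `/start` and `/final` paths, are invented for illustration.

```python
import http.server
import threading
import urllib.request


class RedirectHandler(http.server.BaseHTTPRequestHandler):
    """Throwaway local server (hypothetical): /start redirects to /final."""

    def do_GET(self):
        if self.path == "/start":
            self.send_response(302)
            self.send_header("Location", "/final")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"done"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def final_url(start_url, opener=None):
    """Follow redirects with urllib and return the URL finally served."""
    open_url = opener.open if opener is not None else urllib.request.urlopen
    with open_url(start_url) as res:
        return res.geturl()


def demo():
    """Start the local server, resolve /start, and return (base, final URL)."""
    server = http.server.HTTPServer(("127.0.0.1", 0), RedirectHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # An empty ProxyHandler keeps the request to 127.0.0.1 off any configured proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        base = "http://127.0.0.1:%d" % server.server_port
        return base, final_url(base + "/start", opener)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns the server's base URL together with the resolved URL, which should end in `/final` because urllib follows the 302 and resolves the relative Location header itself.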