Python Requests library: getting the new URL after a redirect

Question

Asked by Daniel Pilch

I've been looking through the Python Requests documentation but I cannot see any functionality for what I am trying to achieve.

In my script I am setting allow_redirects=True.

If the page has been redirected to something else, I would like to know what the new URL is.

For example, if the start URL was: www.google.com/redirect

And the final URL is www.google.co.uk/redirected

How do I get that URL?

Answer 1

Answered by Back2Basics

The documentation has this blurb: https://requests.readthedocs.io/en/master/user/quickstart/#redirection-and-history

import requests

r = requests.get('http://www.github.com')
r.url
# returns https://www.github.com instead of the http page you asked for

Answer 2

Answered by Martijn Pieters

You are looking for the request history.

The response.history attribute is a list of the responses that led to the final URL, which can be found in response.url.

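import requests

# someurl is a placeholder for the URL you want to fetch
response = requests.get(someurl)
if response.history:
    print("Request was redirected")
    # each entry in history is an intermediate redirect response
    for resp in response.history:
        print(resp.status_code, resp.url)
    print("Final destination:")
    print(response.status_code, response.url)
else:
    print("Request was not redirected")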

Demo:

>>> import requests
>>> response = requests.get('http://httpbin.org/redirect/3')
>>> response.history
(<Response [302]>, <Response [302]>, <Response [302]>)
>>> for resp in response.history:
...     print(resp.status_code, resp.url)
... 
302 http://httpbin.org/redirect/3
302 http://httpbin.org/redirect/2
302 http://httpbin.org/redirect/1
>>> print(response.status_code, response.url)
200 http://httpbin.org/get
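The same idea can be wrapped in a small helper. This is only a sketch: the function name redirect_chain is made up for illustration, and the object below is a stand-in built with SimpleNamespace (not a real requests response) so that the example runs without network access. It relies only on the history, status_code and url attributes described above.

```python
from types import SimpleNamespace


def redirect_chain(response):
    """Return [(status_code, url), ...] for every hop, ending with the final response.

    Hypothetical helper: it only reads .history, .status_code and .url.
    """
    hops = [(resp.status_code, resp.url) for resp in response.history]
    hops.append((response.status_code, response.url))
    return hops


# Stand-in object that mimics the attributes of a redirected response,
# using the example URLs from the question (assumed values, not real output).
fake_response = SimpleNamespace(
    status_code=200,
    url="http://www.google.co.uk/redirected",
    history=[SimpleNamespace(status_code=302, url="http://www.google.com/redirect")],
)

for status, url in redirect_chain(fake_response):
    print(status, url)
```

With the real library, the helper would presumably be called as redirect_chain(requests.get(someurl)); the last tuple then holds the final URL.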

Answer 3

Answered by Geng Jiawen

I think requests.head, rather than requests.get, is safer to call when handling URL redirects. Check the GitHub issue here:

import requests

# url is a placeholder for the start URL; HEAD fetches headers only, no body
r = requests.head(url, allow_redirects=True)
print(r.url)

Answer 4

Answered by hwjp

This is answering a slightly different question, but since I got stuck on this myself, I hope it might be useful for someone else.

If you want to use allow_redirects=False and get directly to the first redirect object, rather than following a chain of them, and you just want to get the redirect location directly out of the 302 response object, then r.url won't work. Instead, it's the "Location" header:

import requests

r = requests.get('http://github.com/', allow_redirects=False)
r.status_code  # 302
r.url  # http://github.com, not https.
r.headers['Location']  # https://github.com/ -- the redirect destination
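One caveat worth sketching: a Location header is allowed to be a relative reference, so it may need to be resolved against the URL that was requested. The helper below is a hypothetical illustration (the name redirect_target and the example.com URLs are assumptions, not part of the original answer), and it is exercised with stand-in objects rather than real responses so it runs offline.

```python
from types import SimpleNamespace
from urllib.parse import urljoin


def redirect_target(response):
    """Return the absolute redirect destination, or None if there is no Location header.

    Hypothetical helper: reads only .url and .headers from the response.
    """
    location = response.headers.get("Location")
    if location is None:
        return None
    # urljoin leaves absolute URLs untouched and resolves relative ones
    return urljoin(response.url, location)


# Stand-ins for responses fetched with allow_redirects=False (assumed values).
absolute = SimpleNamespace(url="http://github.com/",
                           headers={"Location": "https://github.com/"})
relative = SimpleNamespace(url="http://example.com/a/b",
                           headers={"Location": "/login"})

print(redirect_target(absolute))
print(redirect_target(relative))
```

With a real response, redirect_target(r) would be called on the r obtained with allow_redirects=False.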

Answer 5

Answered by Shuai.Z

For Python 3.5, you can use the following code:

import urllib.request
res = urllib.request.urlopen(starturl)
finalurl = res.geturl()
print(finalurl)
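To try geturl() without depending on an external site, one option is a throwaway local server that issues a redirect. This is a sketch under assumptions: the handler class, the function name and the /redirect and /redirected paths are invented for the demonstration, and proxies are disabled so the request stays on localhost.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler: /redirect answers 302, anything else answers 200."""

    def do_GET(self):
        if self.path == "/redirect":
            self.send_response(302)
            self.send_header("Location", "/redirected")
            self.end_headers()
        else:
            body = b"done"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def final_url_via_local_server():
    """Start a local server, follow its redirect with urllib, return (start, final)."""
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)  # port 0: pick a free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        start_url = f"http://127.0.0.1:{port}/redirect"
        # An empty ProxyHandler ignores any proxy settings from the environment
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(start_url) as res:
            return start_url, res.geturl()
    finally:
        server.shutdown()
        server.server_close()


start, final = final_url_via_local_server()
print(start)
print(final)
```

The second printed URL should end in /redirected, which is the address urllib landed on after following the 302.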
