Python: How can I get the final redirect URL when using urllib2.urlopen?

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/3556266/

Time: 2020-08-18 11:42:27  Source: igfitidea

How can I get the final redirect URL when using urllib2.urlopen?

python urllib2

Asked by Mridang Agarwalla

I'm using the urllib2.urlopen method to open a URL and fetch the markup of a webpage. Some of these sites redirect me with 301/302 responses. I would like to know the final URL that I've been redirected to. How can I get it?

Accepted answer by user151019

Call the .geturl() method of the file-like object that urlopen returns. Per the urllib2 docs:

geturl(): return the URL of the resource retrieved, commonly used to determine if a redirect was followed

Example:

import urllib2

# urlopen follows 301/302 redirects automatically; geturl() reports where it ended up.
response = urllib2.urlopen('http://tinyurl.com/5b2su2')
response.geturl()  # 'http://stackoverflow.com/'
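The urllib2 module exists only in Python 2. In Python 3 the same functionality lives in urllib.request, and the response object still provides geturl(). Below is a minimal sketch of the same idea on Python 3. So that it runs without network access, it starts a throwaway local server whose /start path redirects to /final. The names RedirectHandler and final_url and both paths are made up for illustration and are not part of the original answer.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectHandler(BaseHTTPRequestHandler):
    """Illustrative handler: /start answers 302 -> /final, anything else answers 200."""

    def do_GET(self):
        if self.path == "/start":
            self.send_response(302)
            self.send_header("Location", "/final")
            self.end_headers()
        else:
            body = b"done"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def final_url(url):
    """Open url (redirects are followed automatically) and return the URL finally retrieved."""
    with urllib.request.urlopen(url) as response:
        return response.geturl()


# Port 0 asks the OS for any free port; the server runs in a background thread.
server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

start_url = base + "/start"
resolved = final_url(start_url)

server.shutdown()
server.server_close()
```

Comparing resolved with start_url tells you whether a redirect was followed. Against a real site you would only need the final_url function and the URL you want to resolve.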

Answer by Michael

The return value of urllib2.urlopen has a geturl() method, which should return the actual URL (i.e. the one reached after the last redirect).

Answer by Bengt

You can use HttpLib2 with follow_all_redirects = True and get the content-location from the response headers. See my answer to 'httplib is not getting all the redirect codes' for an example.

Answer by kevin

For example: urllib2.urlopen('ORIGINAL LINK').geturl()

urllib2.urlopen(urllib2.Request('ORIGINAL LINK')).geturl()