Get final URL after curl is redirected

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/3074288/
Asked by vise
I need to get the final URL after a page redirect, preferably with curl or wget.
For example, http://google.com may redirect to http://www.google.com.
The contents are easy to get (e.g. curl --max-redirs 10 http://google.com -L), but I'm only interested in the final URL (in the former case, http://www.google.com).
Is there any way of doing this by using only Linux built-in tools? (command line only)
Accepted answer by Daniel Stenberg
curl's -w option and the variable url_effective are what you are looking for.
Something like
curl -Ls -o /dev/null -w %{url_effective} http://google.com
More info:

-L         Follow redirects
-s         Silent mode. Don't output anything
-o FILE    Write output to <file> instead of stdout
-w FORMAT  What to output after completion
More
You might want to add -I (that is an uppercase i) as well, which will make the command not download any "body", but it then also uses the HEAD method, which is not what the question included and risks changing what the server does. Sometimes servers don't respond well to HEAD even when they respond fine to GET.
Answered by SpliFF
You could use grep. Doesn't wget tell you where it's redirecting to? Just grep that out.
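wget reports each hop's Location in its status output (on stderr), so text tools can pull the URL out. A rough sketch of that extraction; the sample line below stands in for real wget output, so no network is needed:

```shell
# Simulated wget status line (in real use, wget writes this to stderr)
sample='Location: http://www.google.com/ [following]'

# Extract just the URL between "Location: " and the trailing annotation
echo "$sample" | sed -n 's/^Location: \([^ ]*\).*/\1/p'
# prints http://www.google.com/
```

In real use you would capture wget's stderr (e.g. wget ... 2>&1 | sed -n ...) and take the last match.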
Answered by Gavin Mogan
I'm not sure how to do it with curl, but libwww-perl installs the GET alias.
$ GET -S -d -e http://google.com
GET http://google.com --> 301 Moved Permanently
GET http://www.google.com/ --> 302 Found
GET http://www.google.ca/ --> 200 OK
Cache-Control: private, max-age=0
Connection: close
Date: Sat, 19 Jun 2010 04:11:01 GMT
Server: gws
Content-Type: text/html; charset=ISO-8859-1
Expires: -1
Client-Date: Sat, 19 Jun 2010 04:11:01 GMT
Client-Peer: 74.125.155.105:80
Client-Response-Num: 1
Set-Cookie: PREF=ID=a1925ca9f8af11b9:TM=1276920661:LM=1276920661:S=ULFrHqOiFDDzDVFB; expires=Mon, 18-Jun-2012 04:11:01 GMT; path=/; domain=.google.ca
Title: Google
X-XSS-Protection: 1; mode=block
Answered by Gavin Mogan
As another option:
$ curl -i http://google.com
HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
Date: Sat, 19 Jun 2010 04:15:10 GMT
Expires: Mon, 19 Jul 2010 04:15:10 GMT
Cache-Control: public, max-age=2592000
Server: gws
Content-Length: 219
X-XSS-Protection: 1; mode=block
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>
But it doesn't go past the first one.
Answered by vise
Thank you. I ended up implementing your suggestions: curl -i + grep
curl -i http://google.com -L | egrep -A 10 '301 Moved Permanently|302 Found' | grep 'Location' | awk -F': ' '{print $2}' | tail -1
Returns blank if the website doesn't redirect, but that's good enough for me as it works on consecutive redirections.
Could be buggy, but at a glance it works ok.
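The tail -1 step is what makes chained redirects work: each hop prints its own Location header, and only the last one is the final URL. A minimal offline sketch of the same extraction, with canned headers standing in for curl -i -L output:

```shell
# Canned headers simulating two chained redirects (stand-in for `curl -i -L` output)
headers='HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
HTTP/1.1 302 Found
Location: http://www.google.de/
HTTP/1.1 200 OK'

# Keep only the last Location value, i.e. the final hop
echo "$headers" | awk -F': ' '/^Location/ {print $2}' | tail -1
# prints http://www.google.de/
```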
Answered by Jan Kori?ák
Thanks, that helped me. I made some improvements and wrapped that in a helper script "finalurl":
#!/bin/bash
curl -s -L -I -o /dev/null -w '%{url_effective}' "$1"
-o   output to /dev/null
-I   don't actually download, just discover the final URL
-s   silent mode, no progressbars
This made it possible to call the command from other scripts like this:
echo `finalurl http://someurl/`
Answered by Ceagle
You can usually do this with wget: wget --content-disposition "url". Additionally, if you add -O /dev/null you will not actually be saving the file.
wget -O /dev/null --content-disposition example.com
Answered by Mike Q
This would work:
curl -I somesite.com | perl -n -e '/^Location: (.*)$/ && print "$1\n"'
Answered by Geograph
The parameters -L (--location) and -I (--head) still do an unnecessary HEAD request to the location URL.
If you are sure that you will have no more than one redirect, it is better to disable following the location and use the curl variable %{redirect_url}.
This code does only one HEAD request to the specified URL and takes redirect_url from the Location header:
curl --head --silent --write-out "%{redirect_url}\n" --output /dev/null "https://goo.gl/QeJeQ4"
Speed test
all_videos_link.txt - 50 goo.gl and bit.ly links which redirect to YouTube
1. With follow location
time while read -r line; do
curl -kIsL -w "%{url_effective}\n" -o /dev/null "$line"
done < all_videos_link.txt
Results:
real 1m40.832s
user 0m9.266s
sys 0m15.375s
2. Without follow location
time while read -r line; do
curl -kIs -w "%{redirect_url}\n" -o /dev/null "$line"
done < all_videos_link.txt
Results:
real 0m51.037s
user 0m5.297s
sys 0m8.094s
Answered by lakshmikandan
Can you try this?
#!/bin/bash
LOCATION=`curl -I 'http://your-domain.com/url/redirect?r=something&a=values-VALUES_FILES&e=zip' | perl -n -e '/^Location: (.*)$/ && print "$1\n"'`
echo "$LOCATION"
Note: when you execute the command curl -I http://your-domain.com you have to use single quotes around the URL, as in curl -I 'http://your-domain.com'