Linux: Get the final URL after curl is redirected

Notice: this page is a Chinese/English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse or share it you must follow the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/3074288/

Date: 2020-08-03 20:08:55 · Source: igfitidea

Get final URL after curl is redirected

Tags: linux, redirect, curl, wget

Asked by vise

I need to get the final URL after a page redirect preferably with curl or wget.

For example, http://google.com may redirect to http://www.google.com.

The contents are easy to get (e.g. curl --max-redirs 10 http://google.com -L), but I'm only interested in the final URL (in the example above, http://www.google.com).

Is there any way of doing this by using only Linux built-in tools? (command line only)

Accepted answer by Daniel Stenberg

curl's -w option and the sub-variable url_effective are what you are looking for.

curl-w选项和子变量url_effective就是你要找的。

Something like

curl -Ls -o /dev/null -w %{url_effective} http://google.com

More info

-L         Follow redirects
-s         Silent mode. Don't output anything
-o FILE    Write output to <file> instead of stdout
-w FORMAT  What to output after completion
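As a small sketch (not part of the original answer), the command can be wrapped in a shell function so other scripts can capture the result; the function name final_url is made up here, and curl plus network access are assumed:

```shell
# Sketch: capture the final URL into a shell variable.
# Assumes curl is installed and the host is reachable.
final_url() {
  curl -Ls -o /dev/null -w '%{url_effective}' "$1"
}
```

It can then be used as, for example, final=$(final_url http://google.com).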

More

You might want to add -I (that is an uppercase i) as well, which makes the command skip downloading any "body". But it then uses the HEAD method, which is not what the question asked for, and it risks changing what the server does: some servers don't respond well to HEAD even when they respond fine to GET.

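If you run into such a server, one hedged workaround (a sketch, not part of the answer above) is to try HEAD first and fall back to a plain GET when it fails; -f/--fail makes curl exit non-zero on an HTTP error:

```shell
# Sketch: HEAD first, GET as a fallback for servers that reject HEAD.
# --fail (-f) turns HTTP 4xx/5xx responses into a non-zero exit code.
final_url_head_or_get() {
  curl -fsIL -o /dev/null -w '%{url_effective}' "$1" 2>/dev/null ||
    curl -fsL -o /dev/null -w '%{url_effective}' "$1"
}
```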
Answer by SpliFF

You could use grep. Doesn't wget tell you where it's redirecting to? Just grep that out.

Answer by Gavin Mogan

I'm not sure how to do it with curl, but libwww-perl installs the GET alias.

$ GET -S -d -e http://google.com
GET http://google.com --> 301 Moved Permanently
GET http://www.google.com/ --> 302 Found
GET http://www.google.ca/ --> 200 OK
Cache-Control: private, max-age=0
Connection: close
Date: Sat, 19 Jun 2010 04:11:01 GMT
Server: gws
Content-Type: text/html; charset=ISO-8859-1
Expires: -1
Client-Date: Sat, 19 Jun 2010 04:11:01 GMT
Client-Peer: 74.125.155.105:80
Client-Response-Num: 1
Set-Cookie: PREF=ID=a1925ca9f8af11b9:TM=1276920661:LM=1276920661:S=ULFrHqOiFDDzDVFB; expires=Mon, 18-Jun-2012 04:11:01 GMT; path=/; domain=.google.ca
Title: Google
X-XSS-Protection: 1; mode=block

Answer by Gavin Mogan

As another option:

$ curl -i http://google.com
HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
Date: Sat, 19 Jun 2010 04:15:10 GMT
Expires: Mon, 19 Jul 2010 04:15:10 GMT
Cache-Control: public, max-age=2592000
Server: gws
Content-Length: 219
X-XSS-Protection: 1; mode=block

<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>

But it doesn't go past the first one.

Answer by vise

Thank you. I ended up implementing your suggestions: curl -i + grep

curl -i http://google.com -L | egrep -A 10 '301 Moved Permanently|302 Found' | grep 'Location' | awk -F': ' '{print $2}' | tail -1

Returns blank if the website doesn't redirect, but that's good enough for me as it works on consecutive redirections.

Could be buggy, but at a glance it works ok.

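A slightly more tolerant variant of that pipeline (a sketch, run over made-up sample headers rather than a live request) matches the Location header case-insensitively and strips the CR from CRLF line endings:

```shell
# Sketch: pull the last Location header out of a captured header dump.
# $headers is a made-up sample standing in for `curl -isL <url>` output.
headers='HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
HTTP/1.1 200 OK'
printf '%s\n' "$headers" | sed -n 's/^[Ll]ocation: *//p' | tr -d '\r' | tail -1
# prints http://www.google.com/
```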
Answer by Jan Kori?ák

Thanks, that helped me. I made some improvements and wrapped that in a helper script "finalurl":

#!/bin/bash
curl -s -L -I -o /dev/null -w '%{url_effective}' "$1"
  • -o output to /dev/null
  • -I don't actually download, just discover the final URL
  • -s silent mode, no progress bars

This made it possible to call the command from other scripts like this:

echo `finalurl http://someurl/`
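Filled out as a function with a usage check, the helper might look like this; this is a sketch, and the "$1" argument handling is an assumption, since the snippet above doesn't show how the URL is passed in:

```shell
# finalurl as a shell function (sketch) so other scripts can source it.
# Assumes curl is installed; the URL comes in as the first argument.
finalurl() {
  [ -n "$1" ] || { echo "usage: finalurl URL" >&2; return 1; }
  curl -s -L -I -o /dev/null -w '%{url_effective}\n' "$1"
}
```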

Answer by Ceagle

You can usually do this with wget: wget --content-disposition "url". Additionally, if you add -O /dev/null you will not actually save the file.

wget -O /dev/null --content-disposition example.com

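To actually extract the final URL from wget, one option is to parse its log; this sketch assumes wget's English log format, where each hop is reported on stderr as "Location: <url> [following]":

```shell
# Sketch: print the last redirect target from a wget log read on stdin.
last_location() {
  sed -n 's/^Location: \(.*\) \[following\]$/\1/p' | tail -1
}

# Made-up sample standing in for `wget -O /dev/null <url> 2>&1` output:
last_location <<'EOF'
Location: http://www.google.com/ [following]
Location: http://www.google.ca/ [following]
EOF
# prints http://www.google.ca/
```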
Answer by Mike Q

This would work:

 curl -I somesite.com | perl -n -e '/^Location: (.*)$/ && print "$1\n"'

Answer by Geograph

The parameters -L (--location) and -I (--head) still issue an unnecessary HEAD request to the location URL.

If you are sure that you will have no more than one redirect, it is better to disable following the Location header and use the curl variable %{redirect_url}.

This code does only one HEAD request to the specified URL and takes redirect_url from the Location header:

curl --head --silent --write-out "%{redirect_url}\n" --output /dev/null "https://goo.gl/QeJeQ4"
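If a chain might be longer than one hop, %{redirect_url} can still be used by following the Location headers manually, one HEAD request per hop. This loop is a sketch; the 10-hop cap is an arbitrary guard playing the role of --max-redirs:

```shell
# Sketch: resolve a URL by following redirects one HEAD request at a time.
# An empty %{redirect_url} means the server stopped redirecting.
resolve() {
  url=$1
  for _ in 1 2 3 4 5 6 7 8 9 10; do
    next=$(curl -sI -o /dev/null -w '%{redirect_url}' "$url")
    [ -n "$next" ] || break
    url=$next
  done
  printf '%s\n' "$url"
}
```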


Speed test

all_videos_link.txt: 50 goo.gl and bit.ly links that redirect to YouTube

1. Following redirects (-L)

time while read -r line; do
    curl -kIsL -w "%{url_effective}\n" -o /dev/null  $line
done < all_videos_link.txt

Results:

real    1m40.832s
user    0m9.266s
sys     0m15.375s

2. Without following redirects

time while read -r line; do
    curl -kIs -w "%{redirect_url}\n" -o /dev/null  $line
done < all_videos_link.txt

Results:

real    0m51.037s
user    0m5.297s
sys     0m8.094s

Answer by lakshmikandan

Can you try with it?

#!/bin/bash 
LOCATION=`curl -I 'http://your-domain.com/url/redirect?r=something&a=values-VALUES_FILES&e=zip' | perl -n -e '/^Location: (.*)$/ && print "$1\n"'` 
echo "$LOCATION"

Note: when you execute the command, the URL has to be wrapped in single quotes, like curl -I 'http://your-domain.com/...', otherwise the shell will interpret the & and ? characters in the query string.
