How to retrieve the real redirect Location header with curl, without using %{redirect_url}?
Original question: http://stackoverflow.com/questions/46507336/
License: this content is provided under CC BY-SA 4.0. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by pancho
I realized that curl's %{redirect_url} does not always show the redirect URL exactly as the server sent it. For example, if the response header is Location: https:/\example.com, this will redirect to https:/\example.com, but curl's %{redirect_url} shows redirect_url: https://host-domain.com/https:/\example.com and does not display the real Location header from the response. (I would like to see the real Location: value.)
This is the Bash script I'm working with:
#!/bin/bash
# Usage: urls-checker.sh domains.txt
FILE="$1"
while read -r LINE; do
    # read the response to a variable
    response=$(curl -H 'Cache-Control: no-cache' -s -k --max-time 2 --write-out '%{http_code} %{size_header} %{redirect_url} ' "$LINE")
    # get the title
    title=$(sed -n 's/.*<title>\(.*\)<\/title>.*/\1/ip;T;q' <<<"$response")
    # read the write-out from the last line
    read -r http_code size_header redirect_url < <(tail -n 1 <<<"$response")
    printf "***Url: %s\n\n" "$LINE"
    printf "Status: %s\n\n" "$http_code"
    printf "Size: %s\n\n" "$size_header"
    printf "Redirect-url: %s\n\n" "$redirect_url"
    printf "Title: %s\n\n" "$title"
    # -c 100 only shows the first 100 bytes of the response
    printf "Body: %s\n\n" "$(head -c 100 <<<"$response")"
done < "${FILE}"
How can I make printf "Redirect-url: ..." print the original Location: header from the response, without having to use redirect_url?
Accepted answer by randomir
To read the exact Location header field value, as returned by the server, you can use the -i/--include option in combination with grep.
For example:
$ curl 'http://httpbin.org/redirect-to?url=http:/\example.com' -si | grep -oP 'Location: \K.*'
http:/\example.com
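The grep above assumes the header name is spelled exactly "Location" and it leaves the trailing carriage return of the header line in the output. A minimal sketch of a more tolerant variant follows, assuming GNU grep with -P support. The function name extract_location is hypothetical, and the sample headers are canned so the snippet runs without a network connection.

```shell
#!/bin/bash
# extract_location: hypothetical helper, not part of curl.
# Reads raw response headers on stdin and prints the value of the first
# Location header, matching the name case-insensitively and dropping the
# trailing carriage return that HTTP header lines end with.
extract_location() {
  grep -m1 -oiP '^location:[ \t]*\K[^\r]*'
}

# Canned headers instead of a live request, so the sketch runs offline.
sample_headers=$'HTTP/1.1 302 FOUND\r\nlocation: http:/\\example.com\r\nContent-Length: 0\r\n\r\n'
extract_location <<<"$sample_headers"
```

With a live request the same helper could be used as: curl -si "$url" | extract_location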
Or, if you want to read all headers, the content, and the --write-out variables line (as in your script):
response=$(curl -H 'Cache-Control: no-cache' -s -i -k --max-time 2 --write-out '%{http_code} %{size_header} %{redirect_url} ' "$url")
# break the response in parts
headers=$(sed -n '1,/^\r$/p' <<<"$response")
content=$(sed -e '1,/^\r$/d' -e '$d' <<<"$response")
read -r http_code size_header redirect_url < <(tail -n1 <<<"$response")
# get the real Location
location=$(grep -oP 'Location: \K.*' <<<"$headers")
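To see what each of these commands selects without making a request, the sketch below runs them on a canned response. It assumes GNU sed and GNU grep, and a body that ends in a newline so that the write-out values sit on their own last line. The sample response, including the header size, is made up for illustration. The grep pattern is tightened slightly to leave out the trailing carriage return.

```shell
#!/bin/bash
# Canned stand-in for what the curl command above might return when the body
# ends in a newline; every value here is made up for illustration.
response=$'HTTP/1.1 302 Found\r\nLocation: http:/\\example.com\r\nContent-Type: text/html\r\n\r\n<html><title>Moved</title></html>\n302 77 http://host-domain.com/http:/\\example.com '

# Headers: everything up to and including the first blank (CR-only) line.
headers=$(sed -n '1,/^\r$/p' <<<"$response")
# Content: everything after that blank line, minus the final write-out line.
content=$(sed -e '1,/^\r$/d' -e '$d' <<<"$response")
# Write-out values: the last line, split on whitespace.
read -r http_code size_header redirect_url < <(tail -n1 <<<"$response")
# Real Location value, with the trailing carriage return left out.
location=$(grep -oP 'Location: \K[^\r]*' <<<"$headers")

printf 'Status: %s\nLocation: %s\nRedirect-url: %s\nContent: %s\n' \
  "$http_code" "$location" "$redirect_url" "$content"
```

Here location holds the value exactly as sent, while redirect_url holds the URL curl resolved it to.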
Fully integrated in your script, this looks like:
#!/bin/bash
# Usage: urls-checker.sh domains.txt
file="$1"
while read -r url; do
    # read the response to a variable
    response=$(curl -H 'Cache-Control: no-cache' -s -i -k --max-time 2 --write-out '%{http_code} %{size_header} %{redirect_url} ' "$url")
    # break the response in parts
    headers=$(sed -n '1,/^\r$/p' <<<"$response")
    content=$(sed -e '1,/^\r$/d' -e '$d' <<<"$response")
    read -r http_code size_header redirect_url < <(tail -n1 <<<"$response")
    # get the real Location
    location=$(grep -oP 'Location: \K.*' <<<"$headers")
    # get the title
    title=$(sed -n 's/.*<title>\(.*\)<\/title>.*/\1/ip;T;q' <<<"$content")
    printf "***Url: %s\n\n" "$url"
    printf "Status: %s\n\n" "$http_code"
    printf "Size: %s\n\n" "$size_header"
    printf "Redirect-url: %s\n\n" "$location"
    printf "Title: %s\n\n" "$title"
    printf "Body: %s\n\n" "$(head -c 100 <<<"$content")"
done < "$file"
Answer by Salem
Following @randomir's answer, and since I only needed the raw redirect URL, I use this command in my batch script:
curl -w "%{redirect_url}" -o /dev/null -s "https://stackoverflow.com/q/46507336/3019002"
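Note that %{redirect_url} still prints the URL as curl resolved it, not the header value as sent. Newer curl releases also offer a %header{name} write-out variable that prints a response header's value directly. The sketch below assumes the installed curl supports it (7.84.0 or later, to the best of my knowledge; check your curl man page). The function name raw_location is hypothetical.

```shell
#!/bin/bash
# raw_location: hypothetical helper name.
# Prints the Location header value as the server sent it, using curl's
# %header{name} write-out variable. Assumption: the installed curl is new
# enough to support it (7.84.0 or later, to the best of my knowledge).
raw_location() {
  curl -s -o /dev/null -w '%header{location}\n' "$1"
}

# Only run a request when a URL is passed on the command line.
if [ "$#" -gt 0 ]; then
  raw_location "$1"
fi
```

On an older curl that lacks this variable, the -i plus grep approach from the accepted answer remains the fallback.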
Answer by Daniel Stenberg
https:/\example.com is not a legal URL(*). The fact that this works in browsers is an abomination (that I've fought against), and it doesn't work in curl. %{redirect_url} shows exactly the URL curl would redirect to...
A URL should use two forward slashes, so the above should look like http://example.com.
(*) = I refuse to accept the WHATWG "definition".
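To illustrate why %{redirect_url} comes out as https://host-domain.com/https:/\example.com: a Location value that does not start with a scheme followed by "://" is not an absolute URL, so it is resolved against the request URL as a relative reference. The sketch below is a heavily simplified model of that resolution, not curl's actual implementation. The function name resolve_redirect is hypothetical.

```shell
#!/bin/bash
# resolve_redirect BASE LOCATION: hypothetical, heavily simplified model of
# relative-reference resolution. It is NOT curl's implementation and ignores
# "..", "./", query strings, fragments and "//host" references.
resolve_redirect() {
  local base=$1 location=$2
  local abs_re='^[A-Za-z][A-Za-z0-9+.-]*://'
  local origin rest dir

  # Absolute URL: a scheme followed by "://" is used as-is.
  if [[ $location =~ $abs_re ]]; then
    printf '%s\n' "$location"
    return
  fi

  # Split the base into origin (scheme://host) and the path after it.
  origin=$(sed -E 's|^([A-Za-z][A-Za-z0-9+.-]*://[^/]*).*|\1|' <<<"$base")
  rest=${base#"$origin"}

  if [[ $location == /* ]]; then
    # Path-absolute reference: replace the whole path.
    printf '%s%s\n' "$origin" "$location"
  else
    # Relative path: replace everything after the last "/" of the base path.
    dir=${rest%/*}
    printf '%s%s/%s\n' "$origin" "$dir" "$location"
  fi
}

# prints https://host-domain.com/https:/\example.com
resolve_redirect 'https://host-domain.com/' 'https:/\example.com'
# prints https://example.com
resolve_redirect 'https://host-domain.com/' 'https://example.com'
```

The first call reproduces the value from the question because ":/\" is not "://", so the whole string is treated as a relative path.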