PHP: How can I get the destination URL using cURL?
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/1439040/
How can I get the destination URL using cURL?
Asked by ahmed
How can I get the destination URL using cURL when the HTTP status code is 302?
<?php
$url = "http://www.ecs.soton.ac.uk/news/";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$html = curl_exec($ch);
$status_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
// Compare with ==; a single = would assign 302 and always evaluate as true.
if ($status_code == 302 || $status_code == 301) {
    $url = "";
    // I want to get the destination URL
}
curl_close($ch);
?>
Answered by Tamik Soziev
You can use:
echo curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
Answered by Leksat
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_HEADER, TRUE); // We'll parse redirect url from header.
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, FALSE); // We want to just get redirect url but not to follow it.
$response = curl_exec($ch);
preg_match_all('/^Location:(.*)$/mi', $response, $matches);
curl_close($ch);
echo !empty($matches[1]) ? trim($matches[1][0]) : 'No redirect found';
Answered by Shawn
This answer is a bit late, but I wanted to show a full working example, since some of the solutions out there are only fragments:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url); //set url
curl_setopt($ch, CURLOPT_HEADER, true); //get header
curl_setopt($ch, CURLOPT_NOBODY, true); //do not include response body
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); //do not show in browser the response
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); //follow any redirects
curl_exec($ch);
$new_url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL); //extract the url from the header response
curl_close($ch);
This works with any redirects such as 301 or 302, however on 404's it will just return the original url requested (since it wasn't found). This can be used to update or remove links from your site. This was my need anyway.
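As a rough sketch of the link-maintenance use mentioned above, the decision could be expressed as a small function. The name link_action and its return values are hypothetical, not part of the original answer, and the URLs are illustrative values rather than real results:

```php
<?php
// Hypothetical helper (not from the original answer): decides what to do with
// a stored link, given the result of a redirect-following request.
function link_action(string $originalUrl, string $effectiveUrl, int $statusCode): string
{
    if ($statusCode === 404) {
        return 'remove';   // the target was not found
    }
    if ($effectiveUrl !== $originalUrl) {
        return 'update';   // redirects led somewhere else
    }
    return 'keep';         // nothing changed
}

// Illustrative values only; in practice they would come from
// curl_getinfo($ch, CURLINFO_EFFECTIVE_URL) and curl_getinfo($ch, CURLINFO_HTTP_CODE).
echo link_action('http://example.com/old', 'http://example.com/new', 200), "\n";   // update
echo link_action('http://example.com/gone', 'http://example.com/gone', 404), "\n"; // remove
```

Treating only 404 as a reason to remove a link is an assumption made here for brevity; other error codes may deserve the same treatment.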
Answered by raspi
You have to grab the Location header for the redirected URL.
Answered by Arthur
In response to user437797's comment on Tamik Soziev's answer (unfortunately, I do not have the reputation to comment there directly):
The CURLINFO_EFFECTIVE_URL works fine, but for it to do as op wants you also have to set CURLOPT_FOLLOWLOCATION to TRUE of course. This is because CURLINFO_EFFECTIVE_URL returns exactly what it says, the effective url that ends up getting loaded. If you don't follow redirects then this will be your requested url, if you do follow redirects then it will be the final url that is redirected to.
The nice thing about this approach is that it also works with multiple redirects, whereas when retrieving and parsing the HTTP header yourself you may have to do that multiple times before the final destination url is exposed.
Also note that the maximum number of redirects that curl follows can be controlled via CURLOPT_MAXREDIRS. libcurl has historically left this unlimited (-1) by default, although PHP's cURL extension sets its own default of 20. Setting an explicit limit is still worthwhile, because someone may (perhaps intentionally) have configured an endless redirect loop for some URL.
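A minimal sketch of this approach, combining CURLOPT_FOLLOWLOCATION, CURLOPT_MAXREDIRS, and CURLINFO_EFFECTIVE_URL. The function name resolve_final_url, the limit of 5 redirects, and the timeout values are assumptions chosen for illustration:

```php
<?php
// Hypothetical helper: follows redirects and returns the final URL,
// or null if the request fails (for example, when the redirect limit is exceeded).
function resolve_final_url(string $url, int $maxRedirects = 5): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,          // return the result instead of printing it
        CURLOPT_NOBODY         => true,          // headers only; the body is not needed
        CURLOPT_FOLLOWLOCATION => true,          // follow Location headers
        CURLOPT_MAXREDIRS      => $maxRedirects, // guard against redirect loops
        CURLOPT_CONNECTTIMEOUT => 5,             // assumed values, in seconds
        CURLOPT_TIMEOUT        => 10,
    ]);

    $result = curl_exec($ch);

    // curl_exec() returns false on failure, including too many redirects.
    if ($result === false) {
        return null;
    }

    // The handle is released when $ch goes out of scope.
    return curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
}
```

A caller would compare the return value with the URL it passed in: if they differ, at least one redirect was followed.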
Answered by echox
The new destination for a 302 redirect is located in the HTTP header field "Location". Example:
HTTP/1.1 302 Found
Date: Tue, 30 Jun 2002 1:20:30 GMT
Server: Apache
Location: http://www.foobar.com/foo/bar
Content-Type: text/html; charset=iso-8859-1
Just grep it with a regex.
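One way that regex step might look, as a sketch. The helper name extract_location is hypothetical, and the sample string simply reuses the header lines from the example above:

```php
<?php
// Hypothetical helper (not from the original answer): returns the value of the
// first Location header in a raw header block, or null if there is none.
function extract_location(string $rawHeaders): ?string
{
    // "m" lets ^ and $ match at line boundaries;
    // "i" is used because header names are case-insensitive.
    if (preg_match('/^Location:[ \t]*(.*)$/mi', $rawHeaders, $m)) {
        return trim($m[1]); // trim() also removes the trailing "\r"
    }
    return null;
}

$sample = "HTTP/1.1 302 Found\r\n"
        . "Server: Apache\r\n"
        . "Location: http://www.foobar.com/foo/bar\r\n"
        . "Content-Type: text/html; charset=iso-8859-1\r\n\r\n";

echo extract_location($sample), "\n"; // http://www.foobar.com/foo/bar
```

Note that a Location value may also be a relative reference, in which case it would still need to be resolved against the requested URL; this sketch does not do that.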
To include the HTTP headers in the result, set the curl option CURLOPT_HEADER:
curl_setopt($c, CURLOPT_HEADER, true);
If you simply want curl to follow the redirection use CURLOPT_FOLLOWLOCATION:
curl_setopt($c, CURLOPT_FOLLOWLOCATION, true);
Anyway, you shouldn't keep using the new URI for future requests, because HTTP status code 302 is only a temporary redirect.
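To make the temporary/permanent distinction concrete, here is a small sketch. The function name is hypothetical; the grouping reflects the HTTP status code definitions, where 301 and 308 are permanent and 302, 303, and 307 are not:

```php
<?php
// Hypothetical helper: true only for redirects that HTTP defines as permanent.
function is_permanent_redirect(int $statusCode): bool
{
    return in_array($statusCode, [301, 308], true);
}

var_dump(is_permanent_redirect(301)); // bool(true)
var_dump(is_permanent_redirect(302)); // bool(false)
```

A stored URL would then be replaced only when this returns true, and left alone for a 302.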
Answered by GZipp
Here's a way to get all headers returned by a curl http request, as well as the status code and an array of header lines for each header.
$url = 'http://google.com';
$opts = array(CURLOPT_URL => $url,
              CURLOPT_RETURNTRANSFER => true,
              CURLOPT_HEADER => true,
              CURLOPT_FOLLOWLOCATION => true);
$ch = curl_init();
curl_setopt_array($ch, $opts);
$return = curl_exec($ch);
curl_close($ch);

$headers = http_response_headers($return);
foreach ($headers as $header) {
    $str = http_status_code($header);
    $hdr_arr = http_response_header_lines($header);
    if (isset($hdr_arr['Location'])) {
        $str .= ' - Location: ' . $hdr_arr['Location'];
    }
    echo $str . '<br />';
}

// Split the raw response into header blocks, one per response in the redirect chain.
function http_response_headers($ret_str)
{
    $hdrs = array();
    $arr = explode("\r\n\r\n", $ret_str);
    foreach ($arr as $each) {
        // Keep only the chunks that start with a status line; the rest is body.
        if (substr($each, 0, 4) == 'HTTP') {
            $hdrs[] = $each;
        }
    }
    return $hdrs;
}

// Turn one header block into an array of name => value pairs.
function http_response_header_lines($hdr_str)
{
    $hdr_arr = array();
    $lines = explode("\n", $hdr_str);
    $hdr_arr['status_line'] = trim(array_shift($lines));
    foreach ($lines as $line) {
        // Skip anything that is not a "Name: value" line.
        if (strpos($line, ':') === false) {
            continue;
        }
        list($key, $val) = explode(':', $line, 2);
        $hdr_arr[trim($key)] = trim($val);
    }
    return $hdr_arr;
}

// Renamed from http_response_code(), which is a PHP built-in since 5.4
// and cannot be redeclared.
function http_status_code($str)
{
    return substr(trim(strstr($str, ' ')), 0, 3);
}
Answered by Sabeen Malik
Use curl_getinfo($ch), and the first element (url) would indicate the effective URL.
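A sketch of reading the destination from that array. The helper name destination_from_info is hypothetical. It relies on two keys of the array returned by curl_getinfo($ch): 'url' (the effective URL) and 'redirect_url' (filled when a redirect was received but not followed). The arrays below are hand-written stand-ins, not real responses:

```php
<?php
// Hypothetical helper: picks the destination out of a curl_getinfo() array.
function destination_from_info(array $info): string
{
    // 'redirect_url' is set when CURLOPT_FOLLOWLOCATION is off
    // and the response was a redirect.
    if (!empty($info['redirect_url'])) {
        return $info['redirect_url'];
    }
    // Otherwise 'url' already holds the effective (last requested) URL.
    return $info['url'];
}

// Illustrative stand-ins for curl_getinfo($ch) results.
$notFollowed = ['url' => 'http://example.com/old', 'http_code' => 302,
                'redirect_url' => 'http://example.com/new'];
$followed    = ['url' => 'http://example.com/new', 'http_code' => 200,
                'redirect_url' => ''];

echo destination_from_info($notFollowed), "\n"; // http://example.com/new
echo destination_from_info($followed), "\n";    // http://example.com/new
```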

