如何从 php curl 获取 cookie 到一个变量中

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/895786/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-25 00:15:12  来源:igfitidea点击:

how to get the cookies from a php curl into a variable

phpcookiescurl

提问by thirsty93

So some guy at some other company thought it would be awesome if instead of using soap or xml-rpc or rest or any other reasonable communication protocol he just embedded all of his response as cookies in the header.

因此,其他公司的一些人认为,如果他不使用soap 或xml-rpc 或rest 或任何其他合理的通信协议,而是将他的所有响应作为cookie 嵌入到标头中,那将会很棒。

I need to pull these cookies out as hopefully an array from this curl response. If I have to waste a bunch of my life writing a parser for this I will be very unhappy.

我需要从这个 curl 响应中取出这些 cookie 作为一个数组。如果我不得不为此浪费大量生命来编写解析器,我会非常不高兴。

Does anyone know how this can simply be done, preferably without writing anything to a file?

有谁知道如何简单地做到这一点,最好不要向文件写入任何内容?

I will be very grateful if anyone can help me out with this.

如果有人能帮我解决这个问题,我将不胜感激。

回答by TML

$ch = curl_init('http://www.google.com/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// get headers too with this line
curl_setopt($ch, CURLOPT_HEADER, 1);
$result = curl_exec($ch);
// get cookie
// multi-cookie variant contributed by @Combuster in comments
preg_match_all('/^Set-Cookie:\s*([^;]*)/mi', $result, $matches);
$cookies = array();
foreach($matches[1] as $item) {
    parse_str($item, $cookie);
    $cookies = array_merge($cookies, $cookie);
}
var_dump($cookies);

回答by Googol

Although this question is quite old, and the accepted response is valid, I find it a bit unconfortable because the content of the HTTP response (HTML, XML, JSON, binary or whatever) becomes mixed with the headers.

虽然这个问题已经很老了,并且接受的响应是有效的,但我发现它有点不舒服,因为 HTTP 响应的内容(HTML、XML、JSON、二进制或其他)与标头混合在一起。

I've found a different alternative. CURL provides an option (CURLOPT_HEADERFUNCTION) to set a callback that will be called for each response header line. The function will receive the curl object and a string with the header line.

我找到了一个不同的选择。CURL 提供了一个选项 ( CURLOPT_HEADERFUNCTION) 来设置将为每个响应标题行调用的回调。该函数将接收 curl 对象和带有标题行的字符串。

You can use a code like this (adapted from TML response):

您可以使用这样的代码(改编自 TML 响应):

$cookies = Array();
$ch = curl_init('http://www.google.com/');
// Ask for the callback.
curl_setopt($ch, CURLOPT_HEADERFUNCTION, "curlResponseHeaderCallback");
$result = curl_exec($ch);
var_dump($cookies);

function curlResponseHeaderCallback($ch, $headerLine) {
    global $cookies;
    if (preg_match('/^Set-Cookie:\s*([^;]*)/mi', $headerLine, $cookie) == 1)
        $cookies[] = $cookie;
    return strlen($headerLine); // Needed by curl
}

This solution has the drawback of using a global variable, but I guess this is not an issue for short scripts. And you can always use static methods and attributes if curl is being wrapped into a class.

这个解决方案有使用全局变量的缺点,但我想这对于短脚本来说不是问题。如果 curl 被包装到一个类中,你总是可以使用静态方法和属性。

回答by Alex P

This does it without regexps, but requires the PECL HTTP extension.

这不需要正则表达式,但需要PECL HTTP 扩展

curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
$result = curl_exec($ch);
curl_close($ch);

$headers = http_parse_headers($result);
$cookobjs = Array();
foreach($headers AS $k => $v){
    if (strtolower($k)=="set-cookie"){
        foreach($v AS $k2 => $v2){
            $cookobjs[] = http_parse_cookie($v2);
        }
    }
}

$cookies = Array();
foreach($cookobjs AS $row){
    $cookies[] = $row->cookies;
}

$tmp = Array();
// sort k=>v format
foreach($cookies AS $v){
    foreach ($v  AS $k1 => $v1){
        $tmp[$k1]=$v1;
    }
}

$cookies = $tmp;
print_r($cookies);

回答by SoapBox

If you use CURLOPT_COOKIE_FILE and CURLOPT_COOKIE_JAR curl will read/write the cookies from/to a file. You can, after curl is done with it, read and/or modify it however you want.

如果您使用 CURLOPT_COOKIE_FILE 和 CURLOPT_COOKIE_JAR curl 将从/向文件读取/写入cookie。在 curl 完成后,您可以根据需要阅读和/或修改它。

回答by Daniel Stenberg

libcurl also provides CURLOPT_COOKIELIST which extracts all known cookies. All you need is to make sure the PHP/CURL binding can use it.

libcurl 还提供了 CURLOPT_COOKIELIST 来提取所有已知的 cookie。您只需要确保 PHP/CURL 绑定可以使用它。

回答by hanshenrik

someone here may find it useful. hhb_curl_exec2 works pretty much like curl_exec, but arg3 is an array which will be populated with the returned http headers (numeric index), and arg4 is an array which will be populated with the returned cookies ($cookies["expires"]=>"Fri, 06-May-2016 05:58:51 GMT"), and arg5 will be populated with... info about the raw request made by curl.

这里有人可能会发现它很有用。hhb_curl_exec2 的工作方式与 curl_exec 非常相似,但 arg3 是一个数组,它将填充返回的 http 标头(数字索引),而 arg4 是一个数组,它将填充返回的 cookie ($cookies["expires"]=>" Fri, 06-May-2016 05:58:51 GMT"),并且 arg5 将填充...有关 curl 发出的原始请求的信息。

the downside is that it requires CURLOPT_RETURNTRANSFER to be on, else it error out, and that it will overwrite CURLOPT_STDERR andCURLOPT_VERBOSE, if you were already using them for something else.. (i might fix this later)

缺点是它需要 CURLOPT_RETURNTRANSFER 打开,否则它会出错,并且它会覆盖 CURLOPT_STDERRCURLOPT_VERBOSE,如果你已经将它们用于其他用途..(我可能稍后会解决这个问题)

example of how to use it:

如何使用它的示例:

<?php
header("content-type: text/plain;charset=utf8");
$ch=curl_init();
$headers=array();
$cookies=array();
$debuginfo="";
$body="";
curl_setopt($ch,CURLOPT_SSL_VERIFYPEER,false);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,true);
$body=hhb_curl_exec2($ch,'https://www.youtube.com/',$headers,$cookies,$debuginfo);
var_dump('$cookies:',$cookies,'$headers:',$headers,'$debuginfo:',$debuginfo,'$body:',$body);

and the function itself..

和功能本身..

function hhb_curl_exec2($ch, $url, &$returnHeaders = array(), &$returnCookies = array(), &$verboseDebugInfo = "")
{
    $returnHeaders    = array();
    $returnCookies    = array();
    $verboseDebugInfo = "";
    if (!is_resource($ch) || get_resource_type($ch) !== 'curl') {
        throw new InvalidArgumentException('$ch must be a curl handle!');
    }
    if (!is_string($url)) {
        throw new InvalidArgumentException('$url must be a string!');
    }
    $verbosefileh = tmpfile();
    $verbosefile  = stream_get_meta_data($verbosefileh);
    $verbosefile  = $verbosefile['uri'];
    curl_setopt($ch, CURLOPT_VERBOSE, 1);
    curl_setopt($ch, CURLOPT_STDERR, $verbosefileh);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    $html             = hhb_curl_exec($ch, $url);
    $verboseDebugInfo = file_get_contents($verbosefile);
    curl_setopt($ch, CURLOPT_STDERR, NULL);
    fclose($verbosefileh);
    unset($verbosefile, $verbosefileh);
    $headers       = array();
    $crlf          = "\x0d\x0a";
    $thepos        = strpos($html, $crlf . $crlf, 0);
    $headersString = substr($html, 0, $thepos);
    $headerArr     = explode($crlf, $headersString);
    $returnHeaders = $headerArr;
    unset($headersString, $headerArr);
    $htmlBody = substr($html, $thepos + 4); //should work on utf8/ascii headers... utf32? not so sure..
    unset($html);
    //I REALLY HOPE THERE EXIST A BETTER WAY TO GET COOKIES.. good grief this looks ugly..
    //at least it's tested and seems to work perfectly...
    $grabCookieName = function($str)
    {
        $ret = "";
        $i   = 0;
        for ($i = 0; $i < strlen($str); ++$i) {
            if ($str[$i] === ' ') {
                continue;
            }
            if ($str[$i] === '=') {
                break;
            }
            $ret .= $str[$i];
        }
        return urldecode($ret);
    };
    foreach ($returnHeaders as $header) {
        //Set-Cookie: crlfcoookielol=crlf+is%0D%0A+and+newline+is+%0D%0A+and+semicolon+is%3B+and+not+sure+what+else
        /*Set-Cookie:ci_spill=a%3A4%3A%7Bs%3A10%3A%22session_id%22%3Bs%3A32%3A%22305d3d67b8016ca9661c3b032d4319df%22%3Bs%3A10%3A%22ip_address%22%3Bs%3A14%3A%2285.164.158.128%22%3Bs%3A10%3A%22user_agent%22%3Bs%3A109%3A%22Mozilla%2F5.0+%28Windows+NT+6.1%3B+WOW64%29+AppleWebKit%2F537.36+%28KHTML%2C+like+Gecko%29+Chrome%2F43.0.2357.132+Safari%2F537.36%22%3Bs%3A13%3A%22last_activity%22%3Bi%3A1436874639%3B%7Dcab1dd09f4eca466660e8a767856d013; expires=Tue, 14-Jul-2015 13:50:39 GMT; path=/
        Set-Cookie: sessionToken=abc123; Expires=Wed, 09 Jun 2021 10:18:14 GMT;
        //Cookie names cannot contain any of the following '=,; \t\r\n34'
        //
        */
        if (stripos($header, "Set-Cookie:") !== 0) {
            continue;
            /**/
        }
        $header = trim(substr($header, strlen("Set-Cookie:")));
        while (strlen($header) > 0) {
            $cookiename                 = $grabCookieName($header);
            $returnCookies[$cookiename] = '';
            $header                     = substr($header, strlen($cookiename) + 1); //also remove the = 
            if (strlen($header) < 1) {
                break;
            }
            ;
            $thepos = strpos($header, ';');
            if ($thepos === false) { //last cookie in this Set-Cookie.
                $returnCookies[$cookiename] = urldecode($header);
                break;
            }
            $returnCookies[$cookiename] = urldecode(substr($header, 0, $thepos));
            $header                     = trim(substr($header, $thepos + 1)); //also remove the ;
        }
    }
    unset($header, $cookiename, $thepos);
    return $htmlBody;
}

function hhb_curl_exec($ch, $url)
{
    static $hhb_curl_domainCache = "";
    //$hhb_curl_domainCache=&$this->hhb_curl_domainCache;
    //$ch=&$this->curlh;
    if (!is_resource($ch) || get_resource_type($ch) !== 'curl') {
        throw new InvalidArgumentException('$ch must be a curl handle!');
    }
    if (!is_string($url)) {
        throw new InvalidArgumentException('$url must be a string!');
    }

    $tmpvar = "";
    if (parse_url($url, PHP_URL_HOST) === null) {
        if (substr($url, 0, 1) !== '/') {
            $url = $hhb_curl_domainCache . '/' . $url;
        } else {
            $url = $hhb_curl_domainCache . $url;
        }
    }
    ;

    curl_setopt($ch, CURLOPT_URL, $url);
    $html = curl_exec($ch);
    if (curl_errno($ch)) {
        throw new Exception('Curl error (curl_errno=' . curl_errno($ch) . ') on url ' . var_export($url, true) . ': ' . curl_error($ch));
        // echo 'Curl error: ' . curl_error($ch);
    }
    if ($html === '' && 203 != ($tmpvar = curl_getinfo($ch, CURLINFO_HTTP_CODE)) /*203 is "success, but no output"..*/ ) {
        throw new Exception('Curl returned nothing for ' . var_export($url, true) . ' but HTTP_RESPONSE_CODE was ' . var_export($tmpvar, true));
    }
    ;
    //remember that curl (usually) auto-follows the "Location: " http redirects..
    $hhb_curl_domainCache = parse_url(curl_getinfo($ch, CURLINFO_EFFECTIVE_URL), PHP_URL_HOST);
    return $html;
}

回答by Rich Wandell

The accepted answer seems like it will search through the entire response message. This could give you false matches for cookie headers if the word "Set-Cookie" is at the beginning of a line. While it should be fine in most cases. The safer way might be to read through the message from the beginning until the first empty line which indicates the end of the message headers. This is just an alternate solution that should look for the first blank line and then use preg_grep on those lines only to find "Set-Cookie".

接受的答案似乎会搜索整个响应消息。如果单词“Set-Cookie”在一行的开头,这可能会给您提供 cookie 标头的错误匹配。虽然在大多数情况下应该没问题。更安全的方法可能是从头开始通读消息,直到第一个空行(指示消息头的结尾)。这只是一个替代解决方案,它应该查找第一个空行,然后在这些行上使用 preg_grep 来查找“Set-Cookie”。

    curl_setopt($ch, CURLOPT_HEADER, 1);
    //Return everything
    $res = curl_exec($ch);
    //Split into lines
    $lines = explode("\n", $res);
    $headers = array();
    $body = "";
    foreach($lines as $num => $line){
        $l = str_replace("\r", "", $line);
        //Empty line indicates the start of the message body and end of headers
        if(trim($l) == ""){
            $headers = array_slice($lines, 0, $num);
            $body = $lines[$num + 1];
            //Pull only cookies out of the headers
            $cookies = preg_grep('/^Set-Cookie:/', $headers);
            break;
        }
    }

回答by kyle

My understanding is that cookies from curlmust be written out to a file (curl -c cookie_file). If you're running curlthrough PHP's execor systemfunctions (or anything in that family), you should be able to save the cookies to a file, then open the file and read them in.

我的理解是curl必须将来自的 cookie写入文件 ( curl -c cookie_file)。如果您正在运行curlPHPexecsystem函数(或该系列中的任何东西),您应该能够将 cookie 保存到一个文件中,然后打开该文件并读取它们。