PHP: Using CURL with Google

Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/2968866/

Time: 2020-08-25 08:16:58  Source: igfitidea

Using CURL with Google

php curl

Asked by TheBounder

I want to CURL to Google to see how many results it returns for a certain search.

I've tried this:

  $url = "http://www.google.com/search?q=".$strSearch."&hl=en&start=0&sa=N";
  $ch = curl_init();
  curl_setopt($ch, CURLOPT_HEADER, 0);
  curl_setopt($ch, CURLOPT_VERBOSE, 0);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
  curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible;)");
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_POST, true);
  $response = curl_exec($ch);
  curl_close($ch);

But it just returns a 405 Method Not Allowed error from Google.

Any ideas?

Thanks

Answered by Matti Virkkunen

Use a GET request instead of a POST request. That is, get rid of this line:

curl_setopt($ch, CURLOPT_POST, true);
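
With that line removed, the request from the question might look like the sketch below. The helper names `buildSearchUrl` and `fetchSearchPage` and the `urlencode()` call are illustrative additions, not part of the original question. cURL sends a GET request by default when `CURLOPT_POST` is not set.

```php
<?php
// Hypothetical helper (not from the question): builds the search URL.
function buildSearchUrl(string $strSearch): string
{
    // urlencode() keeps spaces and special characters from breaking the URL.
    return "http://www.google.com/search?q=" . urlencode($strSearch) . "&hl=en&start=0&sa=N";
}

// Hypothetical helper: performs the request as a plain GET.
function fetchSearchPage(string $strSearch)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, buildSearchUrl($strSearch));
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible;)");
    // No CURLOPT_POST here: cURL issues a GET request by default.
    $response = curl_exec($ch);
    curl_close($ch);
    return $response; // string on success, false on failure
}
```

Calling `fetchSearchPage("php curl")` would request the page for that query. Whether Google serves a normal results page to an automated client is a separate matter.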

Or even better, use their well-defined search API instead of screen-scraping.

Answered by Justin Johnson

Scraping Google is a very easy thing to do. However, if you don't require more than the first 30 results, then the search API is preferable (as others have suggested). Otherwise, here's some sample code. I've ripped this out of a couple of classes that I'm using, so it might not be totally functional as is, but you should get the idea.

function queryToUrl($query, $start=null, $perPage=100, $country="US") {
    return "http://www.google.com/search?" . $this->_helpers->url->buildQuery(array(
        // Query
        "q"     => urlencode($query),
        // Country (geolocation presumably)
        "gl"    => $country,
        // Start offset
        "start" => $start,
        // Number of results per page
        "num"   => $perPage
    ), true);
}

// Find the first 100 results for "pizza" in Canada
$ch = curl_init(queryToUrl("pizza", 0, 100, "CA"));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_USERAGENT,      $this->getUserAgent(/*$proxyIp*/));
curl_setopt($ch, CURLOPT_MAXREDIRS,      4);
curl_setopt($ch, CURLOPT_TIMEOUT,        5);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);

$response = curl_exec($ch);

Note: $this->_helpers->url->buildQuery() is identical to http_build_query except that it will drop empty parameters.
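
Since that helper is not shown in the answer, here is a minimal sketch of what such a function could look like. The name `buildQueryDroppingEmpty` and the rule that only null and empty-string values count as "empty" are assumptions, not the author's actual implementation.

```php
<?php
// Hypothetical stand-in for $this->_helpers->url->buildQuery():
// like http_build_query(), but skips parameters that are null or "".
function buildQueryDroppingEmpty(array $params): string
{
    $filtered = array_filter($params, function ($value) {
        // Keep 0 and "0" (e.g. start=0); drop only null and empty strings.
        return $value !== null && $value !== "";
    });
    return http_build_query($filtered);
}

echo buildQueryDroppingEmpty(array(
    "q"     => "pizza",
    "gl"    => "CA",
    "start" => "",
    "num"   => 100,
));
// prints: q=pizza&gl=CA&num=100
```

Because http_build_query() URL-encodes values itself, this sketch is given the raw query string rather than one already passed through urlencode().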

Answered by Gelatin

Use the Google Ajax API.

http://code.google.com/apis/ajaxsearch/

See this thread for how to get the number of results. While it refers to C# libraries, it might give you some pointers.

Answered by Rinku

Before scraping data, please read https://support.google.com/websearch/answer/86640?rd=1

Scraping is against Google's terms.

Automated traffic includes:

Sending searches from a robot, computer program, automated service, or search scraper

Using software that sends searches to Google to see how a website or webpage ranks on Google

Answered by Jet

CURLOPT_CUSTOMREQUEST => ($post)? "POST" : "GET"
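
That fragment looks like one entry of an options array. A minimal sketch of how it might fit into `curl_setopt_array()` follows. The helper name `buildRequestOptions`, the `$post` flag as a boolean parameter, and the example URL are assumptions added for illustration.

```php
<?php
// Hypothetical helper: returns cURL options for a GET or POST request.
function buildRequestOptions(string $url, bool $post = false): array
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        // Sets the HTTP method string sent in the request line.
        CURLOPT_CUSTOMREQUEST  => ($post) ? "POST" : "GET",
    );
}

$ch = curl_init();
curl_setopt_array($ch, buildRequestOptions("http://www.google.com/search?q=pizza"));
// $response = curl_exec($ch); // uncomment to actually send the request
```

Note that CURLOPT_CUSTOMREQUEST only changes the method name in the request. For a simple search query, leaving the method at cURL's default GET, as the first answer suggests, should be enough.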
