Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/9392818/

Time: 2020-08-26 06:42:42  Source: igfitidea

I want to cURL Google search results in PHP

php curl

Asked by beginner

I tried the following code:

$url = 'http://www.google.co.uk/#q='.$query.'&hl=en&prmd=imvns&source=lnt&tbs=ctr:countryUK%7CcountryGB&cr=countryUK%7CcountryGB&sa=X&psj=1&ei=m65DT_yUAcnG0QX46_yPDw&ved=0CEEQpwUoAQ&bav=on.2,or.r_gc.r_pw.r_cp.,cf.osb&fp=2e9b4f7fb1e75d0d&biw=1440&bih=799';

$ch = curl_init();

curl_setopt($ch, CURLOPT_PROXY, '192.168.0.1:1501');
curl_setopt($ch, CURLOPT_REFERER, 'www.google.com');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);

$contents = curl_exec($ch);

curl_close($ch);

echo $contents;

But it shows the Google homepage instead of the Google search results page. Please help me resolve this problem.

Answered by Sam Battat

I was able to get around Google's attempt to prevent cURL searches with the following:

$useragent = "Opera/9.80 (J2ME/MIDP; Opera Mini/4.2.14912/870; U; id) Presto/2.4.15";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.google.com/search?hl=en&tbo=d&site=&source=hp&q=" . $query);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent); // set user agent
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
echo $output = curl_exec($ch);
curl_close($ch);

Note that the user agent I used is that of an old Opera Mini browser. This way Google returns HTML content that you can parse.

THIS IS AGAINST GOOGLE TOS, please do not abuse ;)

[EDIT] URL-encode the query before appending it to the URL: $query = urlencode($query)
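
A minimal sketch of that edit, assuming the base URL from the answer above. `buildSearchUrl` is a hypothetical helper name used only for illustration; it is not part of the original answer.

```php
<?php
// Hypothetical helper: append a URL-encoded query to the search URL.
function buildSearchUrl($query)
{
    $base = 'http://www.google.com/search?hl=en&tbo=d&site=&source=hp&q=';
    // urlencode() turns spaces into '+' and escapes characters such as '&',
    // so the search terms cannot break the query string.
    return $base . urlencode($query);
}

$url = buildSearchUrl('php curl & google');
echo $url, "\n";
```

Without the encoding, the `&` inside the search terms would be read as the start of a new parameter and the query would be cut short.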

Answered by Ben D

In this particular instance this won't work, because Google has specifically designed this URL not to be cURL-able. You'll notice (as Quentin has noted) that the URL uses an anchor string rather than standard query string syntax (the variables should come after a ? but in this case they come after a #). Google has a piece of JavaScript that grabs the anchor string and then uses Ajax to load content into the results frame. file_get_contents and cURL are therefore powerless to get the results from this URL.
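
The difference can be seen with PHP's `parse_url`: everything after `#` ends up in the `fragment` component, which an HTTP client keeps to itself and does not send to the server. A small sketch, using made-up example URLs in the style of the question:

```php
<?php
// Parameters follow '#', so they form a fragment (anchor string).
$anchorUrl = 'http://www.google.co.uk/#q=php+curl&hl=en';
// Parameters follow '?', so they form a standard query string.
$queryUrl = 'http://www.google.co.uk/search?q=php+curl&hl=en';

$anchorParts = parse_url($anchorUrl);
$queryParts = parse_url($queryUrl);

// The anchor URL has no 'query' component; the server is only asked for the path.
var_dump(isset($anchorParts['query']));
echo 'path: ', $anchorParts['path'], "\n";
echo 'fragment: ', $anchorParts['fragment'], "\n";

// The query-string URL carries its parameters in the request itself.
echo 'path: ', $queryParts['path'], "\n";
echo 'query: ', $queryParts['query'], "\n";
```

For the anchor URL the server only ever sees a request for `/`, which is consistent with the homepage being returned.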

There are other places where you can pass in proper query strings:

http://www.google.ca/search?q=query+filetype%3Apdf+site%3Ayour_domain.com&hl=en&num=10&lr=lang_en&ft=i&cr=&safe=images

A URL like this can be fetched, but doing so almost certainly violates Google's TOS, so tread with caution. Also, there is a paid Google service that allows you to do this easily and without any pesky threat of a lawsuit.

Answered by mishu

The other answers are right to warn you to check the TOS, and right that the anchor you are using in the URL doesn't look correct. But even if that anchor did not exist, you should still get the main page. So here are the things that I think might be causing the problem:

Are you sure that the proxy you want to use works fine? Run a test without this line:

curl_setopt($ch, CURLOPT_PROXY, '192.168.0.1:1501');

Also, they might run some checks that involve the user agent, and you are not providing one, so consider adding a line like:

curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1');

Answered by user3123357

See the PHP access example at the following link:

https://developers.google.com/web-search/docs/

$url = "https://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=Paris%20Hilton&userip=USERS-IP-ADDRESS";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_REFERER, 'http://www.example.com/'); // placeholder: enter the URL of your site here
$body = curl_exec($ch);
curl_close($ch);

// now, process the JSON string
$json = json_decode($body);
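
The "process the JSON string" step might look like the sketch below. The sample payload and its field names (`responseData`, `results`, `url`) are assumptions made for illustration, and `extractResultUrls` is a hypothetical helper, not part of any Google library; check the actual response against the documentation linked above.

```php
<?php
// Hypothetical helper: collect result URLs from a JSON response body.
// The field names used here are assumed for illustration.
function extractResultUrls($body)
{
    $json = json_decode($body, true); // true => decode objects as associative arrays
    if (!is_array($json)
        || !isset($json['responseData']['results'])
        || !is_array($json['responseData']['results'])) {
        return array(); // invalid JSON or an unexpected structure
    }
    $urls = array();
    foreach ($json['responseData']['results'] as $result) {
        if (isset($result['url'])) {
            $urls[] = $result['url'];
        }
    }
    return $urls;
}

// Made-up payload standing in for the string returned by curl_exec().
$body = '{"responseData":{"results":[{"url":"http://example.com/a"},{"url":"http://example.com/b"}]}}';
$urls = extractResultUrls($body);
print_r($urls);
```

Checking the decoded value before iterating matters because `json_decode` returns `null` when the body is not valid JSON.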
