PHP Simple HTML DOM: file_get_html not working - is there any workaround?
Note: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors on Stack Overflow.
Original URL: http://stackoverflow.com/questions/18667441/
Asked by Altin
<?php
// Report all PHP errors (see changelog)
error_reporting(E_ALL);
include('inc/simple_html_dom.php');
//base url
$base = 'https://play.google.com/store/apps';
//home page HTML
$html_base = file_get_html( $base );
//get all category links
foreach($html_base->find('a') as $element) {
echo "<pre>";
print_r( $element->href );
echo "</pre>";
}
$html_base->clear();
unset($html_base);
?>
I have the above code and I'm trying to get certain elements of the Play Store page but it isn't returning anything. Is it possible that certain PHP functions might be disabled on the server to stop that?
The above code works perfectly on other sites.
Is there any workaround?
Answered by Enissay
As I said, your example is working fine for me... But try this way using curl instead:
//base url
$base = 'https://play.google.com/store/apps';
$curl = curl_init();
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE); // disables TLS certificate verification; suitable for a quick test only
curl_setopt($curl, CURLOPT_HEADER, false);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_URL, $base);
curl_setopt($curl, CURLOPT_REFERER, $base);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
$str = curl_exec($curl);
curl_close($curl);
// Create a DOM object
$html_base = new simple_html_dom();
// Load HTML from a string
$html_base->load($str);
//get all category links
foreach($html_base->find('a') as $element) {
echo "<pre>";
print_r( $element->href );
echo "</pre>";
}
$html_base->clear();
unset($html_base);
It gets all the links as expected.
Also make sure the php_openssl and php_curl extensions are installed and enabled.
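If the cURL route still returns nothing, it can help to see why the request failed rather than passing an empty string to the parser. The following is a minimal sketch, not part of the original answer: `fetch_html()` is a hypothetical helper name, the 15-second timeout is an arbitrary assumption, and TLS verification is left at its default (enabled) instead of being switched off.

```php
<?php
// fetch_html() is a hypothetical helper, not part of simple_html_dom.
// It returns the response body or throws with the reason the request failed.
function fetch_html(string $url): string
{
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($curl, CURLOPT_TIMEOUT, 15);          // assumed timeout in seconds

    $body = curl_exec($curl);
    if ($body === false) {
        // curl_error() describes transport problems (DNS, TLS, refused connection, ...)
        throw new RuntimeException('cURL request failed: ' . curl_error($curl));
    }

    // The status code is only meaningful after curl_exec() has run.
    $status = curl_getinfo($curl, CURLINFO_HTTP_CODE);
    if ($status >= 400) {
        throw new RuntimeException('Unexpected HTTP status: ' . $status);
    }

    return $body; // the handle is released when $curl goes out of scope
}
```

The returned string could then be passed to `$html_base->load($str)` as in the answer above, with the call wrapped in a try/catch so the error message is visible.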
Answered by Chitsai Yeh
Remove the leading semicolon from the extension line below in php.ini, then restart the Apache server so the module is loaded:
; Windows Extensions
...
;extension=php_openssl.dll
...
Answered by shahil
You must set "allow_url_fopen" to On in "php.ini" to allow accessing files via HTTP or FTP.
Some hosting vendors disable PHP's "allow_url_fopen" flag for security reasons.
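A quick way to see whether these settings apply on a given server is to ask PHP directly. This is a small sketch under the assumption that the three prerequisites mentioned in this thread (allow_url_fopen, openssl, curl) are the ones of interest; `check_fetch_prerequisites()` is a hypothetical helper name, not part of simple_html_dom.

```php
<?php
// check_fetch_prerequisites() is a hypothetical helper name.
// It reports whether the settings discussed in this thread are enabled.
function check_fetch_prerequisites(): array
{
    return [
        // ini_get() returns a string such as "1" or ""; normalize it to a boolean
        'allow_url_fopen' => filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN),
        'openssl'         => extension_loaded('openssl'),
        'curl'            => extension_loaded('curl'),
    ];
}

foreach (check_fetch_prerequisites() as $name => $enabled) {
    echo $name, ': ', $enabled ? 'enabled' : 'disabled', PHP_EOL;
}
```

Run it on the same server (and under the same SAPI, for example Apache rather than the command line) as the failing script, since each can load a different php.ini.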
Answered by mr.buzz
$post = curl_init();
curl_setopt($post, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($post, CURLOPT_AUTOREFERER, TRUE);
curl_setopt($post, CURLOPT_HEADER, 0);
curl_setopt($post,CURLOPT_RETURNTRANSFER, true);
curl_setopt($post,CURLOPT_URL,$website);
curl_setopt($post,CURLOPT_POST,1);
curl_setopt($post,CURLOPT_POSTFIELDS,"regno=$Number");
curl_setopt($post, CURLOPT_FOLLOWLOCATION, True);
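$curlresponse = curl_exec($post);
// Read the status code only after the request has run; before curl_exec() it is always 0
$httpcode = curl_getinfo($post, CURLINFO_HTTP_CODE);
curl_close($post);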
$dom = new DOMDocument();
$dom->loadHTML($curlresponse);
DOMDocument::loadHTML() [domdocument.loadhtml]: htmlParseStartTag: misplaced
This is the URL: http://www.annauniv.edu/cgi-bin/result/cgrade.pl?regno=11210104001
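Warnings like the one above usually mean the page's markup is malformed, not that the fetch failed. One common way to handle this, sketched here as an assumption about what the poster needs, is to let libxml collect the parse errors internally instead of emitting PHP warnings. `parse_html_leniently()` is a hypothetical helper name and the sample markup is made up for illustration.

```php
<?php
// parse_html_leniently() is a hypothetical helper name.
// It parses possibly malformed HTML without emitting PHP warnings.
function parse_html_leniently(string $html): DOMDocument
{
    // Collect libxml errors internally; remember the previous setting to restore it later
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Discard the collected errors (inspect libxml_get_errors() first if they are of interest)
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $dom;
}

// Made-up sample with a stray second <html> tag, similar in spirit to the reported warning
$sample = '<html><body><a href="/one">1</a><html><a href="/two">2</a></body></html>';
$dom = parse_html_leniently($sample);

foreach ($dom->getElementsByTagName('a') as $link) {
    echo $link->getAttribute('href'), PHP_EOL;
}
```

In the code above, `$curlresponse` would take the place of `$sample`.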