Strange error using PHP Simple HTML DOM parser

Question

Asked by Tsundoku

I am using this library (PHP Simple HTML DOM parser) to parse a link. Here's the code:

function getSemanticRelevantKeywords($keyword){
    $results = array();
    $html = file_get_html("http://www.semager.de/api/keyword.php?q=". urlencode($keyword) ."&lang=de&out=html&count=2&threshold=");
    foreach($html->find('span') as $e){
        $results[] = $e->plaintext;
    }
    return $results;
}

But I get this error when I output the results:

Fatal error: Call to a member function find() on a non-object in /var/www/vhosts/efamous.de/subdomains/sandbox/httpdocs/getNewTrusts.php on line 25

Line 25 is the foreach loop. The odd thing is that it outputs everything (at least seemingly) correctly, but I still get that error and can't figure out why.

Answer 1

Accepted answer by Jim

This error usually means that $html isn't an object.

It's odd that you say this seems to work. What happens if you output $html? I'd imagine that the URL isn't available and that $html is null.

Edit: Looks like this may be an error in the parser. Someone has submitted a bug report and added a check in his code as a workaround.
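
A guard of this kind can be sketched as follows. This is a minimal sketch, not the workaround from the bug report: collectPlaintext and its $loader parameter are hypothetical names, and the loader is assumed to behave like file_get_html, returning a DOM object on success and false or null on failure.

```php
// Hypothetical helper: $loader is assumed to behave like file_get_html(),
// i.e. it returns a DOM object on success and false/null on failure.
function collectPlaintext(callable $loader, $url, $selector)
{
    $html = $loader($url);

    // Guard: never call find() unless an object actually came back.
    if (!is_object($html)) {
        return array();
    }

    $results = array();
    foreach ($html->find($selector) as $element) {
        $results[] = $element->plaintext;
    }
    return $results;
}
```

With the library loaded, this could be called as collectPlaintext('file_get_html', $url, 'span'). An empty array then means either that loading failed or that nothing matched the selector.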

Answer 2

Answered by Sagar Shetty

The reason for this error is that Simple HTML DOM does not return an object if the response from the URL is larger than 600000 bytes.
You can avoid it by changing the simple_html_dom.php file: remove strlen($contents) > MAX_FILE_SIZE from the if condition of the file_get_html function.
This will solve your issue.
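
To confirm that the size limit is the cause before editing the library, the response length can be compared against the limit. A minimal sketch: exceedsParserLimit is a hypothetical helper, and its default of 600000 is taken from the figure quoted above rather than read from the library.

```php
// Hypothetical helper: reports whether a response body is larger than the
// parser's limit. The default of 600000 is the figure quoted in this answer.
function exceedsParserLimit($contents, $limit = 600000)
{
    // strlen() counts bytes, the same measure used in the condition above.
    return strlen($contents) > $limit;
}

// Usage (performs a network request, so it is left commented out):
// $contents = file_get_contents($url);
// if ($contents !== false && exceedsParserLimit($contents)) {
//     echo 'Response is too large for the parser';
// }
```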

Answer 3

Answered by LAMPHONGPAUL

You just need to increase the MAX_FILE_SIZE constant in the file simple_html_dom.php.

For example:

define('MAX_FILE_SIZE', 999999999999999);

Answer 4

Answered by trante

Before calling the file_get_html/load_file method, you should first check whether the URL exists.

If the URL exists, you have passed the first step.
(Some servers serve the 404 page as a valid HTML page with a proper structure such as head and body, but containing only text like "This page could not be found. 404 error...")

If the URL returns 200 OK, you should then check whether the fetched result is an object and whether its nodes are set.

This is the code I use in my pages:

// Assumes simple_html_dom.php has already been included.
function url_exists($url) {
    // Prepend a scheme when the URL does not start with one.
    if (strpos($url, 'http') !== 0) {
        $url = 'http://' . $url;
    }
    $headers = @get_headers($url);
    // get_headers() returns false on failure; $headers[0] is the status line.
    if (!is_array($headers)) {
        return false;
    }
    return strpos($headers[0], '404 Not Found') === false;
}

$pageAddress = 'http://www.google.com';
// Create the parser object before calling load_file().
$htmlPage = new simple_html_dom();
if (url_exists($pageAddress)) {
    $htmlPage->load_file($pageAddress);
} else {
    echo "URL doesn't exist, stopping";
    return;
}

if ($htmlPage && is_object($htmlPage) && isset($htmlPage->nodes)) {
    // do your work here...
} else {
    echo 'Fetched page is not OK, stopping';
    return;
}

Answer 5

Answered by futtta

For those arriving here via a search engine (as I did): after reading the info (and the linked bug report) above, I started prodding the code and ended up fixing my problem with two extra checks after loading the DOM:

$html = file_get_html('<your url here>');
// first check if $html->find exists
if (method_exists($html, "find")) {
    // then check if the html element exists to avoid trying to parse non-html
    if ($html->find('html')) {
        // and only then start searching (and manipulating) the dom
    }
}

Answer 6

Answered by Eric Strom

I'm seeing the same error in my logs. Apart from the causes mentioned above, it could also be that there is no 'span' in the document. I get the same error when searching for divs with a particular class that doesn't exist on the page, but when searching for something that I know exists on the page, the error doesn't appear.
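
A likely mechanism, stated here as an assumption about the library's behavior: calling find() with an index returns null when nothing matches, so chaining a call or property access onto that result fails with a "non-object" message. A minimal sketch of a guard, where firstMatchText is a hypothetical helper and $node is assumed to be any object with a Simple HTML DOM style find($selector, $index) method:

```php
// Hypothetical helper: $node is assumed to expose find($selector, $index)
// like a Simple HTML DOM node, returning null when nothing matches.
function firstMatchText($node, $selector, $default = null)
{
    if (!is_object($node)) {
        return $default;
    }
    $match = $node->find($selector, 0);

    // Guard: a missing element must not be dereferenced.
    return is_object($match) ? $match->plaintext : $default;
}
```

For example, firstMatchText($html, 'div.some-class', '') would yield an empty string instead of a fatal error when the class is absent from the page.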

Answer 7

Answered by Tudor

Your script is OK. I receive this error when it does not find the element that I'm looking for on the page.

In your case, please check whether the page you are accessing has a 'span' element.

Answer 8

Answered by Cesar Bielich

The simplest solution to this problem:

if ($html = file_get_html("http://www.semager.de/api/keyword.php?q=" . urlencode($keyword) . "&lang=de&out=html&count=2&threshold=")) {
    // $html was loaded, so it is safe to call $html->find() here
} else {
    // do something else because the HTML could not be loaded
}

Answer 9

Answered by toopay

The error means that the find() function is either not defined yet or not available. Make sure you have loaded or included the related function.
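
Whether the library has been loaded can be checked before using it. A minimal sketch: parserIsLoaded is a hypothetical helper, and the include path in the usage comment is an assumption about where the library file lives.

```php
// Hypothetical helper: reports whether the parser's entry function is defined.
// The default name file_get_html is the function used in the question.
function parserIsLoaded($functionName = 'file_get_html')
{
    return function_exists($functionName);
}

// Usage (the path below is an assumed location for the library file):
// if (!parserIsLoaded()) {
//     require_once 'simple_html_dom.php';
// }
```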
