PHP warning: DOMDocument::loadHTML(): htmlParseEntityRef: expecting ';' in Entity

Question

Asked by gweg

$html = file_get_contents("http://www.somesite.com/");

$dom = new DOMDocument();
$dom->loadHTML($html);

echo $dom;

throws


Warning: DOMDocument::loadHTML(): htmlParseEntityRef: expecting ';' in Entity,
Catchable fatal error: Object of class DOMDocument could not be converted to string in test.php on line 10

Answer 1

Answered by Dewsworld

To suppress the warning, you can use libxml_use_internal_errors(true), which makes libxml buffer its errors internally instead of emitting PHP warnings:

// create new DOMDocument
$document = new \DOMDocument('1.0', 'UTF-8');

// set error level
$internalErrors = libxml_use_internal_errors(true);

// load HTML
$document->loadHTML($html);

// Restore error level
libxml_use_internal_errors($internalErrors);

Answer 2

Answered by mattalxndr

I would bet that if you looked at the source of http://www.somesite.com/ you would find special characters that haven't been converted to HTML entities. Maybe something like this:

<a href="/script.php?foo=bar&hello=world">link</a>

Should be


<a href="/script.php?foo=bar&amp;hello=world">link</a>
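As a rough illustration of why the escaped form is the correct one, the sketch below parses the escaped link and reads the attribute back. The helper name hrefOf() is hypothetical and not part of the original answer; it only uses DOMDocument and the libxml error functions already mentioned in this thread.

```php
<?php
// Hypothetical helper (not from the original answer): parse a snippet and
// return the href of its first <a> element.
function hrefOf(string $snippet): string
{
    // Buffer libxml errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($snippet);

    // Discard anything that was buffered and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $dom->getElementsByTagName('a')->item(0)->getAttribute('href');
}

// The parser decodes &amp; back to a single '&' in the attribute value.
echo hrefOf('<a href="/script.php?foo=bar&amp;hello=world">link</a>'), "\n";
```

Escaping the ampersand in the markup therefore does not change the URL the link points to; it only makes the source unambiguous for the parser.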

Answer 3

Answered by Maanas Royy

$dom->@loadHTML($html);

This is incorrect, because the @ operator must precede the whole expression. Use this instead:

@$dom->loadHTML($html);

Answer 4

Answered by user279583

There are two errors. The second occurs because $dom is not a string but an object and thus cannot be "echoed". The first is a warning from loadHTML, caused by invalid syntax in the HTML document being loaded (probably an & (ampersand) used as a parameter separator and not escaped as the entity &amp;).

You can ignore and suppress this error message (not the error, just the message!) by calling the function with the error control operator "@" (http://www.php.net/manual/en/language.operators.errorcontrol.php):

@$dom->loadHTML($html);

Answer 5

Answered by Mike B

The reason for your fatal error is that DOMDocument does not have a __toString() method and thus cannot be echoed.

You're probably looking for


echo $dom->saveHTML();
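Put together with the warning suppression from the other answers, a minimal sketch of the corrected script might look like the following. The inline $html string is a stand-in (an assumption for illustration) for the page fetched with file_get_contents() in the question.

```php
<?php
// Stand-in markup; the question fetches this with file_get_contents().
$html = '<p>Hello, <b>world</b></p>';

// Buffer libxml errors so malformed markup does not raise PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

libxml_clear_errors();
libxml_use_internal_errors($previous);

// DOMDocument has no __toString(), so serialize it explicitly.
$output = $dom->saveHTML();
echo $output;
```

saveHTML() returns the whole document as a string, including the doctype and the html and body elements that the parser adds around a fragment.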

Answer 6

Answered by Lorenz Lo Sauer

Regardless of the echo (which would need to be replaced with print_r or var_dump), if an exception is thrown the object should stay empty:


DOMNodeList Object
(
)

Solution

Set recover to true and strictErrorChecking to false:

$content = file_get_contents($url);

$doc = new DOMDocument();
$doc->recover = true;
$doc->strictErrorChecking = false;
$doc->loadHTML($content);

Alternatively, apply PHP's entity encoding to the markup's text content; unencoded characters there are the most common source of this error.


Answer 7

Answered by David Chan

Replace the simple

$dom->loadHTML($html);

with the more robust ...


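// Collect libxml errors internally instead of emitting PHP warnings
libxml_use_internal_errors(true);

if (!$dom->loadHTML($html))
    {
        // Loading failed: gather the buffered libxml messages
        $errors="";
        foreach (libxml_get_errors() as $error)  {
            $errors.=$error->message."<br/>";
        }
        // Empty the error buffer once the messages have been read
        libxml_clear_errors();
        print "libxml errors:<br>$errors";
        return;
    }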

Answer 8

Answered by nmwi22

$html = file_get_contents("http://www.somesite.com/");

$dom = new DOMDocument();
$dom->loadHTML(htmlspecialchars($html));

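echo $dom->saveHTML();

Try this. Note that htmlspecialchars() also escapes the tags themselves, so the page is loaded as plain text rather than as markup.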

Answer 9

Answered by Nicolas Bouvrette

I know this is an old question, but if you ever want to fix the malformed '&' signs in your HTML, you can use code similar to this:

$page = file_get_contents('http://www.example.com');
$page = preg_replace('/\s+/', ' ', trim($page)); // Collapse runs of whitespace.
fixAmps($page, 0);
$dom = new DOMDocument();
$dom->loadHTML($page);


function fixAmps(&$html, $offset) {
    $positionAmp = strpos($html, '&', $offset);
    if ($positionAmp === false) { // No more '&' to process.
        return;
    }

    $positionSemiColumn = strpos($html, ';', $positionAmp+1);
    if ($positionSemiColumn === false) { // If no ';' can be found.
        $html = substr_replace($html, '&amp;', $positionAmp, 1); // Replace straight away.
        fixAmps($html, $positionAmp+5); // Keep going: later '&' signs need escaping too.
        return;
    }

    // Text from the '&' up to and including the next ';'.
    $string = substr($html, $positionAmp, $positionSemiColumn-$positionAmp+1);

    if (preg_match('/^&(#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z0-9]+);$/', $string) === 0) { // If a standard escape cannot be found.
        $html = substr_replace($html, '&amp;', $positionAmp, 1); // This means we need to escape the '&' sign.
        fixAmps($html, $positionAmp+5); // Recursive call from the new position.
    } else {
        fixAmps($html, $positionAmp+1); // Recursive call from the new position.
    }
}
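The same idea can also be sketched without recursion, using a single regular expression with a negative lookahead. This is an alternative technique, not the author's code, and the function name escapeBareAmpersands() is hypothetical. It assumes that anything shaped like a named, decimal, or hexadecimal character reference should be left alone, without checking that a named entity actually exists.

```php
<?php
// Hypothetical helper: escape every '&' that does not start something
// shaped like a named, decimal, or hexadecimal character reference.
function escapeBareAmpersands(string $html): string
{
    $fixed = preg_replace(
        '/&(?!(?:#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);)/',
        '&amp;',
        $html
    );

    // preg_replace() returns null on a regex error; fall back to the input.
    return $fixed ?? $html;
}

$fixed = escapeBareAmpersands('<a href="/script.php?foo=bar&hello=world">T&C &amp; &#38;</a>');
echo $fixed, "\n";
```

Because it is one pass over the string, this variant does not risk deep recursion on large pages. Like the recursive version, it treats the whole input as markup, so ampersands inside script blocks would be escaped as well.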

Answer 10

Answered by lastYorsh

Another possible solution is:

$sContent = htmlspecialchars($sHTML);
$oDom = new DOMDocument();
$oDom->loadHTML($sContent);
echo html_entity_decode($oDom->saveHTML());
