PHP DOMDocument::loadHTML() [domdocument.loadhtml]: htmlParseEntityRef: no name in Entity
Notice: This content comes from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/12328322/
Asked by David
I'm trying to get the "link" elements from certain webpages, but I can't figure out what I'm doing wrong. I'm getting the following error:
Severity: Warning
Message: DOMDocument::loadHTML() [domdocument.loadhtml]: htmlParseEntityRef: no name in Entity, line: 536
Filename: controllers/test.php
Line Number: 34
Line 34 is the following in the code:
$dom->loadHTML($html);
My code:
$url = "http://www.amazon.com/";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
if ($html = curl_exec($ch)) {
    // parse the html into a DOMDocument
    $dom = new DOMDocument();
    $dom->recover = true;
    $dom->strictErrorChecking = false;
    $dom->loadHTML($html);
    $hrefs = $dom->getElementsByTagName('a');
    echo "<pre>";
    print_r($hrefs);
    echo "</pre>";
    curl_close($ch);
} else {
    echo "The website could not be reached.";
}
Answered by Kris
It means some of the HTML code is invalid. This is just a warning, not an error, so your script will still process the document. To suppress the warnings, call this before loading the HTML:
libxml_use_internal_errors(true);
Or you could silence the warning entirely with the @ operator:
@$dom->loadHTML($html);
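
As a minimal sketch of the first option, the snippet below collects the parser messages with libxml_use_internal_errors() instead of hiding them, then reads the href attribute of each anchor. The sample markup in $html is made up for illustration; it is not the page from the question, and which messages libxml reports may vary with the installed libxml2 version.

```php
<?php
// Hypothetical sample markup containing a bare "&" (assumption, not the asker's page).
$html = '<html><body><a href="/a">Tom & Jerry</a><a href="/b">B</a></body></html>';

// Buffer libxml parse errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

// Read whatever the parser complained about, then clear the buffer
// and restore the previous error-handling mode.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

// getElementsByTagName() returns a DOMNodeList; iterate it to reach each element.
$links = [];
foreach ($dom->getElementsByTagName('a') as $anchor) {
    $links[] = $anchor->getAttribute('href');
}

foreach ($errors as $error) {
    echo trim($error->message), ' (line ', $error->line, ")\n";
}
print_r($links);
```

Unlike the @ operator, this keeps the messages available for logging, so genuinely broken input can still be noticed.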
Answered by Ujjwal Singh
This may be caused by a rogue & symbol that is immediately followed by a proper tag. Otherwise you would receive a missing ; error instead. See: Warning: DOMDocument::loadHTML(): htmlParseEntityRef: expecting ';' in Entity.
The solution is to replace the & symbol with &amp;
or, if you must keep that & as it is, maybe you could enclose it in: <![CDATA[ - ]]>
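
One way to apply the replacement suggested above is a regular expression that escapes only those & characters that do not already begin an entity. This is a rough sketch: escapeBareAmpersands() is a hypothetical helper name, not a PHP built-in, and the pattern also rewrites ampersands inside script blocks, so it may not suit every page.

```php
<?php
/**
 * Escape "&" characters that do not start a named, decimal, or hex entity.
 * escapeBareAmpersands() is a hypothetical helper, not part of PHP.
 */
function escapeBareAmpersands(string $html): string
{
    // Negative lookahead: leave "&name;", "&#123;" and "&#x1F;" untouched.
    $pattern = '/&(?![A-Za-z][A-Za-z0-9]*;|#[0-9]+;|#[xX][0-9A-Fa-f]+;)/';
    return preg_replace($pattern, '&amp;', $html);
}

$fixed = escapeBareAmpersands('<p>Tom & Jerry &amp; friends &#169; 2012</p>');
echo $fixed, "\n"; // prints: <p>Tom &amp; Jerry &amp; friends &#169; 2012</p>
```

The cleaned string can then be passed to loadHTML() in place of the raw markup.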
Answered by DeltaLee
The HTML is poorly formed. If it is formed poorly enough, loading the HTML into the DOMDocument might even fail. If loadHTML is not working, then suppressing the errors is pointless. If you are unable to load the HTML into the DOM, I suggest using a tool like HTML Tidy to "clean up" the poorly formed HTML first.
HTML Tidy can be found here: http://www.htacg.org/tidy-html5/
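
PHP also ships an optional tidy extension that wraps the same library. The sketch below assumes that extension may or may not be installed: cleanHtml() is a hypothetical helper that repairs the markup with tidy_repair_string() when available and otherwise returns the input unchanged. The sample markup and the chosen Tidy options are illustrative assumptions.

```php
<?php
/**
 * Repair markup with the tidy extension when it is installed.
 * cleanHtml() is a hypothetical helper; without tidy it returns the input as-is.
 */
function cleanHtml(string $html): string
{
    if (!function_exists('tidy_repair_string')) {
        return $html; // tidy extension not available
    }
    // 'output-xhtml' and 'wrap' are standard Tidy configuration options.
    $repaired = tidy_repair_string($html, ['output-xhtml' => true, 'wrap' => 0], 'utf8');
    return is_string($repaired) ? $repaired : $html;
}

// Hypothetical malformed input: a bare "&" and unclosed paragraphs.
$clean = cleanHtml('<p>Tom & Jerry<p>unclosed');

$previous = libxml_use_internal_errors(true);
$dom = new DOMDocument();
$loaded = $dom->loadHTML($clean);
libxml_clear_errors();
libxml_use_internal_errors($previous);

echo $dom->getElementsByTagName('p')->length, "\n";
```

Running Tidy first means DOMDocument receives well-formed markup, so there is less need to suppress parser warnings afterwards.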

