Disable warnings when loading non-well-formed HTML by DomDocument (PHP)

Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/1148928/

Date: 2020-08-25 01:14:26  Source: igfitidea

Tags: php, html, warnings, domdocument

Asked by Viet

I need to parse some HTML files, but they are not well-formed and PHP prints out warnings for them. I want to avoid this debugging/warning behavior programmatically. Please advise. Thank you!

Code:

// create a DOM document and load the HTML data
$xmlDoc = new DomDocument;
// this dumps out the warnings
$xmlDoc->loadHTML($fetchResult);

This:

@$xmlDoc->loadHTML($fetchResult)

can suppress the warnings, but how can I capture those warnings programmatically?

Accepted answer by troelskn

You can install a temporary error handler with set_error_handler:

class ErrorTrap {
  protected $callback;
  protected $errors = array();
  function __construct($callback) {
    $this->callback = $callback;
  }
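  function call() {
    // route PHP warnings/notices to onError() for the duration of the call
    set_error_handler(array($this, 'onError'));
    try {
      // forward all arguments to the wrapped callback
      return call_user_func_array($this->callback, func_get_args());
    } finally {
      // always restore the previous handler, even if the callback throws
      // (covers Error as well as Exception; finally needs PHP 5.5+)
      restore_error_handler();
    }
  }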
  function onError($errno, $errstr, $errfile, $errline) {
    $this->errors[] = array($errno, $errstr, $errfile, $errline);
  }
  function ok() {
    return count($this->errors) === 0;
  }
  function errors() {
    return $this->errors;
  }
}

Usage:

// create a DOM document and load the HTML data
$xmlDoc = new DomDocument();
$caller = new ErrorTrap(array($xmlDoc, 'loadHTML'));
// this doesn't dump out any warnings
$caller->call($fetchResult);
if (!$caller->ok()) {
  var_dump($caller->errors());
}
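
The same idea can also be written without a class. The following is only a sketch under assumptions: capture_errors() is a hypothetical helper name, the sample HTML is made up, and the syntax needs PHP 7.1 or later. Whether a given input produces any warnings at all depends on the libxml2 version in use.

```php
<?php
// Hypothetical helper: runs $fn with $args and returns [result, captured errors].
function capture_errors(callable $fn, ...$args): array
{
    $errors = [];
    set_error_handler(function ($errno, $errstr, $errfile, $errline) use (&$errors) {
        $errors[] = [
            'errno'   => $errno,
            'message' => $errstr,
            'file'    => $errfile,
            'line'    => $errline,
        ];
        return true; // mark as handled so PHP's built-in handler does not run
    });
    try {
        $result = $fn(...$args);
    } finally {
        restore_error_handler(); // put the previous handler back in every case
    }
    return [$result, $errors];
}

$doc = new DOMDocument();
// Sample malformed input, assumed for illustration.
[$ok, $errors] = capture_errors([$doc, 'loadHTML'], '<p>unclosed <b>bold</p><foo>');

foreach ($errors as $e) {
    echo $e['message'], "\n";
}
```

Each captured entry keeps the error number, message, file and line, so the caller can log them or decide to reject the document.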

Answer by thomasrutter

Call

libxml_use_internal_errors(true);

prior to processing with $xmlDoc->loadHTML().

This tells libxml2 not to send errors and warnings through to PHP. Then, to check for errors and handle them yourself, you can consult libxml_get_last_error() and/or libxml_get_errors() when you're ready.
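
A minimal sketch of that flow, assuming a made-up input string. The fields read from each entry (level, line, message) are properties of PHP's LibXMLError class; which errors libxml2 actually reports for this input may vary by version.

```php
<?php
// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$xmlDoc = new DOMDocument();
// Sample malformed input, assumed for illustration.
$xmlDoc->loadHTML('<div><p>unclosed paragraph<span></div></i>');

$last = libxml_get_last_error(); // LibXMLError object, or false if nothing was recorded
$all  = libxml_get_errors();     // array of LibXMLError objects (possibly empty)

foreach ($all as $error) {
    // level is one of LIBXML_ERR_WARNING, LIBXML_ERR_ERROR, LIBXML_ERR_FATAL
    printf("level %d, line %d: %s\n", $error->level, $error->line, trim($error->message));
}

// Empty the buffer once the errors have been handled.
libxml_clear_errors();
```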

Answer by Ja?ck

To hide the warnings, you have to give special instructions to libxml, which is used internally to perform the parsing:

libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

libxml_use_internal_errors(true) indicates that you're going to handle the errors and warnings yourself and don't want them to mess up the output of your script.

This is not the same as the @ operator. The warnings get collected behind the scenes, and afterwards you can retrieve them with libxml_get_errors() in case you wish to perform logging or return the list of issues to the caller.

Whether or not you use the collected warnings, you should always clear the queue by calling libxml_clear_errors().
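
To illustrate why clearing matters, here is a small sketch with a made-up input: the queue keeps growing across parses until libxml_clear_errors() is called. Whether this particular input yields any entries depends on the libxml2 version, so the counts are compared rather than stated.

```php
<?php
libxml_use_internal_errors(true);
libxml_clear_errors(); // start from an empty queue

$html = '<p>unclosed <b>bold</p></i>'; // sample malformed input, assumed for illustration

$dom = new DOMDocument();
$dom->loadHTML($html);
$afterFirst = count(libxml_get_errors());

// Parse again WITHOUT clearing: entries from the first parse are still queued.
$dom->loadHTML($html);
$afterSecond = count(libxml_get_errors());

libxml_clear_errors();
$afterClear = count(libxml_get_errors()); // the queue is empty again

printf("first: %d, second: %d, after clear: %d\n", $afterFirst, $afterSecond, $afterClear);
```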

Preserving the state

If you have other code that uses libxml, it may be worthwhile to make sure your code doesn't alter the global state of the error handling; for this, you can use the return value of libxml_use_internal_errors() to save the previous state.

// modify state
$libxml_previous_state = libxml_use_internal_errors(true);
// parse
$dom->loadHTML($html);
// handle errors
libxml_clear_errors();
// restore
libxml_use_internal_errors($libxml_previous_state);
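
The save/restore steps can be wrapped in a function so the previous state comes back even if parsing throws. This is a sketch, not part of the original answer: load_html_quietly() is a hypothetical name, the input is made up, and try/finally is one possible way to guarantee the restore.

```php
<?php
// Hypothetical helper: parse HTML without leaking warnings or changing
// the global libxml error-handling state for other code.
function load_html_quietly(string $html): DOMDocument
{
    // Enable internal error handling and remember what it was before.
    $previous = libxml_use_internal_errors(true);
    try {
        $dom = new DOMDocument();
        $dom->loadHTML($html);
        return $dom;
    } finally {
        libxml_clear_errors();                 // drop whatever was collected
        libxml_use_internal_errors($previous); // restore the caller's setting
    }
}

libxml_use_internal_errors(false);         // pretend the surrounding code uses the default
$dom = load_html_quietly('<p>one<p>two');  // sample input, assumed for illustration
$restored = libxml_use_internal_errors();  // no argument: read the setting without changing it
var_dump($restored);
```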

Answer by Joshua Ott

Setting the options "LIBXML_NOWARNING" & "LIBXML_NOERROR" works perfectly fine too:

$dom->loadHTML($html, LIBXML_NOWARNING | LIBXML_NOERROR);
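
For completeness, a self-contained sketch of the same call with a made-up input. The second argument to loadHTML() is the options bitmask, available since PHP 5.4.0. This approach only silences reporting; if you need to inspect what went wrong, the libxml_use_internal_errors() approach is the one that gives you the error list.

```php
<?php
// Sample malformed input, assumed for illustration.
$html = '<html><body><p>Price: 5 & up<br></p></foo></body></html>';

$dom = new DOMDocument();
// Ask libxml not to report warnings or errors for this parse.
$loaded = $dom->loadHTML($html, LIBXML_NOWARNING | LIBXML_NOERROR);

// The document is still usable after parsing.
echo $dom->getElementsByTagName('p')->length, "\n";
```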