Which PHP function validates whether a string is valid HTML?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/3167074/



php html validation

Asked by Yosef

Which function in PHP validates whether a string is HTML? My goal is to take input from the user and check that the input is HTML and not just a plain string.

Example of a non-HTML string:

sdkjshdk<div>jd</h3>ivdfadfsdf or sdkjshdkivdfadfsdf

Example of an HTML string:

<div>sdfsdfsdf<label>dghdhdgh</label> fdsgfgdfgfd</div>

Thanks

Answered by Eineki

Maybe you need to check whether the string is well formed.

I would use a function like this:

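function check($string) {
  // Locate the first '<' and the last '>' in the input
  $start = strpos($string, '<');
  $end = strrpos($string, '>');

  // No tag-like content at all: treat the input as a plain string
  if ($start === false || $end === false || $end < $start) {
    return false;
  }

  // Keep only the chunk from the first '<' to the last '>'
  $string = substr($string, $start, $end - $start + 1);

  // Collect libxml errors instead of emitting PHP warnings
  libxml_use_internal_errors(true);
  libxml_clear_errors();
  $xml = simplexml_load_string($string);
  return count(libxml_get_errors()) == 0;
}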

Just a warning: HTML permits unbalanced markup like the following. It is not a valid XML chunk, but it is a legal HTML chunk:

<ul><li>Hi<li> I'm another li</li></ul>
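
To see the difference in practice, here is a minimal sketch that feeds the chunk above to both the XML parser and the HTML parser. The helper names parsesAsXml and parsesAsHtml are made up for illustration and are not from the answer. What the HTML parser reports depends on the libxml version PHP is built against, so no result is claimed for that call.

```php
<?php
// Hypothetical helpers for illustration; not part of the original answer.

// True when the strict XML parser accepts the chunk.
function parsesAsXml(string $chunk): bool {
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();
    simplexml_load_string($chunk);
    $ok = count(libxml_get_errors()) === 0;
    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the caller's setting
    return $ok;
}

// True when the lenient HTML parser reports no errors.
// Assumes non-empty input (loadHTML rejects an empty string in PHP 8).
function parsesAsHtml(string $chunk): bool {
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();
    $dom = new DOMDocument();
    $dom->loadHTML($chunk);
    $ok = count(libxml_get_errors()) === 0;
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $ok;
}

$chunk = "<ul><li>Hi<li> I'm another li</li></ul>";
var_dump(parsesAsXml($chunk));  // bool(false): the first <li> is never closed
var_dump(parsesAsHtml($chunk)); // depends on the libxml build
```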

Disclaimer: I've modified the code (without testing it) in order to detect well-formed HTML inside the string.

A last thought: maybe you should use strip_tags to control user input (as I've seen in your comments).

Answered by a1ex07

You can use DOMDocument's loadHTML method.
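
As a rough sketch of how that could look, the helper below (htmlParseErrors is a hypothetical name, not something from the answer) runs loadHTML and returns whatever messages libxml collected. Which messages appear for a given input depends on the libxml version, so treat the list as diagnostic output rather than a strict verdict.

```php
<?php
// Hypothetical helper: collects the messages libxml reports while
// DOMDocument::loadHTML parses the given string.
function htmlParseErrors(string $html): array {
    if (trim($html) === '') {
        return ['empty input']; // loadHTML rejects an empty string
    }

    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        // Each entry is a LibXMLError with line and message properties
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the caller's setting
    return $messages;
}

// The non-HTML example from the question
print_r(htmlParseErrors('sdkjshdk<div>jd</h3>ivdfadfsdf'));
```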

Answered by Diogo Gomes

simplexml_load_string will fail if you don't have a single root node. So if you try this HTML:

<p>A</p><p>B</p>

it will be invalid.

Here's my function:

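function check($string){
    // Locate the first '<' and the last '>' in the input
    $start = strpos($string, '<');
    $end = strrpos($string, '>');

    // No tag-like content at all: treat the input as a plain string
    if ($start === false || $end === false || $end < $start) {
        return false;
    }

    // Keep only the chunk from the first '<' to the last '>'
    $string = substr($string, $start, $end - $start + 1);

    // xml requires one root node
    $string = "<div>$string</div>";

    // Collect libxml errors instead of emitting PHP warnings
    libxml_use_internal_errors(true);
    libxml_clear_errors();
    simplexml_load_string($string);

    return count(libxml_get_errors()) == 0;
}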

Answered by CaseySoftware

Do you mean HTML or XHTML?

The HTML standard and its interpretation are so loose that your first snippet might work. It won't be pretty, but you might get something.

XHTML is quite a bit stricter and at minimum will expect your snippet to be well formed (all opened tags are closed; tags can nest but not overlap), and it may throw warnings if you have unrecognized elements or attributes.

Something like Tidy (http://php.net/manual/en/book.tidy.php) is probably a good start. Once you load your snippet using that, you can use tidy_error_count or tidy_get_error_buffer to see if it's "okay enough" for your needs.
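
A minimal sketch of that idea, assuming the optional tidy extension is installed (the helper name tidyReport and the shape of the returned array are my own choices for illustration). Tidy separates errors from warnings, and many markup problems may show up only as warnings, so the sketch returns both counts along with the log and leaves the "okay enough" decision to the caller.

```php
<?php
// Hypothetical helper; requires the optional tidy extension.
// Returns null when tidy is unavailable or the input cannot be parsed.
function tidyReport(string $html): ?array {
    if (!function_exists('tidy_parse_string')) {
        return null; // tidy extension not installed
    }

    $tidy = tidy_parse_string($html, [], 'utf8');
    if ($tidy === false) {
        return null;
    }

    return [
        'errors'   => tidy_error_count($tidy),
        'warnings' => tidy_warning_count($tidy),
        'log'      => (string) tidy_get_error_buffer($tidy),
    ];
}

// The non-HTML example from the question
var_dump(tidyReport('sdkjshdk<div>jd</h3>ivdfadfsdf'));
```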

Answered by Iznogood

Are you trying to prevent users from posting HTML tags instead of plain strings? Because if that is what you want to do, you just need strip_tags()

Which will remove any HTML tags from the string.
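
For example, a small sketch using the strings from the question (containsTags is a hypothetical helper name). It only tells you whether tag-like markup is present, not whether that markup is valid.

```php
<?php
// Hypothetical helper: true when strip_tags() would remove something.
function containsTags(string $input): bool {
    return strip_tags($input) !== $input;
}

var_dump(containsTags('sdkjshdkivdfadfsdf'));   // bool(false)
var_dump(containsTags('<div>sdfsdfsdf</div>')); // bool(true)

// Removing the tags from the HTML example in the question
echo strip_tags('<div>sdfsdfsdf<label>dghdhdgh</label> fdsgfgdfgfd</div>'), "\n";
// prints: sdfsdfsdfdghdhdgh fdsgfgdfgfd
```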

Answered by Ernesto Valentin Caamal Peech

You should use:

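$html = "<html><body><p>This is array.</p><br></body></html>";

// Collect libxml errors instead of emitting PHP warnings
libxml_use_internal_errors(true);
libxml_clear_errors(); // discard errors left over from earlier parsing
$dom = new DOMDocument();
$dom->loadHTML($html);
if (empty(libxml_get_errors())) {
  echo "This is good HTML";
} else {
  echo "This is not HTML";
}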

Answered by Ali Asgari

If you also want to make your site secure, you certainly have to use an HTML purifier such as htmlpurifier, tidy, etc.