PHP: How to remove HTML special chars?

Note: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original question: http://stackoverflow.com/questions/657643/

Time: 2020-08-24 23:23:50  Source: igfitidea

How to remove html special chars?

php html-encode

Asked by Prashant

I am creating an RSS feed file for my application in which I want to remove HTML tags, which is done by strip_tags. But strip_tags is not removing HTML special code chars:

  &nbsp; &amp; &copy;

etc.

Please tell me any function which I can use to remove these special code chars from my string.

Answered by schnaader

Either decode them using html_entity_decode or remove them using preg_replace:

$Content = preg_replace("/&#?[a-z0-9]+;/i","",$Content); 

(From here)

EDIT: Alternative according to Jacco's comment

It might be nice to replace the '+' with {2,8} or something similar. This limits the chance of removing entire sentences when an unencoded '&' is present.

$Content = preg_replace("/&#?[a-z0-9]{2,8};/i","",$Content); 
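
To see what the bounded quantifier buys you, here is a small sketch. The sample string is made up for illustration, with an unencoded '&' that happens to be followed by a ';' further along:

```php
<?php
// Hypothetical sample: an unencoded "&" followed much later by a ";".
$content = "Fish&chipsandmushypeas; &copy; 2009";

// Unbounded: any run of letters/digits between "&" and ";" is removed.
$greedy = preg_replace("/&#?[a-z0-9]+;/i", "", $content);

// Bounded: only entity-sized names (2 to 8 characters) are removed.
$bounded = preg_replace("/&#?[a-z0-9]{2,8};/i", "", $content);

var_dump($greedy);
var_dump($bounded);
```

The unbounded pattern also eats "&chipsandmushypeas;", while the bounded one only removes "&copy;". Neither pattern checks that the name is a real entity, so this is a heuristic rather than a parser.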

Answered by andi

Use html_entity_decode to convert HTML entities.

You'll need to set charset to make it work correctly.
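
A minimal sketch of what setting the charset can look like, assuming the feed is UTF-8 and PHP 7.0 or later (for the "\u{...}" escape). The input string is invented for the example:

```php
<?php
// Hypothetical input, similar to the entities in the question.
$encoded = "Tom &amp; Jerry&nbsp;&copy; 2009";

// Decode named and numeric entities into UTF-8 characters.
$decoded = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// &nbsp; decodes to U+00A0 (non-breaking space), not to a plain space,
// so replace it explicitly if ordinary spaces are wanted.
$plain = str_replace("\u{00A0}", " ", $decoded);

echo $plain, PHP_EOL;
```

If the target encoding cannot represent a character, html_entity_decode leaves that entity undecoded, which is one reason the charset argument matters.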

Answered by gpkamp

In addition to the good answers above, PHP also has a built-in filter function that is quite useful: filter_var.

To remove HTML characters, use:

$cleanString = filter_var($dirtyString, FILTER_SANITIZE_STRING);

Note that FILTER_SANITIZE_STRING is deprecated as of PHP 8.1.

More info:

  1. function.filter-var
  2. filter_sanitize_string

Answered by 0xFF

You may want to take a look at htmlentities() and html_entity_decode():

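$orig = "I'll \"walk\" the <b>dog</b> now";

// ENT_COMPAT is passed explicitly: it leaves single quotes alone, which was the default before PHP 8.1
$a = htmlentities($orig, ENT_COMPAT);
$b = html_entity_decode($a, ENT_COMPAT);

echo $a; // I'll &quot;walk&quot; the &lt;b&gt;dog&lt;/b&gt; now
echo $b; // I'll "walk" the <b>dog</b> now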

Answered by Vinit Kadkol

This might work well to remove special characters.

$modifiedString = preg_replace("/[^a-zA-Z0-9_.\s-]/", "", $content);
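
A quick sketch of what a whitelist like this does to entity text. The input is made up, and the hyphen is written last in the character class so that it is read as a literal "-" rather than as part of a range:

```php
<?php
// Hypothetical input: one accented letter and one HTML entity.
$content = "Caf\u{00E9} &amp; bar";

// Hyphen placed last in the class so it is read as a literal "-".
$modifiedString = preg_replace('/[^a-zA-Z0-9_.\s-]/', '', $content);

var_dump($modifiedString);
```

Only the "&" and ";" of the entity are removed, so the name "amp" is left behind in the text, and the accented letter is dropped as well. Whether that is acceptable depends on the feed content.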

Answered by Gwapz Juan

What I did was use html_entity_decode first, then strip_tags to remove the tags.
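
Roughly, that two-step approach could be sketched like this. The function name plainTextForFeed and the whitespace clean-up step are assumptions added for the example, not part of the answer:

```php
<?php
// Hypothetical helper name; not part of the original answer.
function plainTextForFeed(string $html): string
{
    // 1. Turn entities such as &nbsp; or &copy; into real characters.
    $text = html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // 2. Remove the markup itself.
    $text = strip_tags($text);
    // 3. Collapse whitespace, including non-breaking spaces, into single spaces.
    //    preg_replace returns null on invalid UTF-8, so fall back to $text.
    $collapsed = preg_replace('/[\s\x{00A0}]+/u', ' ', $text);
    return trim($collapsed ?? $text);
}

$feedText = plainTextForFeed('<p>Tom &amp; Jerry&nbsp;&copy; <b>2009</b></p>');
echo $feedText, PHP_EOL;
```

One thing to keep in mind with this order: text that was deliberately encoded, such as "&lt;b&gt;", turns into a real tag after decoding and is then stripped too.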

Answered by RaGu

Try this:

<?php
$str = "\x8F!!!";

// Outputs an empty string
echo htmlentities($str, ENT_QUOTES, "UTF-8");

// Outputs "!!!"
echo htmlentities($str, ENT_QUOTES | ENT_IGNORE, "UTF-8");
?>

Answered by karim79

A plain vanilla strings way to do it without engaging the preg regex engine:

function remEntities($str) {
  // Find the first ampersand
  $amp_pos = strpos($str, '&');
  if ($amp_pos === false) {
    return $str;
  }
  // Find the first ; after that ampersand (a ; before it is ignored)
  $semi_pos = strpos($str, ';', $amp_pos);
  if ($semi_pos === false) {
    return $str;
  }
  // Treat everything from & to ; as an HTML entity and cut it out
  $str = substr($str, 0, $amp_pos) . substr($str, $semi_pos + 1);
  // Has another entity in it? Repeat until no "&...;" span is left
  return remEntities($str);
}

Answered by Jacco

It looks like what you really want is:

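function xmlEntities($string) {
    // Ask for the UTF-8 table explicitly so that mb_ord() below sees UTF-8 characters
    $translationTable = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, 'UTF-8');

    $from = array();
    $to = array();
    foreach ($translationTable as $char => $entity) {
        $from[] = $entity;
        // mb_ord() (PHP 7.2+, mbstring) returns the full code point;
        // ord() would only return the first byte of a multibyte character
        $to[] = '&#'.mb_ord($char, 'UTF-8').';';
    }
    return str_replace($from, $to, $string);
}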

It replaces named entities with their numeric equivalents.
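
The likely reason this matters for RSS is that XML itself only predefines five named entities (&amp;amp;, &amp;lt;, &amp;gt;, &amp;quot; and &amp;apos;). As a different route to the same goal, and not the method from this answer, one could decode everything to UTF-8 and then re-encode only what XML reserves. A sketch, assuming a UTF-8 feed and a made-up helper name xmlSafeText:

```php
<?php
// Hypothetical helper name, shown as an alternative to the table-based approach.
function xmlSafeText(string $text): string
{
    // Decode every HTML entity into its UTF-8 character first...
    $decoded = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // ...then re-encode only the characters XML itself reserves.
    return htmlspecialchars($decoded, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

$safe = xmlSafeText('&copy; 2009 Tom &amp; Jerry');
echo $safe, PHP_EOL;
```

Here "&copy;" ends up as a literal © character, which is fine in a UTF-8 document, while the ampersand stays encoded as "&amp;".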

Answered by jahanzaib

<?php
function strip_only($str, $tags, $stripContent = false) {
    $content = '';
    if (!is_array($tags)) {
        // Accept either a single tag name ('font') or a list such as '<b><i>'
        $tags = (strpos($tags, '>') !== false
                 ? explode('>', str_replace('<', '', $tags))
                 : array($tags));
        if (end($tags) == '') array_pop($tags);
    }
    foreach ($tags as $tag) {
        if ($stripContent) {
            // Also match the element's content; non-greedy so it stops at the first closing tag
            $content = '(.+?</'.$tag.'[^>]*>|)';
        }
        $str = preg_replace('#</?'.$tag.'[^>]*>'.$content.'#is', '', $str);
    }
    return $str;
}

$str = '<font color="red">red</font> text';
$tags = 'font';
$a = strip_only($str, $tags); // red text
$b = strip_only($str, $tags, true); // text
?>