PHP: how to remove HTML special characters?
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/657643/
How to remove HTML special chars?
Asked by Prashant
I am creating an RSS feed file for my application, and I want to remove the HTML tags from it, which strip_tags does. But strip_tags does not remove HTML special code chars (entities) such as:
&nbsp; &amp; &copy;
etc.
Please tell me any function which I can use to remove these special code chars from my string.
Answered by schnaader
Either decode them using html_entity_decode or remove them using preg_replace:
$Content = preg_replace("/&#?[a-z0-9]+;/i","",$Content);
(From here)
EDIT: Alternative according to Jacco's comment:
It might be nice to replace the '+' with {2,8} or something similar. This limits the chance of replacing entire sentences when an unencoded '&' is present.
$Content = preg_replace("/&#?[a-z0-9]{2,8};/i","",$Content);
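To make the difference between the two options concrete, here is a minimal sketch that runs both on the same string. The sample text and the helper name removeEntities are made up for this illustration; the pattern is the bounded variant shown above.

```php
<?php
// Hypothetical helper wrapping the bounded pattern from this answer.
function removeEntities(string $text): string
{
    // Deletes named (&copy;) and numeric (&#169;) entities outright.
    return preg_replace('/&#?[a-z0-9]{2,8};/i', '', $text);
}

// Made-up sample input.
$sample = 'Fish &amp; chips &copy; 2009';

// Option 1: remove the entities. The characters they stood for are lost,
// so doubled spaces are left behind.
$removed = removeEntities($sample);

// Option 2: decode the entities into the characters they represent.
$decoded = html_entity_decode($sample, ENT_QUOTES, 'UTF-8');

echo $removed, "\n"; // Fish  chips  2009
echo $decoded, "\n"; // Fish & chips © 2009
```

Removing is lossy while decoding keeps the text readable, so which one fits depends on whether the feed should still show the ampersand and the copyright sign.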
Answered by andi
Use html_entity_decode to convert HTML entities.
You'll need to set the charset to make it work correctly.
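As a small sketch of why the charset matters, the snippet below decodes the same entity into two target encodings and prints the resulting bytes. The sample string is made up; the third argument of html_entity_decode is the charset this answer refers to.

```php
<?php
// Made-up sample containing one named entity.
$encoded = 'caf&eacute;';

// Decoded to UTF-8, "é" becomes the two bytes C3 A9.
$utf8 = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

// Decoded to ISO-8859-1, "é" becomes the single byte E9.
$latin1 = html_entity_decode($encoded, ENT_QUOTES, 'ISO-8859-1');

echo bin2hex($utf8), "\n";   // 636166c3a9
echo bin2hex($latin1), "\n"; // 636166e9
```

If the charset passed here does not match the encoding the feed declares, the decoded characters are likely to show up garbled.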
Answered by gpkamp
In addition to the good answers above, PHP also has a quite useful built-in filter function: filter_var.
To remove HTML characters, use the following (note that FILTER_SANITIZE_STRING is deprecated as of PHP 8.1):
$cleanString = filter_var($dirtyString, FILTER_SANITIZE_STRING);
Answered by 0xFF
You may want to take a look at htmlentities() and html_entity_decode():
$orig = "I'll \"walk\" the <b>dog</b> now";
$a = htmlentities($orig);
$b = html_entity_decode($a);
echo $a; // I'll &quot;walk&quot; the &lt;b&gt;dog&lt;/b&gt; now (PHP 8.1+ also turns ' into &#039;)
echo $b; // I'll "walk" the <b>dog</b> now
Answered by Vinit Kadkol
This might work well to remove special characters:
$modifiedString = preg_replace("/[^a-zA-Z0-9_.\s-]/", "", $content);
Answered by Gwapz Juan
What I did was run html_entity_decode first, then strip_tags to remove them.
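The two-step approach could look roughly like the sketch below. The sample markup is made up, and the final re-escaping step is an assumption about how the text ends up in the RSS file, not part of this answer.

```php
<?php
// Made-up sample: markup plus entities, as might come from an article body.
$html = '<p>Tom &amp; Jerry &copy; 1940</p>';

// 1. Turn entities into real characters: '<p>Tom & Jerry © 1940</p>'
$decoded = html_entity_decode($html, ENT_QUOTES, 'UTF-8');

// 2. Drop the tags, leaving plain text.
$plain = strip_tags($decoded);

// 3. Assumption: the text goes into an XML/RSS element, so the
//    XML-significant characters are escaped again.
$forFeed = htmlspecialchars($plain, ENT_XML1 | ENT_QUOTES, 'UTF-8');

echo $plain, "\n";   // Tom & Jerry © 1940
echo $forFeed, "\n"; // Tom &amp; Jerry © 1940
```

One thing to keep in mind with this order: markup that was escaped in the input, such as &lt;b&gt;, becomes a real tag after decoding and is then stripped as well.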
Answered by RaGu
Try this:
<?php
$str = "\x8F!!!";
// Outputs an empty string
echo htmlentities($str, ENT_QUOTES, "UTF-8");
// Outputs "!!!"
echo htmlentities($str, ENT_QUOTES | ENT_IGNORE, "UTF-8");
?>
Answered by karim79
A plain-vanilla string approach that avoids the preg regex engine:
function remEntities($str) {
    // Find the first ampersand
    $amp_pos = strpos($str, '&');
    if ($amp_pos === false) {
        return $str;
    }
    // Find the first semicolon after that ampersand
    $semi_pos = strpos($str, ';', $amp_pos);
    if ($semi_pos === false) {
        return $str;
    }
    // Treat everything from the & to the ; as an HTML entity and cut it out
    $tmp = substr($str, 0, $amp_pos) . substr($str, $semi_pos + 1);
    // Recurse in case another entity remains
    return remEntities($tmp);
}
Answered by Jacco
It looks like what you really want is:
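function xmlEntities($string) {
    // Maps each character to its named entity, e.g. '©' => '&copy;'
    $translationTable = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, 'UTF-8');
    $from = array();
    $to = array();
    foreach ($translationTable as $char => $entity) {
        $from[] = $entity;
        // mb_ord() (PHP 7.2+, mbstring) returns the full code point;
        // ord() would only look at the first byte of a multibyte character
        $to[] = '&#' . mb_ord($char, 'UTF-8') . ';';
    }
    return str_replace($from, $to, $string);
}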
It replaces named entities with their numeric equivalents.
Answered by jahanzaib
<?php
// Strip only the given tags from $str; with $stripContent, also drop what they enclose.
function strip_only($str, $tags, $stripContent = false) {
    $content = '';
    if (!is_array($tags)) {
        // Accept a tag list written as '<a><b>' as well as a single bare tag name
        $tags = (strpos($tags, '>') !== false
            ? explode('>', str_replace('<', '', $tags))
            : array($tags));
        if (end($tags) == '') array_pop($tags);
    }
    foreach ($tags as $tag) {
        if ($stripContent)
            $content = '(.+</'.$tag.'[^>]*>|)';
        $str = preg_replace('#</?'.$tag.'[^>]*>'.$content.'#is', '', $str);
    }
    return $str;
}

$str = '<font color="red">red</font> text';
$tags = 'font';
$a = strip_only($str, $tags); // red text
$b = strip_only($str, $tags, true); // text
?>

