How to strip HTML tags from a string in JavaScript?
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/5002111/
Asked by f.ardelian
How can I strip the HTML from a string in JavaScript?
Answered by ReactiveRaven
cleanText = strInputCode.replace(/<\/?[^>]+(>|$)/g, "");
Distilled from this website (web.archive).
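As a rough illustration of what this pattern does and does not handle, here is a minimal sketch. The function name stripTagsWithRegex is a hypothetical wrapper chosen for this example; only the regular expression comes from the answer. The sample inputs suggest the pattern is fine for simple markup but does not decode entities and can be confused by a ">" inside an attribute value.

```javascript
// Hypothetical helper name; the pattern itself is the one from the answer above.
function stripTagsWithRegex(input) {
  // Removes anything that looks like <tag ...> or </tag>,
  // plus an unterminated tag at the very end of the string.
  return String(input).replace(/<\/?[^>]+(>|$)/g, "");
}

console.log(stripTagsWithRegex("<p>Hello, <b>World</b></p>"));  // Hello, World
console.log(stripTagsWithRegex("Tom &amp; Jerry"));             // Tom &amp; Jerry (entities stay encoded)
console.log(stripTagsWithRegex('<a title="a>b">link</a>'));     // b">link (the ">" in the attribute ends the match early)
```

If the input may contain entities or attribute values with angle brackets, a parser-based approach is probably the safer choice.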
Answered by Tim Down
Using the browser's parser is probably the best bet in current browsers. The following will work, subject to these caveats:
- Your HTML is valid within a <div> element. HTML contained within <body> or <html> or <head> tags is not valid within a <div> and may therefore not be parsed correctly.
- textContent (the DOM standard property) and innerText (non-standard) properties are not identical. For example, textContent will include text within a <script> element while innerText will not (in most browsers). This only affects IE <=8, which is the only major browser not to support textContent.
- The HTML does not contain <script> elements.
- The HTML is not null.
- The HTML comes from a trusted source. Using this with arbitrary HTML allows arbitrary untrusted JavaScript to be executed. This example is from a comment by Mike Samuel on the duplicate question:
<img onerror='alert(\"could run arbitrary JS here\")' src=bogus>
Code:
var html = "<p>Some HTML</p>";
var div = document.createElement("div");
div.innerHTML = html;
var text = div.textContent || div.innerText || "";
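Two of the caveats above (null input and untrusted HTML) can be softened with a small variation. The sketch below is not the answer's method: it swaps in the standard DOMParser API, which parses into a separate document where scripts and inline event handlers such as onerror are not expected to run. The function name htmlToText is hypothetical, and the code assumes a browser environment where DOMParser exists.

```javascript
// Hypothetical helper; uses DOMParser instead of a detached <div>.
function htmlToText(html) {
  // Caveat from the list above: guard against null (and undefined) input.
  if (html === null || html === undefined) {
    return "";
  }
  // Assumption: this runs in a browser. Outside one, DOMParser is not defined.
  if (typeof DOMParser === "undefined") {
    throw new Error("htmlToText needs a browser DOMParser");
  }
  // Parse into a separate, inert document rather than the live page.
  var doc = new DOMParser().parseFromString(String(html), "text/html");
  // textContent still includes the text of any <script> elements in the body.
  return doc.body.textContent || "";
}
```

In a browser, htmlToText("<p>Some HTML</p>") should give the same text as the snippet above. For genuinely hostile input, treating the result as plain text when displaying it is still advisable.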
Answered by Felix
var html = "<p>Hello, <b>World</b>";
var div = document.createElement("div");
div.innerHTML = html;
alert(div.innerText); // Hello, World
That's pretty much the best way of doing it: you're letting the browser do what it does best, which is parsing HTML.
Edit: As noted in the comments below, this is not the most cross-browser solution. The most cross-browser solution would be to recursively go through all the children of the element and concatenate all text nodes that you find. However, if you're using jQuery, it already does it for you:
alert($("<p>Hello, <b>World</b></p>").text());
Check out the text method.
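For readers not using jQuery, the recursive walk mentioned in the edit above could look roughly like the following. This is a minimal sketch rather than the answer's own code: the name collectText is hypothetical, and the function relies only on the standard nodeType, nodeValue and childNodes properties, so it works on any node-like object.

```javascript
var TEXT_NODE = 3; // numeric value of Node.TEXT_NODE in the DOM standard

// Hypothetical helper: concatenates the values of all text nodes under `node`.
function collectText(node) {
  if (node.nodeType === TEXT_NODE) {
    return node.nodeValue;
  }
  var text = "";
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    text += collectText(children[i]); // recurse into elements, skip nothing
  }
  return text;
}

// Browser usage (assumes a DOM is available):
//   var div = document.createElement("div");
//   div.innerHTML = "<p>Hello, <b>World</b></p>";
//   alert(collectText(div));
```

Like textContent, this walk also picks up text inside <script> and <style> elements, so those would need to be skipped explicitly if they may appear in the input.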
Answered by Till
I know this question has an accepted answer, but I feel that it doesn't work in all cases.
For completeness and since I spent too much time on this, here is what we did: we ended up using a function from php.js (which is a pretty nice library for those more familiar with PHP but also doing a little JavaScript every now and then):
http://phpjs.org/functions/strip_tags:535
It seemed to be the only piece of JavaScript code which successfully dealt with all the different kinds of input I stuffed into my application. That is, without breaking it – see my comments about the <script /> tag above.