How to strip HTML tags from a string in JavaScript?

Question

Asked by f.ardelian

How can I strip the HTML from a string in JavaScript?


Answer 1

Answered by ReactiveRaven

cleanText = strInputCode.replace(/<\/?[^>]+(>|$)/g, "");

Distilled from this website (web.archive).
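
As a rough sketch, the one-liner above can be wrapped in a small helper. The function name stripTagsWithRegex is hypothetical and not part of the original answer. The regex only deletes tag-like substrings: it does not decode entities such as &amp;, and it also removes an unterminated tag at the end of the input.

```javascript
// Hypothetical helper wrapping the regex from the answer above.
function stripTagsWithRegex(input) {
  // Matches "<", an optional "/", one or more non-">" characters,
  // and then either ">" or the end of the string.
  return String(input).replace(/<\/?[^>]+(>|$)/g, "");
}

console.log(stripTagsWithRegex("<p>Hello, <b>World</b></p>")); // Hello, World
console.log(stripTagsWithRegex("a &amp; b")); // a &amp; b (entities are left as-is)
```

Because this never touches the DOM, it runs outside a browser and cannot execute embedded scripts, but it is not a real HTML parser and should not be treated as a sanitizer.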


Answer 2

Answered by Tim Down

Using the browser's parser is probably the best bet in current browsers. The following will work, with these caveats:


Your HTML is valid within a <div> element. HTML contained within <body>, <html> or <head> tags is not valid within a <div> and may therefore not be parsed correctly.
The textContent (the DOM standard property) and innerText (non-standard) properties are not identical. For example, textContent will include text within a <script> element while innerText will not (in most browsers). This only affects IE <=8, which is the only major browser not to support textContent.
The HTML does not contain <script> elements.
The HTML is not null.
The HTML comes from a trusted source. Using this with arbitrary HTML allows arbitrary untrusted JavaScript to be executed. This example is from a comment by Mike Samuel on the duplicate question: <img onerror='alert("could run arbitrary JS here")' src=bogus>


Code:


var html = "<p>Some HTML</p>";
var div = document.createElement("div");
div.innerHTML = html;
var text = div.textContent || div.innerText || "";
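
To address the "HTML is not null" caveat, the snippet above could be wrapped in a function with a guard. This is only a sketch: the name stripHtml and the optional doc parameter (which defaults to the global document) are assumptions added for illustration. It still assigns to innerHTML, so the trusted-source caveat applies unchanged.

```javascript
// Hypothetical wrapper around the snippet above; not from the original answer.
// `doc` is an assumed optional parameter, defaulting to the browser's document.
function stripHtml(html, doc) {
  // Guard against null/undefined before touching the DOM.
  if (html == null) {
    return "";
  }
  doc = doc || document;
  var div = doc.createElement("div");
  // WARNING: still unsafe for untrusted input (e.g. onerror handlers may run).
  div.innerHTML = String(html);
  // textContent is the standard property; innerText covers IE <= 8.
  return div.textContent || div.innerText || "";
}
```

In a browser, stripHtml("<p>Some HTML</p>") is called with a single argument; the second parameter is there only so a different document object can be supplied.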

Answer 3

Answered by Felix

var html = "<p>Hello, <b>World</b>";
var div = document.createElement("div");
div.innerHTML = html;
alert(div.innerText); // Hello, World

That's pretty much the best way of doing it: you're letting the browser do what it does best, which is parse HTML.


Edit: As noted in the comments below, this is not the most cross-browser solution. The most cross-browser solution would be to recursively go through all the children of the element and concatenate all text nodes that you find. However, if you're using jQuery, it already does it for you:


alert($("<p>Hello, <b>World</b></p>").text());

Check out the text method.
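
The recursive traversal mentioned in the edit above (walk all children and concatenate every text node) might look roughly like the following. The name collectText is hypothetical; the function relies only on the standard nodeType, nodeValue and childNodes properties. Like textContent, it would also include text inside <script> elements.

```javascript
var TEXT_NODE = 3; // value of Node.TEXT_NODE in the DOM

// Hypothetical helper: concatenates the values of all text nodes under `node`.
function collectText(node) {
  if (node.nodeType === TEXT_NODE) {
    return node.nodeValue;
  }
  var text = "";
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    text += collectText(children[i]);
  }
  return text;
}

// Browser usage (assumes a DOM is available):
//   var div = document.createElement("div");
//   div.innerHTML = "<p>Hello, <b>World</b></p>";
//   alert(collectText(div)); // Hello, World
```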


Answer 4

Answered by Till

I know this question has an accepted answer, but I feel that it doesn't work in all cases.


For completeness, and since I spent too much time on this, here is what we did: we ended up using a function from php.js (which is a pretty nice library for those more familiar with PHP who also do a little JavaScript every now and then):


http://phpjs.org/functions/strip_tags:535

It seemed to be the only piece of JavaScript code which successfully dealt with all the different kinds of input I stuffed into my application. That is, without breaking it – see my comments about the <script /> tag above.

