Original question: http://stackoverflow.com/questions/5904914/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Javascript Regex to replace text NOT in html attributes
Asked by m14t
I'd like a JavaScript regex to wrap a given list of words in a given start tag (i.e. <span>) and end tag (i.e. </span>), but only if the word is actually "visible text" on the page, and not inside an HTML attribute (such as a link's title attribute) or inside a <script></script> block.
I've created a JS Fiddle with the basic setup: http://jsfiddle.net/4YCR6/1/
Answered by T.J. Crowder
HTML is too complex to reliably parse with a regular expression.
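To make the problem concrete, here is a minimal sketch of what goes wrong with a plain string replace. The sample markup and the variable names are only illustrative assumptions; the point is that a regex applied to the raw HTML string cannot tell text content from attribute values.

```javascript
// A small sample string with the word "test" in text AND in attributes
var html =
    "<p>This is a test.</p>" +
    "<form><input type='text' value='test value'></form>" +
    "<p class='testing test'>Testing here too</p>";

// Naive approach: replace every occurrence in the raw HTML string
var naive = html.replace(/test/g, "<span>test</span>");

// The visible text is wrapped, but so are the attribute values,
// which leaves markup inside value='...' and class='...'
console.log(naive);
```

The `value` and `class` attributes end up containing `<span>` markup, which is exactly the corruption the question wants to avoid.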
If you're looking to do this client-side, you can create a document fragment and/or disconnected DOM node (neither of which is displayed anywhere) and initialize it with your HTML string, then walk through the resulting DOM tree and process the text nodes. (Or use a library to help you do that, although it's actually quite simple.)
Here's a DOM walking example. This example is slightly simpler than your problem because it just updates the text; it doesn't add new elements to the structure (wrapping parts of the text in spans involves updating the structure), but it should get you going. Notes on what you'll need to change are at the end.
var html =
    "<p>This is a test.</p>" +
    "<form><input type='text' value='test value'></form>" +
    "<p class='testing test'>Testing here too</p>";
var frag = document.createDocumentFragment();
var body = document.createElement('body');
var node, next;

// Turn the HTML string into a DOM tree
body.innerHTML = html;

// Walk the dom looking for the given text in text nodes
walk(body);

// Insert the result into the current document via a fragment
node = body.firstChild;
while (node) {
    next = node.nextSibling;
    frag.appendChild(node);
    node = next;
}
document.body.appendChild(frag);

// Our walker function
function walk(node) {
    var child, next;

    switch (node.nodeType) {
        case 1:  // Element
        case 9:  // Document
        case 11: // Document fragment
            child = node.firstChild;
            while (child) {
                next = child.nextSibling;
                walk(child);
                child = next;
            }
            break;
        case 3:  // Text node
            handleText(node);
            break;
    }
}

function handleText(textNode) {
    textNode.nodeValue = textNode.nodeValue.replace(/test/gi, "TEST");
}
The changes you'll need to make will be in handleText. Specifically, rather than updating nodeValue, you'll need to:
- Find the index of the beginning of each word within the nodeValue string.
- Use Text#splitText to split the text node into up to three text nodes (the part before your matching text, the part that is your matching text, and the part following your matching text).
- Use document.createElement to create the new span (this is literally just span = document.createElement('span')).
- Use Node#insertBefore to insert the new span in front of the third text node (the one containing the text following your matched text); it's okay if you didn't need to create a third node because your matched text was at the end of the text node, in that case pass the matched text node's nextSibling as the refChild (it is null when nothing follows).
- Use Node#appendChild to move the second text node (the one with the matching text) into the span. (No need to remove it from its parent first; appendChild does that for you.)
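The steps above could be sketched roughly as follows. This is not the answerer's code: `findMatches` is a hypothetical helper, the extra `word` parameter on `handleText` is an assumption made so the sketch is self-contained, and matching is a plain case-sensitive substring search rather than a regex.

```javascript
// Step 1: find where each occurrence of `word` starts in a string.
// `findMatches` is a hypothetical helper name used only in this sketch.
function findMatches(text, word) {
    var starts = [];
    if (!word) {
        return starts; // an empty word would match everywhere
    }
    var index = text.indexOf(word);
    while (index !== -1) {
        starts.push(index);
        index = text.indexOf(word, index + word.length);
    }
    return starts;
}

// Steps 2 to 5: wrap every occurrence of `word` in `textNode` in a span.
function handleText(textNode, word) {
    var parent = textNode.parentNode;
    if (!parent) {
        return; // detached text node, nowhere to insert the span
    }
    var starts = findMatches(textNode.nodeValue, word);
    // Go from the last match to the first so the remaining indexes
    // still refer to the (shrinking) original text node.
    for (var i = starts.length - 1; i >= 0; i--) {
        // Step 2: split off the match, then whatever follows it
        var matched = textNode.splitText(starts[i]);
        if (matched.length > word.length) {
            matched.splitText(word.length);
        }
        // Step 3: create the wrapper element
        var span = textNode.ownerDocument.createElement("span");
        // Step 4: put the span right after the matched text; nextSibling
        // is null when nothing follows, which makes insertBefore append
        parent.insertBefore(span, matched.nextSibling);
        // Step 5: moving the matched node also removes it from `parent`
        span.appendChild(matched);
    }
}
```

With the walker from the example, the text-node case would then call something like `handleText(node, "test")`. Because the walker reads `nextSibling` before it visits each child, the spans created here are not visited again.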
Answered by Tim Down
T.J. Crowder's answer is correct. I've gone a little further code-wise: here's a fully-formed example that works in all major browsers. I've posted variations of this code on Stack Overflow before (here and here, for example), and made it nice and generic so I (or anyone else) don't have to change it much to reuse it.
jsFiddle example: http://jsfiddle.net/7Vf5J/38/
Code:
// Reusable generic function
function surroundInElement(el, regex, surrounderCreateFunc) {
    // script and style elements are left alone
    // (tagName is upper case in HTML documents, hence the "i" flag)
    if (!/^(script|style)$/i.test(el.tagName)) {
        var child = el.lastChild;
        while (child) {
            if (child.nodeType == 1) {
                surroundInElement(child, regex, surrounderCreateFunc);
            } else if (child.nodeType == 3) {
                surroundMatchingText(child, regex, surrounderCreateFunc);
            }
            child = child.previousSibling;
        }
    }
}

// Reusable generic function
function surroundMatchingText(textNode, regex, surrounderCreateFunc) {
    var parent = textNode.parentNode;
    var result, surroundingNode, matchedTextNode, matchLength, matchedText;
    while ( textNode && (result = regex.exec(textNode.data)) ) {
        matchedTextNode = textNode.splitText(result.index);
        matchedText = result[0];
        matchLength = matchedText.length;
        textNode = (matchedTextNode.length > matchLength) ?
            matchedTextNode.splitText(matchLength) : null;
        // Ensure searching starts at the beginning of the text node
        regex.lastIndex = 0;
        surroundingNode = surrounderCreateFunc(matchedTextNode.cloneNode(true));
        parent.insertBefore(surroundingNode, matchedTextNode);
        parent.removeChild(matchedTextNode);
    }
}

// This function does the surrounding for every matched piece of text
// and can be customized to do what you like
function createSpan(matchedTextNode) {
    var el = document.createElement("span");
    el.style.color = "red";
    el.appendChild(matchedTextNode);
    return el;
}

// The main function
function wrapWords(container, words) {
    // Replace the words one at a time to ensure "test2" gets matched
    for (var i = 0, len = words.length; i < len; ++i) {
        surroundInElement(container, new RegExp(words[i]), createSpan);
    }
}

wrapWords(document.getElementById("container"), ["test2", "test"]);
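One thing to watch with `new RegExp(words[i])` is that each word is treated as a regex pattern, so a word such as "c++" throws and "a.b" matches more than intended. A possible refinement, sketched here under assumptions, is to escape the words and build a single alternation with longer words first. `escapeRegExp` and `buildWordRegex` are hypothetical helper names, not part of the answer above.

```javascript
// Escape characters that have a special meaning in a regular expression
function escapeRegExp(str) {
    return str.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one non-global regex that matches any of the given words.
// Longer words come first so "test2" wins over "test" at the same position.
function buildWordRegex(words) {
    var escaped = words
        .filter(function (word) { return word.length > 0; })
        .sort(function (a, b) { return b.length - a.length; })
        .map(escapeRegExp);
    // With no usable words, return a pattern that never matches,
    // because an empty pattern would match at every position
    return escaped.length ? new RegExp(escaped.join("|")) : /(?!)/;
}
```

With this, the loop in `wrapWords` could in principle be replaced by a single call such as `surroundInElement(container, buildWordRegex(words), createSpan)`, which also avoids re-matching "test" inside a span that was already created for "test2".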