JavaScript RegExp: matching text while ignoring HTML

Question

Asked by Francisc

Is it possible to match "the dog is really really fat" in "The <strong>dog</strong> is really <em>really</em> fat!" and add "<span class="highlight">WHAT WAS MATCHED</span>" around it?

I don't mean this specifically, but generally be able to search text ignoring HTML, keeping it in the end result, and just add the span above around it all?

EDIT:
Considering the HTML tag overlapping problem, would it be possible to match a phrase and just add the span around each of the matched words? The problem here is that I don't want the word "dog" matched when it's not in the searched context, in this case, "the dog is really really fat."

Answer 1

Answered by Briguy37

Update:

Here is a working fiddle that does what you want. However, you will need to update the htmlTagRegEx to handle matching on any HTML tag, as it only performs a simple match and will not handle all cases.

http://jsfiddle.net/briguy37/JyL4J/

Also, below is the code. Basically, it takes out the html elements one by one, then does a replace in the text to add the highlight span around the matched selection, and then pushes back in the html elements one by one. It's ugly, but it's the easiest way I could think of to get it to work...

function highlightInElement(elementId, text){
    var element = document.getElementById(elementId);
    var elementHtml = element.innerHTML;
    var tags = [];
    var tagLocations = [];
    //Simple matcher: only handles attribute-less tags such as <em> or </strong>
    var htmlTagRegEx = /<\/?\w+>/;

    //Strip the tags from the elementHtml and keep track of them
    var htmlTag;
    while((htmlTag = elementHtml.match(htmlTagRegEx)) !== null){
        tagLocations.push(htmlTag.index);
        tags.push(htmlTag[0]);
        elementHtml = elementHtml.replace(htmlTag[0], '');
    }

    //Search for the text in the stripped html (literal, case-sensitive search)
    var textLocation = elementHtml.indexOf(text);
    if(textLocation === -1){
        //Nothing to highlight: leave the element untouched
        return;
    }

    //Add the highlight
    var highlightHTMLStart = '<span class="highlight">';
    var highlightHTMLEnd = '</span>';
    var textEndLocation = textLocation + text.length;
    elementHtml = elementHtml.substring(0, textLocation) + highlightHTMLStart + text +
        highlightHTMLEnd + elementHtml.substring(textEndLocation);

    //Plug the HTML tags back in, last tag first, shifting each location
    //by the length of the highlight markup inserted before it
    for(var i = tagLocations.length - 1; i >= 0; i--){
        var location = tagLocations[i];
        if(location > textEndLocation){
            location += highlightHTMLStart.length + highlightHTMLEnd.length;
        } else if(location > textLocation){
            location += highlightHTMLStart.length;
        }
        elementHtml = elementHtml.substring(0, location) + tags[i] + elementHtml.substring(location);
    }

    //Update the innerHTML of the element
    element.innerHTML = elementHtml;
}
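
The EDIT in the question asks about wrapping each matched piece separately so that the highlight never overlaps an existing tag. The following is only a sketch of that idea, not part of the answer above: highlightPhrase is a hypothetical helper that works on an HTML string, assumes a naive tag pattern (/<[^>]+>/, which will not cope with ">" inside attribute values or comments), does not decode entities, and matches case-insensitively via toLowerCase.

```javascript
// Hypothetical helper (not from the answer): highlight `phrase` in `html`,
// ignoring tags while searching and wrapping each text run separately.
function highlightPhrase(html, phrase) {
    var tagRegEx = /<[^>]+>/g; // naive tag matcher, an assumption of this sketch
    var pieces = [];
    var lastIndex = 0;
    var match;

    // Split the markup into alternating text and tag pieces.
    while ((match = tagRegEx.exec(html)) !== null) {
        if (match.index > lastIndex) {
            pieces.push({ text: html.slice(lastIndex, match.index), isTag: false });
        }
        pieces.push({ text: match[0], isTag: true });
        lastIndex = tagRegEx.lastIndex;
    }
    if (lastIndex < html.length) {
        pieces.push({ text: html.slice(lastIndex), isTag: false });
    }

    // Search the tag-free text for the whole phrase.
    var plain = pieces
        .filter(function (p) { return !p.isTag; })
        .map(function (p) { return p.text; })
        .join('');
    var start = plain.toLowerCase().indexOf(phrase.toLowerCase());
    if (start === -1) {
        return html; // phrase not present: leave the markup unchanged
    }
    var end = start + phrase.length;

    // Wrap only the part of each text piece that falls inside [start, end).
    var offset = 0;
    return pieces.map(function (piece) {
        if (piece.isTag) {
            return piece.text;
        }
        var pieceStart = offset;
        offset += piece.text.length;
        var from = Math.max(start, pieceStart) - pieceStart;
        var to = Math.min(end, offset) - pieceStart;
        if (from >= to) {
            return piece.text; // this piece lies outside the match
        }
        return piece.text.slice(0, from) +
            '<span class="highlight">' + piece.text.slice(from, to) + '</span>' +
            piece.text.slice(to);
    }).join('');
}

var highlighted = highlightPhrase(
    'The <strong>dog</strong> is really <em>really</em> fat!',
    'the dog is really really fat'
);
```

Because every span opens and closes inside a single text run, the result stays well nested, and "dog" is only wrapped when the whole phrase is found.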

Answer 2

Answered by Ivan Nikolchov

Naah... just use the good old RegExp ;)

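var htmlString = "The <strong>dog</strong> is really <em>really</em> fat!";
//Matches opening, closing and self-closing tags, including any attributes
var regexp = /<\/?\w+((\s+\w+(\s*=\s*(?:\".*?"|'.*?'|[^'\">\s]+))?)+\s*|\s*)\/?>/gi;
//Strip the tags, then wrap the remaining text in the highlight span
var result = '<span class="highlight">' + htmlString.replace(regexp, '') + '</span>';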
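
To see what this regular expression does with tags that carry attributes, here is a small sketch. stripTags is a hypothetical wrapper name introduced only for illustration; the pattern itself is copied from the answer. Note that this approach discards the original markup rather than keeping it in the result.

```javascript
// Hypothetical wrapper around the answer's pattern: removes tags from a string.
function stripTags(html) {
    var regexp = /<\/?\w+((\s+\w+(\s*=\s*(?:\".*?"|'.*?'|[^'\">\s]+))?)+\s*|\s*)\/?>/gi;
    return html.replace(regexp, '');
}

var plainSentence = stripTags('The <strong>dog</strong> is really <em>really</em> fat!');
// plainSentence is 'The dog is really really fat!'

var plainLink = stripTags('Click <a href="x.html" class=\'c\'>here</a><br/>');
// plainLink is 'Click here'
```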

Answer 3

Answered by Eliecer Chicott

A simpler way with jQuery would be:

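//keyword holds the text to highlight; escape it first if it may contain regex metacharacters
var originalHtml = $("#div").html();

//The lookahead (?![^<>]*>) rejects matches that sit inside a tag, e.g. in an attribute value
var newHtml = originalHtml.replace(new RegExp(keyword + "(?![^<>]*>)", "g"), function(match){
    return "<span class='highlight'>" + match + "</span>";
});

$("#div").html(newHtml);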

This works just fine for me.
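
A DOM-free sketch of the same lookahead technique may make its behavior easier to check. The names highlightKeyword and escapeRegExp are hypothetical helpers added for illustration; the escaping step is an addition that the answer does not include, so that keywords such as "1+1" are treated literally.

```javascript
// Hypothetical helper: escape regex metacharacters so the keyword is matched literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: wrap every occurrence of `keyword` that is not inside a tag.
function highlightKeyword(html, keyword) {
    // (?![^<>]*>) fails when the next angle bracket after the match is ">",
    // which means the match sits inside a tag.
    var pattern = new RegExp(escapeRegExp(keyword) + '(?![^<>]*>)', 'g');
    return html.replace(pattern, function (match) {
        return "<span class='highlight'>" + match + "</span>";
    });
}

var insideAndOutside = highlightKeyword('<a class="dog">dog</a>', 'dog');
// insideAndOutside is '<a class="dog"><span class=\'highlight\'>dog</span></a>'

var literal = highlightKeyword('1+1=2', '1+1');
// literal is '<span class=\'highlight\'>1+1</span>=2'
```

This keeps the markup intact, but it only finds a keyword that lies within a single text run; a phrase interrupted by tags, as in the question's example, is not matched.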

Answer 4

Answered by Roy van Arem

Here is a working regex example that excludes matches inside HTML tags as well as inside script blocks:

http://refiddle.com/lwy6

Use this regex in a replace() call.

    /(a)(?!([^<])*?>)(?!<script[^>]*?>)(?![^<]*?<\/script>|$)/gi
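
As a rough illustration of how that pattern might be used, the sketch below applies it with replace() to a small sample string. The sample markup and the "[$1]" replacement are assumptions made for the demonstration; "(a)" in the pattern stands for the search term.

```javascript
// Pattern copied from the answer; "(a)" is the term being searched for.
var outsideTagsAndScripts = /(a)(?!([^<])*?>)(?!<script[^>]*?>)(?![^<]*?<\/script>|$)/gi;

// Sample input (an assumption): "a" appears in a tag name's attribute,
// in visible text, and inside a script block.
var sampleHtml = '<p class="a">a</p><script>a</script>';

// Mark each accepted match with brackets so the effect is easy to see.
var marked = sampleHtml.replace(outsideTagsAndScripts, '[$1]');
// marked is '<p class="a">[a]</p><script>a</script>'
```

Only the visible "a" is replaced. Because of the trailing "|$" alternative, a match at the very end of the string is also rejected, and script content that contains "<" may defeat the last lookahead.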

Answer 5

Answered by bluesman

You can do a string replace with the expression </?\w*> and you'll get your string.
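
A quick sketch of that replace, assuming the global flag is added so that every tag is removed. The second call shows a limitation: the pattern does not match tags that carry attributes.

```javascript
// The answer's expression as a regex literal, with the g flag added (an assumption).
var simpleTagPattern = /<\/?\w*>/g;

var strippedSimple = 'The <strong>dog</strong> is really <em>really</em> fat!'
    .replace(simpleTagPattern, '');
// strippedSimple is 'The dog is really really fat!'

var strippedWithAttribute = '<a href="#">x</a>'.replace(simpleTagPattern, '');
// strippedWithAttribute is '<a href="#">x' because the opening tag has an attribute
```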

Answer 6

Answered by Jacob

If you use jQuery, you can call the text() method on the element containing the text you're searching for. Given this markup:

<p id="the-text">
  The <strong>dog</strong> is really <em>really</em> fat!
</p>

This would yield "The dog is really really fat!":

$('#the-text').text();

You could do your regex search on that text instead of trying to do so in the markup.

Without jQuery, I'm unsure of an easy way to extract and concatenate the text nodes from all child elements.
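
For what it's worth, the standard DOM exposes element.textContent, which returns the concatenated text of an element and its descendants without any library. If the text nodes need to be walked by hand, a recursive function along the following lines is one option. collectText is a hypothetical name; it relies only on the standard nodeType, nodeValue and childNodes properties, and the mock tree below stands in for a real element so the sketch can run outside a browser.

```javascript
// Hypothetical helper: concatenate the text nodes beneath `node`.
// Works on real DOM nodes or on any object with the same three properties.
function collectText(node) {
    if (node.nodeType === 3) { // 3 is the nodeType of a text node
        return node.nodeValue;
    }
    var text = '';
    for (var i = 0; i < node.childNodes.length; i++) {
        text += collectText(node.childNodes[i]);
    }
    return text;
}

// Plain-object stand-in for: The <strong>dog</strong> is really <em>really</em> fat!
function mockText(value) { return { nodeType: 3, nodeValue: value, childNodes: [] }; }
function mockElement(children) { return { nodeType: 1, nodeValue: null, childNodes: children }; }

var mockParagraph = mockElement([
    mockText('The '),
    mockElement([mockText('dog')]),
    mockText(' is really '),
    mockElement([mockText('really')]),
    mockText(' fat!')
]);

var collected = collectText(mockParagraph);
// collected is 'The dog is really really fat!'
```

In a browser, collectText(document.getElementById('the-text')) would be called on the real element instead of the mock, and would also include the whitespace around the sentence.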
