Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverflow post: http://stackoverflow.com/questions/28790458/

Date: 2020-10-28 09:33:58  Source: igfitidea

How to remove <div> and <br> using Cheerio js?

Tags: javascript, node.js, cheerio

Asked by bosslee

I have the following HTML that I would like to parse with Cheerio.

var $ = cheerio.load('<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><div>This works well.</div><div><br clear="none"/></div><div>So I have been doing this for several hours. How come the space does not split? Thinking that this could be an issue.</div><div>Testing next paragraph.</div><div><br clear="none"/></div><div>Im testing with another post. This post should work.</div><div><br clear="none"/></div><h1>This is for test server.</h1></body></html>', {
    normalizeWhitespace: true,
});

// trying to parse the html
// the goals are to 
// 1. remove all the 'div'
// 2. clean up <br clear="none"/> into <br>
// 3. Have all the new 'empty' element added with 'p'

var testData = $('div').map(function(i, elem) {
    var test = $(elem)
    if ($(elem).has('br')) {
        console.log('spaceme');
        var test2 = $(elem).removeAttr('br');
    } else {
        var test2 = $(elem).removeAttr('div').add('p');
    }
    console.log(i +' '+ test2.html());
    return test2.html()
})

res.send(test2.html())

My end goals are to parse the HTML and:

  • remove all the div elements
  • clean up <br clear="none"/> and change it into <br>
  • finally, wrap the content of each removed div (the sentences) in a paragraph: <p>sentence</p>

I tried to start with a smaller goal in the code above. I tried to remove all the 'div' elements (that succeeded), but I'm unable to find the 'br'. I have been trying for days and have made no headway.

So I'm writing here to ask for some help and hints on how I can reach my end goal.

Thank you :D

Answered by adeneo

It's easier than it looks. First, you iterate over all the DIVs

$('div').each(function() { ...

and for each div, you check whether it has a <br> tag

$(this).find('br').length

if it does, you remove the clear attribute from the <br>

$(this).find('br').removeAttr('clear');

if not, you create a P with the same content

var p = $('<p>' + $(this).html() + '</p>');

and then just replace the DIV with the P

$(this).replaceWith(p);

and output the result

res.send($.html());

All together it's

$('div').each(function() {
    if ( $(this).find('br').length ) {
        $(this).find('br').removeAttr('clear');
    } else {
        var p = $('<p>' + $(this).html() + '</p>');
        $(this).replaceWith(p);
    }
});

res.send($.html());

Answered by Scott

You don't want to remove an attribute; you want to remove the tag. So you want to switch removeAttr to remove, like so:

var testData = $('div').map(function(i, elem) {
    var test = $(elem)
    if ($(elem).has('br')) {
        console.log('spaceme');
        var test2 = $(elem).remove('br');
    } else {
        var test2 = $(elem).remove('div').add('p');
    }
    console.log(i +' '+ test2.html());
    return test2.html()
})