Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/8227481/

Time: 2020-08-26 04:12:29  Source: igfitidea

Simple HTML Dom: How to remove elements?

Tags: php, dom, simple-html-dom

Asked by kasakka

I would like to use Simple HTML DOM to remove all images from an article so that I can easily create a short text snippet for a news ticker, but I haven't figured out how to remove elements with it.

Basically I would do

  1. Get content as HTML string
  2. Remove all image tags from content
  3. Limit content to x words
  4. Output.

Any help?

Answered by Gordon

There is no dedicated method for removing elements. You just find all the img elements and then, for each one, do:

$e->outertext = '';
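
For context, the four steps from the question might look roughly like the sketch below. It is only an illustration under stated assumptions: the file `simple_html_dom.php` is assumed to sit next to the script, and the helper names `stripImages` and `limitWords` are hypothetical, not part of the library. The library calls it relies on are `str_get_html()`, `find()`, `outertext`, `save()` and `clear()`.

```php
<?php
// Sketch only: assumes simple_html_dom.php (the Simple HTML DOM library)
// sits next to this script. The helper names below are hypothetical.

/**
 * Steps 1-2: parse an HTML string and blank out every <img> element.
 */
function stripImages(string $htmlString): string
{
    // Loaded lazily so that limitWords() stays usable without the library.
    require_once __DIR__ . '/simple_html_dom.php';

    $dom = str_get_html($htmlString);
    if ($dom === false || $dom === null) {
        return $htmlString; // parsing failed (e.g. empty input): leave it untouched
    }

    foreach ($dom->find('img') as $img) {
        $img->outertext = ''; // the technique from this answer
    }

    $result = $dom->save(); // dump the modified tree back into a string
    $dom->clear();          // free the parser's memory
    return $result;
}

/**
 * Step 3: drop the remaining tags and keep at most $maxWords words.
 */
function limitWords(string $html, int $maxWords): string
{
    $text  = trim(strip_tags($html));
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    if (count($words) <= $maxWords) {
        return implode(' ', $words);
    }
    return implode(' ', array_slice($words, 0, $maxWords)) . '...';
}
```

Step 4 would then be something like `echo limitWords(stripImages($articleHtml), 30);`, where `$articleHtml` stands for whatever HTML string you already have. Note that `strip_tags()` does not insert spaces between adjacent block elements, so words from neighbouring paragraphs can run together unless the markup contains whitespace between them.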

Answered by Dr. Reshef

When you only clear the outer text, you delete the HTML content itself, but if you perform another find for the same elements, they will still appear in the result. The reason is that the Simple HTML DOM object still holds its internal structure for the element, just without its actual content. To really delete the element, simply reload the HTML as a string into the same variable. The object is then recreated without the deleted content, so the Simple HTML DOM object is built without it.

Here is an example function:

public function removeNode($selector)
{
    foreach ($this->find($selector) as $node)
    {
        $node->outertext = '';
    }

    $this->load($this->save());        
}

Put this function inside the simple_html_dom class and you're good.

Answered by Sid

I think you are having difficulties because you forgot to save (dump the internal DOM tree back into a string).

Try this:

$html = file_get_html("http://example.com");

// Blank out every <img> element.
foreach ($html->find('img') as $item) {
    $item->outertext = '';
}

$html->save();

echo $html;

Answered by JaseC

I could not figure out where to put the function, so I just put the following directly in my code:

$html->load($html->save());

It basically locks the changes made in the for loop back into the HTML, as described above.

Answered by marcelde

The proposed solutions are quite expensive and practically unusable in a big loop or other kinds of repetition.

I prefer to use "soft deletes":

foreach ($html->find('somecondition') as $item) {
    if (somecheck) $item->setAttribute('softDelete', true); // <= set marker to check in further code
    $item->outertext = '';

    foreach ($foo as $bar) {
        // skip anything that was marked as soft-deleted above
        if (!$bar->getAttribute('softDelete')) {
            // do something
        }
    }
}

Answered by baniadams

This is working for me:

foreach($html->find('element') as $element){
   $element = NULL;
}

Answered by Lucas

Adding a new answer, since removeNode is definitely a better way of removing it:

$html->removeNode('img');

This method was probably not available when the accepted answer was marked. You do not need to loop over the HTML to find each element; this call removes them all.
