Extract DOM elements from a string in PHP

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/5126967/

Date: 2020-08-25 16:52:47  Source: igfitidea

Tags: php, html, string, domdocument

Asked by user635443

Possible Duplicates:
crawling a html page using php?
Best methods to parse HTML

I have a string variable in my PHP script that contains an HTML page. How can I extract DOM elements from this string?

For example, from the string '<div class="someclass">text</div>', I want to get the value 'text'. How can I do this?

Answer by Pascal MARTIN

You need to use the DOMDocument class and, more specifically, its loadHTML method to load your HTML string into a DOM object.

For example:

$string = <<<HTML
<p>test</p>
<div class="someclass">text</div>
<p>another</p>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($string);
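Real-world HTML is often not well-formed, and loadHTML reports parse problems as PHP warnings. The following is a minimal sketch of one way to collect those problems instead, assuming you would rather inspect them yourself. The sample markup and the variable names are illustrative and not part of the original answer.

```php
<?php
// Illustrative input (an assumption): one valid <div> followed by an unclosed <p>.
$html = '<div class="someclass">text</div><p>unclosed';

// Ask libxml to store parse errors internally instead of raising PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$loaded = $dom->loadHTML($html);

// Fetch whatever the parser complained about, then restore the previous setting.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

foreach ($errors as $error) {
    // Each entry is a LibXMLError object with message, line, and column fields.
    echo 'Line ', $error->line, ': ', trim($error->message), "\n";
}

// The document is still usable even when the parser had to recover.
$divs = $dom->getElementsByTagName('div');
echo $divs->item(0)->textContent, "\n";
```

Which messages are reported, if any, depends on the libxml2 version PHP was built against, so the sketch does not rely on a particular error list.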


After that, you'll be able to manipulate the DOM, for instance by using the DOMXPath class to run XPath queries on it.


In your case, you could use something based on this code:

$xpath = new DOMXPath($dom);
$result = $xpath->query('//div[@class="someclass"]');
if ($result->length > 0) {
    var_dump($result->item(0)->nodeValue);
}

Here, this gives the following output:

string 'text' (length=4)
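Note that the predicate @class="someclass" matches only when the class attribute is exactly that string. The sketch below shows a common XPath 1.0 idiom for matching one class among several, assuming your elements may carry more than one class. The sample markup is illustrative and not from the original answer.

```php
<?php
// Illustrative input (an assumption): the first div has several classes,
// the second has a class name that merely starts with "someclass".
$html = '<div class="box someclass highlighted">text</div>'
      . '<div class="someclassic">other</div>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);

// Pad the normalized class list with spaces, then look for " someclass "
// so that only whole class tokens match.
$query = '//div[contains(concat(" ", normalize-space(@class), " "), " someclass ")]';
$nodes = $xpath->query($query);

foreach ($nodes as $node) {
    echo $node->textContent, "\n";
}
```

With this input, only the first div is selected, because "someclassic" does not contain the space-delimited token.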


As an alternative to DOMDocument, you could also use simplexml_load_string and SimpleXMLElement::xpath, but for complex manipulations I generally prefer DOMDocument.
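A minimal sketch of the SimpleXML route, under the assumption that the input is well-formed XML: simplexml_load_string is an XML parser, so the three sibling elements are wrapped in a hypothetical <root> element here to give the document a single root. That wrapper is not part of the original answer.

```php
<?php
// Assumption: the markup is well-formed XML with one root element.
$string = '<root><p>test</p><div class="someclass">text</div><p>another</p></root>';

$text = null;
$xml = simplexml_load_string($string);

if ($xml !== false) {
    // xpath() returns an array of SimpleXMLElement objects.
    $matches = $xml->xpath('//div[@class="someclass"]');
    if (!empty($matches)) {
        // Casting an element to string yields its text content.
        $text = (string) $matches[0];
    }
}

var_dump($text);
```

For arbitrary HTML pages that are not valid XML, this approach is likely to fail at the parsing step, which is one reason to prefer DOMDocument::loadHTML.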


Answer by Tim Cooper

Have a look at DOMDocument and DOMXPath.

$DOM = new DOMDocument();
$DOM->loadHTML($str);

$xpath = new DOMXPath($DOM);
$someclass_elements = $xpath->query('//*[@class = "someclass"]');
// ...
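One possible way to fill in the elided part is to loop over the returned DOMNodeList. This is a sketch only: the sample value of $str and the $found array are illustrative assumptions, and the query uses //* to match elements of any tag name.

```php
<?php
// Illustrative input (an assumption): two elements share the class, one does not.
$str = '<p class="someclass">first</p>'
     . '<div class="someclass">second</div>'
     . '<p>third</p>';

$DOM = new DOMDocument();
$DOM->loadHTML($str);

$xpath = new DOMXPath($DOM);
$someclass_elements = $xpath->query('//*[@class = "someclass"]');

// DOMNodeList is traversable, so foreach visits each matched element in document order.
$found = [];
foreach ($someclass_elements as $element) {
    $found[] = $element->nodeName . ': ' . $element->textContent;
}

print_r($found);
```

Each entry pairs the tag name with the element's text, which makes it easy to see which elements the query selected.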