Extract DOM elements from a string in PHP
Source: http://stackoverflow.com/questions/5126967/
This page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors on Stack Overflow.
Asked by user635443
Possible Duplicates:
crawling a html page using php?
Best methods to parse HTML
I have a string variable in my PHP script that contains an HTML page. How can I extract DOM elements from this string?
For example, from the string '<div class="someclass">text</div>' I want to get the value 'text'. How can I do this?
Answer by Pascal MARTIN
You need to use the DOMDocument class and, more specifically, its loadHTML method to load your HTML string into a DOM object.
For example:
$string = <<<HTML
<p>test</p>
<div class="someclass">text</div>
<p>another</p>
HTML;
$dom = new DOMDocument();
$dom->loadHTML($string);
After that, you'll be able to work with the DOM, for instance by using the DOMXPath class to run XPath queries on it.
In your case, you could use something based on this portion of code:
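$xpath = new DOMXPath($dom);
// Select every <div> whose class attribute is exactly "someclass".
$result = $xpath->query('//div[@class="someclass"]');
if ($result->length > 0) {
    // nodeValue holds the text content of the first match.
    var_dump($result->item(0)->nodeValue);
}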
Here, this would give you the following output:
string 'text' (length=4)
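The two steps above can be combined into one small function. This is only a sketch: the name extractTextByClass is hypothetical (it is not part of the original answer), and it assumes the class name contains no double quote, since it is pasted directly into the XPath expression. Real-world pages are often not valid HTML, so the sketch also uses libxml_use_internal_errors to keep parser warnings from being printed.

```php
<?php
// Hypothetical helper, not part of the original answer.
// Assumes $class contains no double-quote character.
function extractTextByClass(string $html, string $class): array
{
    $dom = new DOMDocument();

    // Collect parser warnings internally instead of emitting them,
    // then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $texts = [];
    // DOMNodeList is traversable, so foreach visits every match.
    foreach ($xpath->query('//div[@class="' . $class . '"]') as $node) {
        $texts[] = $node->nodeValue;
    }
    return $texts;
}

$html = '<p>test</p><div class="someclass">text</div><p>another</p>';
$texts = extractTextByClass($html, 'someclass');
print_r($texts);
```

Returning an array rather than a single string keeps the case of several matching elements explicit; an empty array means nothing matched.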
As an alternative, instead of DOMDocument, you could also use simplexml_load_string and SimpleXMLElement::xpath, but for complex manipulations I generally prefer using DOMDocument.
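A minimal sketch of the SimpleXML route might look like the following. One assumption to note: simplexml_load_string expects well-formed XML with a single root element, so the example wraps the markup in a made-up <root> element. Arbitrary HTML pages will often fail to parse this way.

```php
<?php
// The <root> wrapper is an assumption added for this example:
// SimpleXML needs well-formed XML with exactly one root element.
$xmlString = '<root><p>test</p><div class="someclass">text</div><p>another</p></root>';

$text = null;
$xml = simplexml_load_string($xmlString);
if ($xml !== false) {
    // xpath() returns an array of SimpleXMLElement objects.
    $matches = $xml->xpath('//div[@class="someclass"]');
    if (!empty($matches)) {
        // Casting an element to string yields its text content.
        $text = (string) $matches[0];
    }
}
var_dump($text);
```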
Answer by Tim Cooper
Have a look at DOMDocument and DOMXPath.
$DOM = new DOMDocument();
$DOM->loadHTML($str);
$xpath = new DOMXPath($DOM);
$someclass_elements = $xpath->query('//*[@class = "someclass"]');
// ...
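The snippet stops at the query, so here is one way the results could be consumed. It is a sketch rather than the answerer's code: the sample markup and the $found variable are made up for illustration. It also swaps in a different XPath expression, because an exact comparison on @class misses elements that carry several classes; the contains/concat/normalize-space idiom matches "someclass" as one token among others.

```php
<?php
// Sample markup invented for this example.
$str = '<div class="box someclass">first</div>'
     . '<span class="someclass">second</span>'
     . '<p class="other">third</p>';

$DOM = new DOMDocument();
$DOM->loadHTML($str);
$xpath = new DOMXPath($DOM);

// Match any element whose class list contains the token "someclass",
// even when other classes are present.
$query = '//*[contains(concat(" ", normalize-space(@class), " "), " someclass ")]';

$found = [];
foreach ($xpath->query($query) as $element) {
    // nodeName is the tag name, textContent the text inside the element.
    $found[] = $element->nodeName . ': ' . $element->textContent;
}
print_r($found);
```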