Notice: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/8144061/

Time: 2020-08-26 04:03:18  Source: igfitidea

Using PHP to get DOM Element

Tags: php, html, dom, tags

Asked by Matt

I'm struggling big time understanding how to use the DOMElement object in PHP. I found this code, but I'm not really sure it's applicable to me:

$dom = new DOMDocument();
$dom->loadHTML("index.php");

$div = $dom->getElementsByTagName('div');
foreach ($div->attributes as $attr) {
     $name = $attr->nodeName;
     $value = $attr->nodeValue;
     echo "Attribute '$name' :: '$value'<br />";
}

Basically what I need is to search the DOM for an element with a particular id, after which point I need to extract a non-standard attribute (i.e. one that I made up and put on with JS) so I can see the value of that. The reason is I need one piece from the $_GET and one piece that is in the HTML based from a redirect. If someone could just explain how I use DOMDocument for this purpose, that would be helpful. I'm really struggling to understand what's going on and how to properly implement it, because I clearly am not doing it right.

EDIT (Where I'm at based on comment):

This is my code lines 4-26 for reference:

<div id="column_profile">
    <?php
        require_once($_SERVER["DOCUMENT_ROOT"] . "/peripheral/profile.php");            
        $searchResults = isset($_GET["s"]) ? performSearch($_GET["s"]) : "";

        $dom = new DOMDocument();
        $dom->load("index.php");

        $divs = $dom->getElementsByTagName('div');
        foreach ($divs as $div) {
            foreach ($div->attributes as $attr) {
              $name = $attr->nodeName;
              $value = $attr->nodeValue;
              echo "Attribute '$name' :: '$value'<br />";
            }
        }
        $div = $dom->getElementById('currentLocation');
        $attr = $div->getAttribute('srckey');   
        echo "<h1>{$attr}</a>";
    ?>
</div>

<div id="column_main">

Here is the error message I'm getting:

Warning: DOMDocument::load() [domdocument.load]: Extra content at the end of the document in ../public_html/index.php, line: 26 in ../public_html/index.php on line 10

Fatal error: Call to a member function getAttribute() on a non-object in ../public_html/index.php on line 21

Answered by Rocket Hazmat

getElementsByTagName returns a list of elements, so first you need to loop through the elements, then through their attributes.

$divs = $dom->getElementsByTagName('div');
foreach ($divs as $div) {
    foreach ($div->attributes as $attr) {
      $name = $attr->nodeName;
      $value = $attr->nodeValue;
      echo "Attribute '$name' :: '$value'<br />";
    }
}

In your case, you said you needed a specific ID. Those are supposed to be unique, so to do that, you can use (note getElementById might not work unless you call $dom->validate() first):

$div = $dom->getElementById('divID');

Then to get your attribute:

$attr = $div->getAttribute('customAttr');
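
Putting the two calls together, a minimal sketch might look like the following. The `currentLocation` id and the `srckey` attribute come from the question; the markup string and the value `abc123` are made up for illustration. It assumes the markup is parsed with loadHTML, where an HTML id attribute is normally found by getElementById without calling validate() first.

```php
<?php
// Hypothetical markup standing in for the rendered page.
$html = '<html><body><div id="currentLocation" srckey="abc123">Here</div></body></html>';

$dom = new DOMDocument();
// loadHTML() parses a string of markup; it does not read a file.
$dom->loadHTML($html);

$div = $dom->getElementById('currentLocation');

// getElementById() returns null when nothing matches, so check before
// calling getAttribute(); skipping the check is what triggers the
// "Call to a member function getAttribute() on a non-object" error.
$srckey = ($div !== null) ? $div->getAttribute('srckey') : null;

echo $srckey === null ? "not found\n" : "srckey = $srckey\n";
```

With the markup above this should print `srckey = abc123`; if the id is missing from the markup, it prints `not found` instead of failing.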

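EDIT: $dom->loadHTML parses the string it is given as HTML; it doesn't read a file, and nothing is executed. Even $dom->load and $dom->loadHTMLFile, which do read a file, only read its contents. index.php won't be run this way. You might have to do something like: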

$dom->loadHTML(file_get_contents('http://localhost/index.php'));
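
If the markup is generated by PHP within the same request, one alternative to fetching it over HTTP is to capture the output with output buffering and parse the captured string. This is a swapped-in technique, not something the answer proposes, and `renderPage()` is a hypothetical stand-in for whatever prints the page. Attributes added later by JavaScript in the browser will not be present either way.

```php
<?php
// Hypothetical stand-in for a template that prints the page.
function renderPage($key)
{
    echo '<html><body><div id="currentLocation" srckey="'
        . htmlspecialchars($key, ENT_QUOTES)
        . '">Here</div></body></html>';
}

ob_start();              // start capturing everything that is echoed
renderPage('abc123');    // the PHP runs here and prints its markup
$html = ob_get_clean();  // stop capturing and get the markup as a string

$dom = new DOMDocument();
$dom->loadHTML($html);   // parse the already-rendered markup

$node = $dom->getElementById('currentLocation');
$value = ($node !== null) ? $node->getAttribute('srckey') : null;
```

The point of the sketch is the order of operations: the PHP has to run first, and only its output is handed to the parser.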

Answered by jakx

You won't have access to the HTML if the redirect is from an external server. Let me put it this way: the DOM does not exist at the point you are trying to parse it. What you can do is pass the text to a DOM parser and then manipulate the elements that way. Or the better way would be to add it as another GET variable.
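
A rough sketch of the GET-variable approach, assuming the redirecting page already knows the value: `s` is the search parameter from the question, `srckey` reuses the question's attribute name as a parameter name, and the values are invented. The redirect itself is commented out so the snippet runs on its own, and `$query` stands in for what `$_GET` would contain on the receiving page.

```php
<?php
// Build the redirect URL with both pieces of data in the query string.
$params = array('s' => 'blue widgets', 'srckey' => 'abc123');
$redirectUrl = '/index.php?' . http_build_query($params);

// On the redirecting page you would then send:
// header('Location: ' . $redirectUrl);

// On the receiving page PHP fills $_GET from the query string. Here the
// same decoding is done by hand so the sketch is self-contained.
$query = array();
parse_str(parse_url($redirectUrl, PHP_URL_QUERY), $query);

$search = isset($query['s']) ? $query['s'] : '';
$srckey = isset($query['srckey']) ? $query['srckey'] : null;
```

This avoids parsing HTML at all, since both values arrive the same way.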

EDIT: Are you also aware that the client can change the HTML and have it pass whatever they want? (Using a tool like Firebug)
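
Because the value comes from the client, it is worth checking before use. The following is only a sketch: `cleanSrcKey()` is a hypothetical helper, and the rule it enforces (1 to 32 letters, digits, or underscores) is an assumption about what a valid key looks like, not something stated in the question.

```php
<?php
// Hypothetical helper: returns the key if it matches the assumed
// format, or null for anything else (wrong type, bad characters,
// empty, or too long).
function cleanSrcKey($raw)
{
    if (!is_string($raw)) {
        return null;
    }
    // \A and \z anchor the whole string, so a trailing newline fails.
    if (!preg_match('/\A[A-Za-z0-9_]{1,32}\z/', $raw)) {
        return null;
    }
    return $raw;
}

$good = cleanSrcKey('abc123');
$bad = cleanSrcKey('<script>alert(1)</script>');
```

Anything that comes back as null can then be rejected or replaced with a default.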