Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/10921457/
PHP retrieve inner HTML as string from URL using DOMDocument
Asked by Dan Kanze
I've been piecing together bits of code. You can see roughly what I'm trying to do, but this doesn't work and is obviously wrong:
<?php
$dom= new DOMDocument();
$dom->loadHTMLFile('http://example.com/');
$data = $dom->getElementById("profile_section_container");
$html = $data->saveHTML();
echo $html;
?>
Using a cURL call, I am able to retrieve the source of the document at the URL:
function curl_get_file_contents($URL)
{
    $c = curl_init();
    curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($c, CURLOPT_URL, $URL);
    $contents = curl_exec($c);
    curl_close($c);
    if ($contents) return $contents;
    else return FALSE;
}
$f = curl_get_file_contents('http://example.com/');
echo $f;
So how can I now use this to instantiate a DOMDocument object in PHP and extract a node using getElementById?
Answered by anubhava
This is the code you need; it keeps malformed-HTML warnings from being printed:
$dom = new DOMDocument();
// Collect libxml parse warnings internally instead of printing them
libxml_use_internal_errors(true);
$dom->loadHTMLFile('http://example.com/');
$data = $dom->getElementById("banner");
echo $data->nodeValue."\n";
To dump the whole HTML source, you can call:
echo $dom->saveHTML();
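As a minimal sketch of the error-handling approach in this answer: the markup below is a made-up inline string (an assumption, standing in for the fetched page) so the example runs without network access. It also guards against getElementById returning null when the id is absent, and clears the buffered libxml errors after parsing.

```php
<?php
// Hypothetical markup standing in for the page at the URL; note the unclosed <b>.
$html = '<html><body><div id="banner">Hello <b>world</div></body></html>';

// Buffer libxml warnings instead of emitting them as PHP warnings.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

// Inspect the buffered parse errors if needed, then free them.
$errors = libxml_get_errors();
libxml_clear_errors();

// getElementById returns null when no element has that id.
$node = $dom->getElementById('banner');
$text = ($node !== null) ? $node->nodeValue : null;

echo ($text !== null) ? $text . "\n" : "Element not found\n";
```

How many entries end up in $errors depends on the libxml2 version, so the sketch does not rely on it.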
Answered by Motes
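<?php
// Fetch the page with the cURL helper from the question
$f = curl_get_file_contents('http://example.com/');
$dom = new DOMDocument();
// @ suppresses warnings caused by malformed HTML
@$dom->loadHTML($f);
$data = $dom->getElementById("profile_section_container");
// saveHTML() belongs to the document; pass the node to serialize just that element
$html = $dom->saveHTML($data);
echo $html;
?>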
It would help if you provided the example HTML.
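Passing a node to saveHTML() serializes the element together with its own tag. If only the inner HTML is wanted, one possible approach is to serialize each child node and concatenate the results. The helper name inner_html and the sample markup are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper: concatenates the serialized children of a node,
// which leaves out the node's own opening and closing tags.
function inner_html(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

// Made-up markup so the sketch runs without a network request.
$source = '<html><body><div id="profile_section_container">'
        . '<p>Hi</p><span>there</span></div></body></html>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($source);
libxml_clear_errors();

$container = $dom->getElementById('profile_section_container');
$inner = ($container !== null) ? inner_html($container) : '';

echo $inner, "\n";
```

The serializer may insert line breaks between block-level elements, so the output is not guaranteed to match the source byte for byte.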
Answered by mimiz
I'm not sure, but I remember that once when I wanted to use this, I was unable to load an external URL as a file because the php.ini directive allow_url_fopen was set to Off.
So check your php.ini, or try to open the URL with fopen or file_get_contents to see whether you can read the URL as a file:
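<?php
$url = 'http://example.com/';
// Returns false if the URL cannot be read (for example when allow_url_fopen is Off)
$f = file_get_contents($url);
var_dump($f); // just to see the content
?>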
Regards,
mimiz
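The allow_url_fopen setting mentioned in this answer can also be checked at runtime rather than by reading php.ini. This is only a sketch; the function name url_fopen_enabled is made up for illustration.

```php
<?php
// Hypothetical helper: reports whether URL-aware file functions
// (fopen, file_get_contents, DOMDocument::loadHTMLFile) may open remote URLs.
function url_fopen_enabled(): bool
{
    // ini_get() returns the value as a string such as "1", "0", "" or "On",
    // or false if the directive is unknown; filter_var() normalizes it to a bool.
    return filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
}

$enabled = url_fopen_enabled();
$hasCurl = function_exists('curl_init');

echo 'allow_url_fopen is ', $enabled ? 'on' : 'off', "\n";
echo 'cURL extension is ', $hasCurl ? 'available' : 'not available', "\n";
```

If the setting is off, fetching the page with cURL, as in the question, and passing the resulting string to loadHTML() avoids the restriction.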
Answered by mimiz
I think that now you can use DOMDocument::loadHTML. Maybe you should check whether a doctype exists (with a regexp) and add one if necessary, to be sure it is declared.
Regards,
Mimiz
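A minimal sketch of the doctype check suggested here, assuming the fetched source is already in a string. The function name ensure_doctype and the choice of the HTML5 doctype are assumptions. The pattern only recognizes a doctype at the very start of the string (after optional whitespace), so input that begins with a byte order mark or a comment is treated as having none.

```php
<?php
// Hypothetical helper: prepends an HTML5 doctype when the markup does not start with one.
function ensure_doctype(string $html): string
{
    if (preg_match('/^\s*<!DOCTYPE\b/i', $html) === 1) {
        return $html; // already declared, leave untouched
    }
    return "<!DOCTYPE html>\n" . $html;
}

// Made-up fragment standing in for the fetched source.
$fragment = '<div id="profile_section_container"><p>Hi</p></div>';
$prepared = ensure_doctype($fragment);

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($prepared);
libxml_clear_errors();

$found = $dom->getElementById('profile_section_container');
echo ($found !== null) ? "Element found\n" : "Element not found\n";
```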
Answered by Austin
Try this:
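$dom = new DOMDocument();
$dom->loadHTMLFile('http://example.com/');
// getElementById() returns a single element (or null), not a node list, so no ->item(0)
$data = $dom->getElementById("profile_section_container");
// saveHTML() is a DOMDocument method; pass the node to serialize only that element
$html = $dom->saveHTML($data);
echo $html;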

