PHP: Get DIV content from an external website

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Stack Overflow original URL: http://stackoverflow.com/questions/20446598/

Time: 2020-08-25 03:12:11  Source: igfitidea

Get DIV content from external Website

php html domdocument

Asked by Kallewallex

I want to get a DIV from an external website with pure PHP.

External website: http://www.isitdownrightnow.com/youtube.com.html

Div text I want from isitdownrightnow (statusup div): <div class="statusup">The website is probably down just for you...</div>

I already tried file_get_contents with DOMDocument and str_get_html, but I could not get it to work.

For example this

$page = file_get_contents('http://css-tricks.com/forums/topic/jquery-selector-div-variable/');
$doc = new DOMDocument();
$doc->loadHTML($page);
$divs = $doc->getElementsByTagName('div');
foreach ($divs as $div) {
    // Loop through the DIVs looking for one with a class of "bbp-template-notice",
    // then echo out its contents (pardon the pun)
    if ($div->getAttribute('class') === 'bbp-template-notice') {
        echo $div->nodeValue;
    }
}

It will just display an error in the console:

Failed to load resource: the server responded with a status of 500 (Internal Server Error)

Answered by FlyingLemon

This is what I always use:

$url = 'https://somedomain.com/somesite/';
$content = file_get_contents($url);
$first_step = explode( '<div id="thediv">' , $content );
$second_step = explode("</div>" , $first_step[1] );

echo $second_step[0];
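
The split-on-markers idea above fails with an undefined-index notice when the opening tag is not found. A minimal sketch of a guarded variant follows; the helper name `extract_between` and the inline sample markup are assumptions for illustration, not part of the answer. Like the original, it stops at the first closing tag, so nested DIVs would be cut short.

```php
<?php
// Hypothetical helper (not from the answer): returns the text between two
// markers, or null when either marker is missing.
function extract_between(string $html, string $open, string $close): ?string
{
    $start = strpos($html, $open);
    if ($start === false) {
        return null;
    }
    $start += strlen($open);
    $end = strpos($html, $close, $start);
    if ($end === false) {
        return null;
    }
    return substr($html, $start, $end - $start);
}

// Inline sample markup stands in for the result of file_get_contents($url).
$content = '<html><body><div id="thediv">Hello</div><div id="other">x</div></body></html>';

$inner = extract_between($content, '<div id="thediv">', '</div>');
echo $inner ?? 'div not found';
```

The marker must match the page source character for character, so a change in attribute order or quoting on the remote site would make the lookup return null.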

Answered by worenga

This may be a little overkill, but you'll get the gist.

<?php 

$doc = new DOMDocument;

// We don't want to bother with white spaces
$doc->preserveWhiteSpace = false;

// Most HTML Developers are chimps and produce invalid markup...
$doc->strictErrorChecking = false;
$doc->recover = true;

$doc->loadHTMLFile('http://www.isitdownrightnow.com/check.php?domain=youtube.com');

$xpath = new DOMXPath($doc);

$query = "//div[@class='statusup']";

$entries = $xpath->query($query);
var_dump($entries->item(0)->textContent);

?>
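
The same XPath approach can be tried offline before pointing it at a live site. The sketch below is an illustration under assumptions: the inline markup is made up, and the class test is a common XPath idiom that also matches when the DIV carries more than one class, which `@class='statusup'` would not.

```php
<?php
// Inline sample markup; in practice this would come from the remote page.
$html = '<html><body>'
      . '<div class="statusup big">The website is probably down just for you...</div>'
      . '<div class="other">ignored</div>'
      . '</body></html>';

// Collect libxml parse warnings internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Match "statusup" as one of possibly several space-separated class names.
$query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' statusup ')]";
$entries = $xpath->query($query);

// item(0) is null when nothing matched, so check before reading textContent.
$status = $entries->length > 0 ? $entries->item(0)->textContent : null;

echo $status ?? 'statusup div not found';
```

Checking `length` first avoids a fatal error on pages where the DIV is absent or has been renamed.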

Answered by Boyan Alexiev

I used the xpath method proposed by @mightyuhu and it worked great with his addition of the assignment. Depending on the web page you get the info from and the availability of an 'id' or 'class' which identifies the tag you wish to get, you will have to change the query you use. If the tag has an 'id' assigned to it, you can use this (the sample is for extracting the USD exchange rate):

$query = "//div[@id='USD']";

However, the site developers won't make it so easy for us, so there will be several more 'unnamed' tags to dig into, in my example:

<div id="USD" class="tab">
  <table cellspacing="0" cellpadding="0">
    <tbody>
     <tr>
        <td>Ask Rate</td>
        <td align="right">1.77400</td>
     </tr>
     <tr class="even">
        <td>Bid Rate</td>
        <td align="right">1.70370</td>
     </tr>
     <tr>
        <td>BNB Fixing</td>
        <td align="right">1.735740</td>
     </tr>
   </tbody>
  </table>
</div>

So I had to change the query to get the 'Ask Rate':

$doc->loadHTMLFile('http://www.fibank.bg/en');
$xpath = new DOMXPath($doc);
$query = "//div[@id='USD']/table/tbody/tr/td";

So, I used the query above, but changed the item index to 1 instead of 0 to get the second column, where the exchange rate is (the first column contains the text 'Ask Rate'):

$entries = $xpath->query($query);
$usdrate = $entries->item(1)->textContent;

Another method is to reference the value directly in the query. When there are no names or styles to match on, this is done by indexing the tags, which I learned from my Maxthon browser's "Inspect element" feature combined with the "Copy XPath" right-click menu option (neat, yeah?):

'//*[@id="USD"]/table/tbody/tr[1]/td[2]'

Notice it also inserts an asterisk (*) after the //, which I have not dug into (in XPath, * matches an element with any tag name). In this case you should again get the value with item(0), since the query matches only one node.
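
To compare the two queries side by side, here is a small sketch that runs both against the table markup shown earlier, loaded from an inline string rather than from the live site. One assumption worth flagging: browsers add a `<tbody>` when building their DOM, while `DOMDocument` appears to keep the markup as written, so a copied XPath containing `/tbody` may need that step removed when the page source has none.

```php
<?php
// Sample markup copied from the table shown above (first two rows only).
$html = <<<'HTML'
<div id="USD" class="tab">
  <table cellspacing="0" cellpadding="0">
    <tbody>
     <tr>
        <td>Ask Rate</td>
        <td align="right">1.77400</td>
     </tr>
     <tr class="even">
        <td>Bid Rate</td>
        <td align="right">1.70370</td>
     </tr>
    </tbody>
  </table>
</div>
HTML;

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Positional query: item(1) is the second <td> in document order.
$cells = $xpath->query("//div[@id='USD']/table/tbody/tr/td");
$byPosition = $cells->item(1)->textContent;

// Indexed query: exactly one node matches, so item(0) is the value.
$direct = $xpath->query('//*[@id="USD"]/table/tbody/tr[1]/td[2]');
$byIndex = $direct->item(0)->textContent;

echo $byPosition, "\n", $byIndex, "\n";
```

The indexed form is less fragile when rows are added above or below, since it does not depend on counting every cell in the table.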

If you need, you can make any changes to the string you extracted, for example changing the number format to match your preference:

$usdrate = number_format($usdrate, 5, ',', ' ');
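
Since the extracted value is a string and may carry stray whitespace, it can help to clean and validate it before formatting. A minimal sketch, assuming a made-up raw value in place of the scraped `textContent`:

```php
<?php
// Hypothetical raw value, as textContent might return it (with stray whitespace).
$raw = " 1.77400\n";

$clean = trim($raw);
if (is_numeric($clean)) {
    // 5 decimals, comma as decimal separator, space as thousands separator.
    $usdrate = number_format((float) $clean, 5, ',', ' ');
} else {
    $usdrate = null; // markup changed or value missing
}

echo $usdrate ?? 'rate not available';
```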

I hope someone finds this as helpful as I found the answers above, and that it saves them time searching for the correct query and syntax.

Answered by rachid kily

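// $url is assumed to be set to the page you want to fetch
$contents = file_get_contents($url);

$title = explode('<div class="entry-content">', $contents);
$title = explode("</div>", $title[1]);

// Caution: s.php is then executed as PHP, so only do this with content you trust
$fp = fopen("s.php", "w+");
fwrite($fp, $title[0]);
fclose($fp);
require_once('s.php');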