PHP: using cURL to select a specific div from an external web page

Question

Asked by Paul

Hi, can anyone help me select a specific div from the content of a webpage?

Let's say I want to get the div with id="wrapper_content" from the webpage http://www.test.com/page3.php.

My current code looks something like this (not working):

//REG EXP.
$s_searchFor = '@^/.dont know what to put here..@ui';    

//CURL
$ch = curl_init();
$timeout = 5; // set to zero for no timeout
curl_setopt ($ch, CURLOPT_URL, 'http://www.test.com/page3.php');
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
if(!preg_match($s_searchFor, $ch))
{
  $file_contents = curl_exec($ch);
}
curl_close($ch);

// display file
echo $file_contents;

So I'd like to know how I can use regular expressions to find a specific div, and how to unset the rest of the webpage so that $file_contents only contains the div.

Answer 1

Answered by Yacoby

HTML isn't a regular language, so you shouldn't use regex to parse it. Instead I would recommend an HTML parser such as Simple HTML DOM or DOM.

If you were going to use Simple HTML DOM, you would do something like the following:

$html = str_get_html($file_contents);
$elem = $html->find('div[id=wrapper_content]', 0);
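If you would rather use PHP's bundled DOM extension (the other parser mentioned above), a minimal sketch could look like the following. The helper name `extract_div_by_id` and the sample markup are made up for illustration, and the sketch assumes the DOM extension is enabled.

```php
<?php
// Hypothetical helper: returns the outer HTML of the first <div> with the
// given id, or null if the input is empty or no such div exists.
function extract_div_by_id(string $html, string $id): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Real-world HTML is often invalid; collect parser warnings quietly
    // instead of printing them, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // The id is embedded as a literal, so it must not contain a double quote.
    $nodes = $xpath->query('//div[@id="' . $id . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    // saveHTML() with a node argument serializes just that node.
    return $doc->saveHTML($nodes->item(0));
}

// Sample markup standing in for the string returned by curl_exec().
$file_contents = '<html><body><div id="header">Menu</div>'
    . '<div id="wrapper_content"><p>Hello</p></div></body></html>';

echo extract_div_by_id($file_contents, 'wrapper_content');
```

This avoids a third-party download, at the cost of a slightly more verbose API than the `find()` call above.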

Even if you used regex, your code still wouldn't work correctly. You need to fetch the contents of the page before you can run a regex against them; $ch is only the cURL handle, not the page.

//wrong
if(!preg_match($s_searchFor, $ch)){
    $file_contents = curl_exec($ch);
}

//right
$file_contents = curl_exec($ch); //get the page contents (false on failure)
if ($file_contents !== false && preg_match($s_searchFor, $file_contents, $matches)) { //match the element
    $file_contents = $matches[0]; //keep only the matched element
}
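Because `curl_exec()` returns `false` when the request fails, it can help to wrap the fetch step so that failures are handled before any matching or parsing happens. The following is only a sketch: the helper name `fetch_page` is hypothetical, and the URL is the placeholder from the question.

```php
<?php
// Hypothetical helper: fetches a URL and returns the response body,
// or null if the request failed.
function fetch_page(string $url, int $timeout = 5): ?string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);     // return the body instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); // seconds to wait while connecting
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);        // seconds allowed for the whole request

    $body = curl_exec($ch); // string on success, false on failure
    if ($body === false) {
        error_log('cURL error: ' . curl_error($ch));
    }
    curl_close($ch);

    return $body === false ? null : $body;
}

$file_contents = fetch_page('http://www.test.com/page3.php');
if ($file_contents === null) {
    echo "Could not fetch the page\n";
}
```

Once `$file_contents` holds a string, it can be passed to whichever extraction approach you choose.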

Answer 2

Answered by Amit Garg

include('simple_html_dom.php');
$html = str_get_html($file_contents);
$elem = $html->find('div[id=wrapper_content]', 0);

This requires downloading simple_html_dom.php first.

Answer 3

Answered by imightbeinatree at Cloudspace

Check out hpricot; it lets you elegantly select sections of a document.

First you would use curl to get the document, then use hpricot to get the part you need.
