Extracting specific data from a web page using PHP

Question

Asked by Daniel Silva

Possible Duplicate:
HTML Scraping in Php


I would like to know if there is any way, using PHP, to get a specific string of text from a webpage. The text is updated every now and then. I've searched "all over the internet" and found nothing. I saw that preg_match could do it, but I didn't understand how to use it.


Imagine that a webpage contains this:


<div name="changeable_text">**GET THIS TEXT**</div>

How can I do it using PHP, after having used file_get_contents to put the page in a variable?


Thanks in advance :)


Answer 1

Answered by nickb

You can use DOMDocument, like this:


$html = file_get_contents( $url);

libxml_use_internal_errors( true);
$doc = new DOMDocument;
$doc->loadHTML( $html);
$xpath = new DOMXpath( $doc);

// A name attribute on a <div>???
$node = $xpath->query( '//div[@name="changeable_text"]')->item( 0);

// item( 0) returns null when nothing matches, so check before reading it
if( $node !== null) {
    echo $node->textContent; // This will print **GET THIS TEXT**
}
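
The same XPath approach can be tried without a network request. The following is a minimal sketch, not part of the original answer: the helper name `extractText` and the inline sample markup are assumptions made for illustration, with the sample string standing in for whatever file_get_contents would return.

```php
<?php
// Hypothetical helper (not from the answer): returns the text of the first
// node matching an XPath query, or null when nothing matches.
function extractText(string $html, string $query): ?string
{
    // Collect libxml warnings about imperfect markup instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $loaded = $doc->loadHTML($html);

    // Discard the collected warnings and restore the previous error mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null;
    }

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    return $nodes->item(0)->textContent;
}

// Inline sample markup standing in for the result of file_get_contents($url).
$html = '<html><body><div name="changeable_text">**GET THIS TEXT**</div></body></html>';

$text = extractText($html, '//div[@name="changeable_text"]');
echo $text ?? 'not found', PHP_EOL;
```

Returning null for "no match" keeps the caller in charge of what to do when the page layout changes and the element disappears.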

Answer 2

Answered by Kai Mattern

You might want to have a look at the Simple HTML DOM Library.

There is a little tutorial here: http://www.developertutorials.com/tutorials/php/easy-screen-scraping-in-php-simple-html-dom-library-simplehtmldom-398/


It is a screen-scraping API that lets you feed HTML to it and then select parts of it with a jQuery-like syntax.


Answer 3

Answered by Celeritas

You're talking about data scraping: the act of extracting data from human-readable output. In your case, this is whatever is between the <div> tags. Use PHP's DOM extension to get to the tag you want and extract the data. Search Google for a PHP DOM tutorial.
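
As a rough illustration of what "get to the tag you want" can look like with the DOM extension alone, here is a sketch that walks every <div> and compares its name attribute. The helper name `divTextsByName` and the sample markup are assumptions made for this example, not something the answer specifies.

```php
<?php
// Hypothetical helper: collects the text of every <div> whose name attribute
// equals $name, using plain DOM traversal rather than XPath.
function divTextsByName(string $html, string $name): array
{
    // Keep libxml's markup warnings out of the output.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    foreach ($doc->getElementsByTagName('div') as $div) {
        // getAttribute() returns an empty string when the attribute is absent.
        if ($div->getAttribute('name') === $name) {
            $texts[] = trim($div->textContent);
        }
    }

    return $texts;
}

// Sample markup standing in for a downloaded page.
$sample = '<div name="other">skip me</div>'
        . '<div name="changeable_text">**GET THIS TEXT**</div>';

print_r(divTextsByName($sample, 'changeable_text'));
```

Returning an array rather than a single string makes it visible when a page contains more than one matching element.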


Answer 4

Answered by spiralclick

// file_get_html() comes from the Simple HTML DOM library, so include that library first
$elements = file_get_html('url will go here');

foreach ($elements->find('element') as $ele) {
    // traverse according to your preferences
}

// return or output
