How to get the HTML code of a web page in PHP?

Question

Asked by Prashant

I want to retrieve the HTML code of a link (web page) in PHP. For example, if the link is

then I want the HTML code of the page which is served. I want to retrieve this HTML code and store it in a PHP variable.

How can I do this?

Answer 1

Answered by Greg

If your PHP server allows URL fopen wrappers, then the simplest way is:

$html = file_get_contents('https://stackoverflow.com/questions/ask');

If you need more control, then you should look at the cURL functions:

$c = curl_init('https://stackoverflow.com/questions/ask');
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
//curl_setopt(... other options you want...)

$html = curl_exec($c);

if (curl_error($c))
    die(curl_error($c));

// Get the status code
$status = curl_getinfo($c, CURLINFO_HTTP_CODE);

curl_close($c);
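
For the simpler file_get_contents() route, it helps to know that the function returns false on failure and otherwise relies on PHP's default socket timeout. A minimal sketch, assuming a hypothetical helper named fetch_html() and an arbitrary 10-second timeout (neither comes from the answer above), might look like this:

```php
<?php
// Hypothetical helper (not from the original answer): fetch a URL or file with
// file_get_contents() and return null instead of false on failure.
function fetch_html(string $url, float $timeoutSeconds = 10.0): ?string
{
    // Options for the http/https stream wrappers; other wrappers ignore them.
    $context = stream_context_create([
        'http' => [
            'timeout'       => $timeoutSeconds, // read timeout in seconds
            'ignore_errors' => false,           // 4xx/5xx responses count as failures
        ],
    ]);

    // @ suppresses the warning PHP raises on failure; the result is checked instead.
    $html = @file_get_contents($url, false, $context);

    return $html === false ? null : $html;
}

// Example call (requires allow_url_fopen and network access):
// $html = fetch_html('https://stackoverflow.com/questions/ask');
// if ($html === null) { echo "Could not fetch the page\n"; }
```

Returning null rather than false lets callers use a strict comparison without confusing an empty page ('') with a failed request.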

Answer 2

Answered by Dmitri Pisarev

Also, if you want to manipulate the retrieved page somehow, you might want to try a PHP DOM parser. I find PHP Simple HTML DOM Parser very easy to use.

Answer 3

Answered by Ickmund

You may want to check out the YQL libraries from Yahoo: http://developer.yahoo.com/yql

The task at hand is as simple as:

select * from html where url = 'http://stackoverflow.com/questions/ask'

You can try this out in the console at: http://developer.yahoo.com/yql/console (requires login)

Also see Chris Heilmann's screencast for some nice ideas on what more you can do: http://developer.yahoo.net/blogs/theater/archives/2009/04/screencast_collating_distributed_information.html

Answer 4

Answered by Stefan Gehrig

Simple way: Use file_get_contents():

$page = file_get_contents('http://stackoverflow.com/questions/ask');

Please note that allow_url_fopen must be true in your php.ini to be able to use URL-aware fopen wrappers.

More advanced way: If allow_url_fopen is disabled, you cannot change your PHP configuration, and ext/curl is installed, use the cURL library to connect to the desired page.
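
The choice between the two approaches can be checked at runtime. The following is only a sketch: pick_fetch_method() is a hypothetical name introduced for illustration, and it looks at nothing beyond the allow_url_fopen setting and whether the cURL extension is loaded.

```php
<?php
// Hypothetical helper: report which fetching approach this PHP installation supports.
function pick_fetch_method(): string
{
    // ini_get() returns a string such as "1" or "", so normalise it to a boolean.
    $urlFopenEnabled = filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);

    if ($urlFopenEnabled) {
        return 'file_get_contents'; // URL-aware fopen wrappers are available
    }
    if (extension_loaded('curl')) {
        return 'curl';              // fall back to ext/curl
    }
    return 'none';                  // neither approach is available
}

echo pick_fetch_method(), "\n";
```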

Answer 5

Answered by piglot

You could use file_get_contents if you want to store the source in a variable; however, cURL is better practice.

$url = file_get_contents('http://example.com');
echo $url;

This solution will display the web page on your site. However, cURL is a better option.

Answer 6

Answered by sarath

include_once('simple_html_dom.php');
$url="http://stackoverflow.com/questions/ask";
$html = file_get_html($url);

You can get the whole HTML code in parsed form (a simple_html_dom object) using this code. Download the 'simple_html_dom.php' file here: http://sourceforge.net/projects/simplehtmldom/files/simple_html_dom.php/download

Answer 7

Answered by Sergei

Look at this function:

http://ru.php.net/manual/en/function.file-get-contents.php

Answer 8

Answered by T.Todua

Here are two different, simple ways to get content from a URL:

1) The first method

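Enable allow_url_fopen on your hosting (in php.ini or wherever your host exposes it)

<?php
// file_get_contents() returns the page as a string. readfile() would print the
// page directly and return only the number of bytes read.
$variableee = file_get_contents("http://example.com/");
echo $variableee;
?>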

or

2) The second method

Enable php_curl, php_imap and php_openssl

<?php
// You can add other cURL options too;
// see http://php.net/manual/en/function.curl-setopt.php
function get_dataa($url) {
  $ch = curl_init();
  $timeout = 5; // connection timeout in seconds
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)");
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
  // The next two lines turn off TLS certificate checks; remove them unless you
  // really need to fetch from hosts with invalid certificates.
  curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
  curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
  curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
  curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects, up to CURLOPT_MAXREDIRS
  curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
  $data = curl_exec($ch); // false on failure
  curl_close($ch);
  return $data;
}

$variableee = get_dataa('http://example.com');
echo $variableee;
?>

Answer 9

Answered by Krishnamoorthy Acharya

You can also use the DOMDocument class to get the value of an individual HTML tag:

$homepage = file_get_contents('https://www.example.com/');
$doc = new DOMDocument;
$doc->loadHTML($homepage);
$titles = $doc->getElementsByTagName('h3');
echo $titles->item(0)->nodeValue;
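
Real-world pages are rarely valid markup, so loadHTML() tends to emit libxml warnings, and item(0) is null when no matching tag exists. A minimal sketch that collects the text of every matching tag is shown here; extract_headings() is a hypothetical helper name, and the inline sample string stands in for a fetched page.

```php
<?php
// Hypothetical helper: return the trimmed text of every <$tag> element in $html.
function extract_headings(string $html, string $tag = 'h3'): array
{
    if ($html === '') {
        return []; // loadHTML() rejects an empty string
    }

    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    foreach ($doc->getElementsByTagName($tag) as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

// Inline sample markup used in place of a downloaded page.
$sample = '<html><body><h3>First</h3><p>text</p><h3>Second</h3></body></html>';
print_r(extract_headings($sample));
```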

Answer 10

Answered by Ken

$output = file("http://www.example.com"); didn't work until I enabled allow_url_fopen, allow_url_include, and file_uploads in php.ini for PHP 7.
