Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me), citing the original StackOverflow URL: http://stackoverflow.com/questions/5608644/

Time: 2020-08-25 21:58:23  Source: igfitidea

Getting content of a website via PHP

php

Asked by user663049

How do I get the content of a page via PHP? How do I grab the text of a blog post? Most RSS feeds only give the link to the article, so I can't use them. Is there a PHP function for this, or any way of doing this? Please offer some suggestions :).

Answered by Eric

To just load a page, HTML and all, you can call file_get_contents on the web address (this relies on PHP's URL fopen wrappers, so allow_url_fopen must be enabled):

$page = file_get_contents('http://www.blog.com/one-example-post');
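
The one-liner above returns false if the request fails, so it is worth checking the result before using it. A minimal sketch of that, assuming a hypothetical helper name (fetch_page) and an arbitrary 10-second default timeout, neither of which comes from the original answer:

```php
<?php
// Hypothetical helper (not part of the original answer): fetch a page and
// return null instead of false when the request fails.
function fetch_page(string $url, float $timeoutSeconds = 10.0): ?string
{
    // The 'http' context options are used for both http:// and https:// URLs.
    $context = stream_context_create([
        'http' => ['timeout' => $timeoutSeconds],
    ]);

    // @ suppresses the warning PHP emits on failure; the return value is checked instead.
    $html = @file_get_contents($url, false, $context);

    return $html === false ? null : $html;
}
```

You would call it as $page = fetch_page('http://www.blog.com/one-example-post'); and treat a null result as "could not load the page".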

For more advanced handling of web pages, the cURL library gives you finer control over how the request is made (for example, when the server requires HTTP authentication, or when the page is served over HTTPS).
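
As a rough sketch of the cURL route, assuming the curl extension is installed: the helper name fetch_with_curl, the timeouts, and the choice to treat HTTP status 400 and above as failure are all illustrative assumptions rather than anything the answer specifies.

```php
<?php
// Hypothetical helper: fetch a URL with cURL, optionally using HTTP Basic auth.
// Returns the response body, or null on a transport error or an HTTP status >= 400.
function fetch_with_curl(string $url, ?string $user = null, ?string $password = null): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_CONNECTTIMEOUT => 5,    // seconds to wait for the connection
        CURLOPT_TIMEOUT        => 15,   // seconds allowed for the whole request
    ]);

    // Only send credentials when the caller supplied a user name.
    if ($user !== null) {
        curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
        curl_setopt($ch, CURLOPT_USERPWD, $user . ':' . ($password ?? ''));
    }

    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    if ($body === false || $status >= 400) {
        return null;
    }
    return $body;
}
```

HTTPS URLs need no extra options here; certificate verification is left at cURL's defaults.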

Once you have the contents of the page, though, you're probably going to need to do some screen scraping (aka web scraping)... and you're in luck, since I just did this for another project. Here's a great library that I uncovered to help with this down-and-dirty technique. Good luck.
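
The library the answer links to is not preserved on this page, so as a stand-in, here is a minimal scraping sketch using PHP's built-in DOM extension instead. The helper name extract_paragraphs and the assumption that the post's text lives in <p> elements are illustrative only; a real blog will likely need a more specific XPath query.

```php
<?php
// Hypothetical helper: pull the text of every non-empty <p> element out of an HTML string.
// Note: $html must not be an empty string, which DOMDocument::loadHTML rejects.
function extract_paragraphs(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parse warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $paragraphs = [];
    foreach ($xpath->query('//p') as $node) {
        // textContent flattens nested tags such as <b> or <a> into plain text.
        $text = trim($node->textContent);
        if ($text !== '') {
            $paragraphs[] = $text;
        }
    }
    return $paragraphs;
}
```

Passing the string returned by file_get_contents (or cURL) to extract_paragraphs gives an array of paragraph strings that can be joined back into the article text.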

Answered by StackOverflowNewbie

cURL is an option, especially if you need your application to behave like a browser (e.g. set a User-Agent, etc.). You can also use file_get_contents (see: http://php.net/manual/en/function.file-get-contents.php), which is good enough for simple applications.
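
To illustrate the "behave like a browser" point, here is a small sketch of cURL options that set a User-Agent. The helper name browser_like_options, the Accept-Language value, and any User-Agent string you pass in are assumptions for illustration; the answer does not prescribe specific values.

```php
<?php
// Hypothetical helper: cURL options that make a request look more like a browser's.
// The User-Agent string is whatever the caller chooses; the answer does not name one.
function browser_like_options(string $userAgent): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,       // follow redirects as a browser would
        CURLOPT_USERAGENT      => $userAgent, // sent as the User-Agent request header
        CURLOPT_ENCODING       => '',         // accept any compression cURL supports
        CURLOPT_HTTPHEADER     => ['Accept-Language: en-US,en;q=0.8'],
    ];
}
```

Usage would look like: $ch = curl_init($url); curl_setopt_array($ch, browser_like_options('MyFeedReader/1.0')); $html = curl_exec($ch); where 'MyFeedReader/1.0' is a made-up example value.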
