How to extract title and meta description using PHP Simple HTML DOM Parser?

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/11385774/

php html parsing dom simpledom

Asked by Henry The Least

How can I extract a page's title and meta description using the PHP Simple HTML DOM Parser?

I just need the title of the page and the keywords in plain text.

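Accepted answer by Ryan B

I just took a look at the HTML DOM Parser; try:

require_once 'simple_html_dom.php'; // adjust the path to wherever the library file lives
$html = new simple_html_dom();
$html->load_file('xxx'); // put a URL or filename in place of xxx

// find() returns an array unless an index is given, so ask for the first match
$title = $html->find('title', 0);
echo $title->plaintext;

// the description lives in the content attribute of <meta name="description">
$descr = $html->find('meta[name=description]', 0);
echo $descr->content;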

Answer by Faraona

$html = new simple_html_dom();
$html->load_file('some_url'); 

//To get Meta Title
$meta_title = $html->find("meta[name='title']", 0)->content;

//To get Meta Description
$meta_description = $html->find("meta[name='description']", 0)->content;

//To get Meta Keywords
$meta_keywords = $html->find("meta[name='keywords']", 0)->content;

NOTE: The names of meta tags are case sensitive!

Answer by chuck911

$html = new simple_html_dom();
$html->load_file('http://www.google.com'); 
$title = $html->find('title',0)->innertext;

$html->find('title') will return an array,

so you should use $html->find('title', 0). The same goes for the meta description.

Answer by Innate

Taken from LeviXC's solution; you need to use the simple html dom class:

$dom = new simple_html_dom();
$dom->load_file( 'websiteurl.com' );// put your own url in here for testing
$html = str_get_html($dom);
$descr = $html->find("meta[name=description]", 0);
$description = $descr->content;
echo $description;

I have tested this code and yes, it is case sensitive (some meta tags use a capital D for description).

Here is some error checking that covers both spellings:

if( is_object( $html->find("meta[name=description]", 0)) ){
    echo $html->find("meta[name=description]", 0)->content;
} elseif( is_object( $html->find("meta[name=Description]", 0)) ){
    echo $html->find("meta[name=Description]", 0)->content;
}
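
The check above only covers two spellings. As a rough sketch of a more general approach, the following compares the name attribute case-insensitively. Note that it uses PHP's built-in DOMDocument rather than Simple HTML DOM, so that it runs without the library; the helper name find_meta_content and the sample markup are made up for illustration.

```php
<?php
// Hypothetical helper (not part of any library): returns the content of the
// first <meta> whose name matches $name regardless of case, or null if none.
function find_meta_content(string $html, string $name): ?string
{
    $doc = new DOMDocument();
    // Suppress the warnings that not-quite-valid real-world HTML tends to trigger.
    @$doc->loadHTML($html);

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        // strcasecmp() returns 0 when the strings are equal ignoring case.
        if (strcasecmp($meta->getAttribute('name'), $name) === 0) {
            return $meta->getAttribute('content');
        }
    }
    return null;
}

// Sample markup (an assumption for illustration) that uses a capital D.
$page = '<html><head><title>Demo</title>'
      . '<meta name="Description" content="A short summary">'
      . '</head><body></body></html>';

echo find_meta_content($page, 'description'), "\n";
```

The same idea should carry over to Simple HTML DOM by looping over $html->find('meta') and comparing each element's name attribute with strcasecmp().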

Answer by Алексей Склейнов

$html->find('meta[name=keywords]',0)->attr['content'];
$html->find('meta[name=description]',0)->attr['content'];

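Answer by liuqing

$html = new simple_html_dom();
$html->load_file('xxx'); // put a URL or filename in place of xxx

// array_shift() takes its argument by reference, so store the result of find() first
$titles = $html->find('title');
$title = array_shift($titles)->innertext;
echo $title;

$metas = $html->find("meta[name='description']");
$descr = array_shift($metas)->content;
echo $descr;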

Answer by Sinatrya Yogi Rizal

You can also use plain PHP code, which is simple to understand, like this:

$result = 'site.com';
$tags = get_meta_tags("html/".$result);
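
For context, get_meta_tags() is a PHP built-in that reads a file or URL and returns an associative array of the meta tags found in the head, keyed by the lowercased name attribute. It does not return the page title. Below is a minimal sketch, assuming a small sample page written to a temporary file; the markup is invented for illustration.

```php
<?php
// Write a small sample page (made up for illustration) to a temporary file,
// because get_meta_tags() reads from a filename or URL, not from a string.
$path = tempnam(sys_get_temp_dir(), 'meta');
file_put_contents(
    $path,
    "<html>\n<head>\n<title>Demo</title>\n"
    . "<meta name=\"Description\" content=\"A short summary\">\n"
    . "<meta name=\"keywords\" content=\"php, html, dom\">\n"
    . "</head>\n<body></body>\n</html>\n"
);

// Keys in the returned array are the meta names converted to lowercase,
// so "Description" in the markup is read back as 'description'.
$tags = get_meta_tags($path);
unlink($path); // clean up the temporary file

echo $tags['description'], "\n";
echo $tags['keywords'], "\n";
```

Because the keys are lowercased, this approach sidesteps the capitalization issue mentioned in the other answers.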

Answer by LeviXC

The correct answer is:

$html = str_get_html($html);
$descr = $html->find("meta[name=description]", 0);
$description = $descr->content;

The above code parses the HTML string into an object. The find method then looks for a meta tag whose name is description. Finally, you need to return the value of the meta tag's content attribute, not the innertext or plaintext as outlined by others.

This has been tested and used in live code.

Answer by hitman47

I found an easy way to get the description:

$html = new simple_html_dom();
$html->load_file('your_url');
// load() parses a string, so use find() to select elements instead
$title = $html->find('title', 0)->plaintext; //<title>**Text from here**</title>
$description = $html->find("meta[name='description']", 0)->content; //<meta name="description" content="**Text from here**">

If your line contains extra spaces, then try this:

$title = trim($title);
$description = trim($description);