PHP: Using SimpleXML to read an RSS feed
Original question: http://stackoverflow.com/questions/4887300/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Using SimpleXML to read RSS feed
Asked by geoffs3310
I am using PHP and SimpleXML to read the following RSS feed:
http://feeds.bbci.co.uk/news/england/rss.xml
I can get most of the information I want like so:
$rss = simplexml_load_file('http://feeds.bbci.co.uk/news/england/rss.xml');
echo '<h1>'. $rss->channel->title . '</h1>';
foreach ($rss->channel->item as $item) {
echo '<h2><a href="'. $item->link .'">' . $item->title . "</a></h2>";
echo "<p>" . $item->pubDate . "</p>";
echo "<p>" . $item->description . "</p>";
}
But how would I output the thumbnail image that is in the following tag:
<media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/51078000/jpg/_51078953_226alanpotbury.jpg"/>
Accepted answer by Björn
SimpleXML is pretty bad at handling namespaces. You have two choices. The simplest hack is to read the contents of the feed into a string and strip the namespace prefix:
$feed = file_get_contents('http://feeds.bbci.co.uk/news/england/rss.xml');
$feed = str_replace('<media:', '<', $feed);
$rss = simplexml_load_string($feed);
...
Now you can access the thumbnail element directly.
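As a rough, self-contained sketch of the string-replacement hack, the snippet below swaps the live BBC feed for a small inline sample (the sample data and the example.com URLs are made up for illustration). It also strips the prefix from closing tags, which the snippet above does not do; that is an added assumption so that prefixed elements with separate closing tags would still parse.

```php
<?php
// Hypothetical inline sample standing in for the BBC feed (one item only).
$feed = <<<XML
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>Sample story</title>
      <link>http://example.com/story</link>
      <media:thumbnail width="66" height="49" url="http://example.com/thumb.jpg"/>
    </item>
  </channel>
</rss>
XML;

// Strip the "media:" prefix from opening and closing tags so the elements
// end up in the same (empty) namespace as <item>.
$feed = str_replace(['<media:', '</media:'], ['<', '</'], $feed);
$rss = simplexml_load_string($feed);

$thumbUrls = [];
foreach ($rss->channel->item as $item) {
    // thumbnail is now a plain child, so -> and ['url'] work as usual.
    if (isset($item->thumbnail)) {
        $thumbUrls[] = (string) $item->thumbnail['url'];
    }
}

echo implode("\n", $thumbUrls), "\n";
```

Keep in mind that this approach can collide with unprefixed elements of the same name (for example a media:title next to a plain title), which is one reason the namespace-aware method is usually safer.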
The more elegant (not really) method is to find out what URI the namespace uses. If you look at the source of http://feeds.bbci.co.uk/news/england/rss.xml you will see that the media prefix points to http://search.yahoo.com/mrss/.
Now you can pass this URI to the children() method of a SimpleXMLElement to get the contents of the media:thumbnail element:
$rss = simplexml_load_file('http://feeds.bbci.co.uk/news/england/rss.xml');
foreach ($rss->channel->item as $item) {
$media = $item->children('http://search.yahoo.com/mrss/');
...
}
Answer by Josh Davis
As you already know, SimpleXML lets you select a node's child using the object property operator -> or a node's attribute using the array access ['name']. It's great, but the operation only works if what you select belongs to the same namespace.
If you want to "hop" from one namespace to another, you can use the children() or attributes() methods. In your case, this is made a bit trickier because you have <item/> in the global namespace, the node you're looking for is in the "media" namespace*, and then the attributes are in the global namespace again (they are not prefixed). So using the normal object/array notation you'll have to "hop" twice:
foreach ($rss->channel->item as $item)
{
// we load the attributes into $thumbAttr
// you can either use the namespace prefix
$thumbAttr = $item->children('media', true)->thumbnail->attributes();
// or preferably the namespace name, read note below for an explanation
$thumbAttr = $item->children('http://search.yahoo.com/mrss/')->thumbnail->attributes();
echo $thumbAttr['url'];
}
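The two hops can be tried without network access using the sketch below. The inline feed, the example.com URL, and the MEDIA_NS constant are hypothetical names introduced for illustration; the isset() guard is an added assumption to cover items that carry no thumbnail.

```php
<?php
// Namespace name (URI) as declared by xmlns:media in the feed.
const MEDIA_NS = 'http://search.yahoo.com/mrss/';

// Hypothetical sample data: one item with a thumbnail, one without.
$xml = <<<XML
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>With thumbnail</title>
      <media:thumbnail width="66" height="49" url="http://example.com/a.jpg"/>
    </item>
    <item>
      <title>Without thumbnail</title>
    </item>
  </channel>
</rss>
XML;

$rss = simplexml_load_string($xml);

$results = [];
foreach ($rss->channel->item as $item) {
    // Without a hop, -> only sees children in the global namespace.
    $plainCount = count($item->thumbnail);

    // First hop: into the media namespace to reach <media:thumbnail>.
    $media = $item->children(MEDIA_NS);

    $url = null;
    if (isset($media->thumbnail)) {
        // Second hop: back to the unprefixed attributes of that element.
        $attrs = $media->thumbnail->attributes();
        $url = (string) $attrs['url'];
    }

    $results[] = [(string) $item->title, $plainCount, $url];
}

foreach ($results as [$title, $plainCount, $url]) {
    echo $title, ': ', $url ?? '(no thumbnail)', "\n";
}
```

Casting to string matters here: SimpleXML returns element objects, and the cast extracts the attribute's text value.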
*Note
I refer to the namespace as the "media" namespace but that's not really correct. The namespace name is http://search.yahoo.com/mrss/, and "media" is just a prefix, some sort of alias if you will. What's important to keep in mind is that http://search.yahoo.com/mrss/ is the real name of the namespace. At some point, your RSS provider might decide to change the prefix to, say, "yahoo", and your script will stop working if it refers to the "media" prefix. However, if you use the namespace name, it will keep working no matter the prefix.
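To illustrate the point about prefixes, here is a small sketch using a hypothetical feed that binds the same namespace name to the prefix "yahoo" (the feed content and example.com URL are invented). Under that assumption, the prefix-based lookup should find nothing while the lookup by namespace name should still find the thumbnail.

```php
<?php
// Hypothetical feed: same namespace name, but bound to the prefix "yahoo".
$xml = <<<XML
<rss version="2.0" xmlns:yahoo="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Renamed prefix</title>
      <yahoo:thumbnail width="66" height="49" url="http://example.com/b.jpg"/>
    </item>
  </channel>
</rss>
XML;

$rss = simplexml_load_string($xml);
$item = $rss->channel->item[0];

// Lookup by prefix: no element in this document uses the "media" prefix.
$byPrefix = $item->children('media', true);
$foundByPrefix = isset($byPrefix->thumbnail);

// Lookup by namespace name: independent of whichever prefix the feed uses.
$byName = $item->children('http://search.yahoo.com/mrss/');
$foundByName = isset($byName->thumbnail);

$url = null;
if ($foundByName) {
    $attrs = $byName->thumbnail->attributes();
    $url = (string) $attrs['url'];
}

var_dump($foundByPrefix, $foundByName, $url);
```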