Parsing XML CDATA with PHP
Original question: http://stackoverflow.com/questions/1246732/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors and link to the original question.
Asked by Helen Neely
I have a little problem that I can't figure out how to solve. I have an XML (actually RSS) file that I'm trying to parse with PHP, but the CDATA sections come out blank.
Here's the XML code and here's the PHP file.
Everything works fine, except that the description tag is not printing. I would be very grateful if someone could help.
Answered by Pascal MARTIN
Just out of curiosity, after getting your XML (I hope I didn't destroy it in the process -- I'll see if I can edit the OP to correct it):
- did you cast the description to a string?
What I mean is you could use this:
$xml = simplexml_load_string($str);
foreach ($xml->channel->item as $item) {
    var_dump($item->description);
}
But that would only give you this:
object(SimpleXMLElement)[5]
object(SimpleXMLElement)[3]
Which is not that nice...
You need to cast the data to a string, like this:
$xml = simplexml_load_string($str);
foreach ($xml->channel->item as $item) {
    var_dump((string)$item->description);
}
And you get the descriptions:
string '
This is one of the content that I need printed on the screen, but nothing is happening. Please, please...output something... <br /><br /> <b>Showing</b>: 2 weeks<br /> <b>Starting On</b>: August 7, 2009 <br /> <b>Posted On</b>: August 7, 2009 <br />
<a href="http://www.mysite.com">click to view</a>
' (length=329)
string '
Another content...This is another of the content that I need printed on the screen, but nothing is happening. Please, please...output something... <br /><br /> <b>Showing</b>: 2 weeks<br /> Starting On: August 7, 2009 <br /> <b>Posted On</b>: August 7, 2009
;
' (length=303)
(Using trim on those might prove useful, by the way, if your XML is indented.)
Otherwise... well, we'll probably need your PHP code (at the very least, it would be useful to know how you are getting to the description tag ;-))
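The cast-and-trim approach above can be sketched end to end as follows. The feed content here is hypothetical sample data (the original XML is not shown), and the $descriptions array is only there to make the result easy to inspect.

```php
<?php
// Hypothetical sample feed standing in for the real RSS data.
$str = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>First item</title>
      <description>
        <![CDATA[Some <b>HTML</b> content]]>
      </description>
    </item>
  </channel>
</rss>
XML;

$xml = simplexml_load_string($str);

$descriptions = [];
foreach ($xml->channel->item as $item) {
    // The cast turns the SimpleXMLElement into its text content, CDATA included;
    // trim() drops the whitespace that comes from the indentation.
    $descriptions[] = trim((string) $item->description);
}

print_r($descriptions);
```

Without the cast, each $item->description stays a SimpleXMLElement object, which is what the first var_dump above was showing.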
EDIT
Thanks for the reformatted XML!
If I go to pastebin, in the textarea at the bottom of the page, there is whitespace at the beginning of the XML, before the <?xml version="1.0" encoding="utf-8"?> declaration.
If you have that whitespace in your real XML data, it will be a source of problems: it is not valid XML (the XML declaration has to be the first thing in the XML data).
You'll get errors like this one:
Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 1: parser error : XML declaration allowed only at the start of the document
Can you check that?
And, if the problem is here, you should activate error_reporting and display_errors ;-) That would help!
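A minimal sketch of that check, assuming the XML is already in a string: it turns on error display, collects the libxml errors, and shows that stripping the leading whitespace lets the same data parse. The feed content and the $broken and $fixed variable names are made up for the example.

```php
<?php
// Show every notice and warning while debugging (not something to keep in production).
error_reporting(E_ALL);
ini_set('display_errors', '1');

// Collect libxml errors so they can be inspected instead of emitted as PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical input: note the space before the XML declaration.
$str = ' <?xml version="1.0" encoding="utf-8"?><rss><channel><title>t</title></channel></rss>';

$broken = simplexml_load_string($str);
if ($broken === false) {
    foreach (libxml_get_errors() as $error) {
        echo 'Line ', $error->line, ': ', trim($error->message), PHP_EOL;
    }
    libxml_clear_errors();
}

// Removing the leading whitespace puts the declaration at the very start again.
$fixed = simplexml_load_string(ltrim($str));
echo (string) $fixed->channel->title, PHP_EOL;
```

If the whitespace comes from the file itself, removing it at the source is cleaner than calling ltrim() on every load.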
EDIT after taking a look at the PHP file:
In your for loop, you are doing this to get your description data:
$item_desc = $x->item($i)->getElementsByTagName('description')->item(0)->childNodes->item(0)->nodeValue;
description doesn't contain any childNode, I'd say; what about using its nodeValue directly?
Like this:
$item_desc = $x->item($i)->getElementsByTagName('description')->item(0)->nodeValue;
It seems to work better this way :-)
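One possible explanation for the difference, assuming the description has whitespace before its CDATA section (as the indented output earlier suggests): the first child of description would then be a whitespace-only text node, and the CDATA section only comes after it. The sketch below uses hypothetical sample data to show what each access returns.

```php
<?php
// Hypothetical item: a newline sits between <description> and the CDATA section.
$xml = "<item><description>\n<![CDATA[Hello <b>world</b>]]>\n</description></item>";

$doc = new DOMDocument();
$doc->loadXML($xml);

$description = $doc->getElementsByTagName('description')->item(0);

// First child: a text node that holds only the newline.
$first = $description->childNodes->item(0);
var_dump($first->nodeType === XML_TEXT_NODE);   // whitespace text node
var_dump(trim($first->nodeValue) === '');       // nothing printable in it

// Second child: the CDATA section itself.
$second = $description->childNodes->item(1);
var_dump($second->nodeType === XML_CDATA_SECTION_NODE);

// nodeValue on the element concatenates all of its text, CDATA included.
$value = trim($description->nodeValue);
echo $value, PHP_EOL;
```

Reading nodeValue on the element itself avoids depending on which child happens to come first.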
As a side note, you could probably do the same for the other tags, I suppose; for instance, this seems to work too:
$item_title=$x->item($i)->getElementsByTagName('title')->item(0)->nodeValue;
$item_link=$x->item($i)->getElementsByTagName('link')->item(0)->nodeValue;
What does this give you?
Another EDIT: here is the code I would probably use:
$xmlDoc = new DOMDocument();
$xmlDoc->loadXML($str); // I changed that because I have the XML data in a string

// get elements from "<channel>"
$channel = $xmlDoc->getElementsByTagName('channel')->item(0);
$channel_title = $channel->getElementsByTagName('title')->item(0)->nodeValue;
$channel_link = $channel->getElementsByTagName('link')->item(0)->nodeValue;
$channel_desc = $channel->getElementsByTagName('description')->item(0)->nodeValue;

// output elements from "<channel>"
echo "<p><a href='" . $channel_link . "'>" . $channel_title . "</a>";
echo "<br />";
echo $channel_desc . "</p>";

// get and output "<item>" elements
$x = $xmlDoc->getElementsByTagName('item');
for ($i = 0; $i <= 1; $i++) {
    $item_title = $x->item($i)->getElementsByTagName('title')->item(0)->nodeValue;
    $item_link = $x->item($i)->getElementsByTagName('link')->item(0)->nodeValue;
    $item_desc = $x->item($i)->getElementsByTagName('description')->item(0)->nodeValue;

    echo "<p><a href='" . $item_link . "'>" . $item_title . "</a>";
    echo "<br />";
    echo $item_desc . "</p>";
    echo ' <p />';
}
Note that I have the XML data in a string and don't need to fetch it from a URL, so I'm using the loadXML method and not load.
The major difference is that I removed some childNodes accesses that I feel were not necessary.
Does this seem OK to you?
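As a variation on the code above (not part of the original answer), the item loop could also be written with foreach, so it does not depend on the feed having exactly two items. The firstTagText helper and the sample data are hypothetical, introduced only for this sketch.

```php
<?php
// Hypothetical helper: text of the first <$tag> descendant, or '' if there is none.
function firstTagText(DOMElement $parent, string $tag): string
{
    $node = $parent->getElementsByTagName($tag)->item(0);
    return $node === null ? '' : trim($node->nodeValue);
}

// Hypothetical sample data standing in for the real feed.
$str = '<rss><channel><title>Feed</title>'
     . '<item><title>One</title><link>http://example.com/1</link>'
     . '<description><![CDATA[First <b>item</b>]]></description></item>'
     . '<item><title>Two</title><link>http://example.com/2</link>'
     . '<description><![CDATA[Second item]]></description></item>'
     . '</channel></rss>';

$xmlDoc = new DOMDocument();
$xmlDoc->loadXML($str);

$items = [];
// DOMNodeList is traversable, so foreach visits every <item> element.
foreach ($xmlDoc->getElementsByTagName('item') as $item) {
    $items[] = [
        'title' => firstTagText($item, 'title'),
        'link'  => firstTagText($item, 'link'),
        'desc'  => firstTagText($item, 'description'),
    ];
}

foreach ($items as $entry) {
    // Title and link are escaped; the description is printed as-is because it holds HTML.
    echo "<p><a href='" . htmlspecialchars($entry['link'], ENT_QUOTES) . "'>"
        . htmlspecialchars($entry['title']) . "</a><br />"
        . $entry['desc'] . "</p>\n";
}
```

Printing the description unescaped is only reasonable if the feed is trusted; that is an assumption of this sketch.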

