Java: Use jsoup to parse XML - prevent jsoup from "cleaning" <link> tags
Original question: http://stackoverflow.com/questions/6722307/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by Ethan
In most cases, I have no problem using jsoup to parse XML. However, if there are <link> tags in the XML document, jsoup changes <link>some text here</link> to <link />some text here. This makes it impossible to extract the text inside the <link> tag using a CSS selector.
So how can I prevent jsoup from "cleaning" <link> tags?
Answered by Jonathan Hedley
In jsoup 1.6.2 I added an XML parser mode, which parses the input as-is, without applying the HTML5 parse rules (contents of element, document structure, etc.). This mode keeps the text inside a <link> tag and allows multiple <link> elements, among other things.
Here's an example:
String xml = "<link>One</link><link>Two</link>";
Document xmlDoc = Jsoup.parse(xml, "", Parser.xmlParser());
Elements links = xmlDoc.select("link");
System.out.println("Link text 1: " + links.get(0).text());
System.out.println("Link text 2: " + links.get(1).text());
This prints:
Link text 1: One
Link text 2: Two
Answered by Nowaker
Do not store any text inside a <link> element; in HTML that is invalid. If you need extra information, keep it in HTML5 data-* attributes. I'm sure jsoup won't touch those.
<link rel="..." data-city="Warsaw" />
Answered by Vinay Lodha
There is a workaround for this: before passing the XML to jsoup, transform it so that every <link> tag is replaced with some dummy tag, then do what you want to do with the result.
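The renaming workaround could be sketched roughly as follows, using only the Java standard library. This is a minimal sketch, not the answerer's own code: the class name LinkTagRenamer, the method renameLinkTags, and the placeholder tag name xlink are all hypothetical, chosen here for illustration. It also assumes a plain regex rewrite is acceptable for your input; a regex does not understand XML, so it would rewrite matches inside comments or CDATA sections as well.

```java
import java.util.regex.Pattern;

public class LinkTagRenamer {
    // Hypothetical placeholder name (an assumption for this sketch);
    // pick any tag name that does not already occur in your XML.
    static final String DUMMY_TAG = "xlink";

    // Matches "<link" or "</link" only when followed by whitespace, "/" or ">",
    // so longer names such as <linkage> are left untouched.
    private static final Pattern LINK_TAG = Pattern.compile("<(/?)link(?=[\\s/>])");

    // Renames opening, closing, and self-closing <link> tags to the dummy tag.
    public static String renameLinkTags(String xml) {
        return LINK_TAG.matcher(xml).replaceAll("<$1" + DUMMY_TAG);
    }

    public static void main(String[] args) {
        String xml = "<item><link>One</link><link rel=\"x\">Two</link><linkage/></item>";
        // prints: <item><xlink>One</xlink><xlink rel="x">Two</xlink><linkage/></item>
        System.out.println(renameLinkTags(xml));
    }
}
```

The renamed string can then be handed to jsoup, and the text selected with the dummy tag name (for example select("xlink")) instead of link.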