Java/DOM: Get the XML content of a node
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors: StackOverflow
Original source: http://stackoverflow.com/questions/484995/
Question
I am parsing an XML file in Java using the W3C DOM. I am stuck on one specific problem: I can't figure out how to get the whole inner XML of a node.
The node looks like this:
<td><b>this</b> is a <b>test</b></td>
What function do I have to use to get this:
"<b>this</b> is a <b>test</b>"
Accepted answer by Pierre
You have to use the transform/XSLT API, using your <b> node as the node to be transformed, and put the result into a new StreamResult(new StringWriter()). See how-to-pretty-print-xml-from-java.
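As a rough illustration of the transform approach, here is a minimal sketch that runs an identity Transformer over each child of the node and collects the output in one StringWriter. The class name InnerXmlTransformer and the helper innerXmlOf are made up for this example, and it assumes the JDK's built-in transformer accepts element and text nodes as a DOMSource.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class InnerXmlTransformer {

    // Serializes every child of the given node, leaving out the node's own tags.
    static String innerXml(Node node) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        // Without this, each transform call would emit an <?xml ...?> header.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            transformer.transform(new DOMSource(children.item(i)), new StreamResult(out));
        }
        return out.toString();
    }

    // Hypothetical convenience helper: parse a string and return the root element's inner XML.
    static String innerXmlOf(String xml) {
        try {
            Node root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return innerXml(root);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(innerXmlOf("<td><b>this</b> is a <b>test</b></td>"));
    }
}
```

Transforming the children one by one, rather than the parent, is what keeps the enclosing tag out of the result.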
Answered by Joel P.
I know this was asked long ago, but for the next person searching (that was me today), this works with JDOM:
// Select the child nodes of the root <td>; "/td" alone would select the <td> element itself.
JDOMXPath xpath = new JDOMXPath("/td/node()");
String innerXml = new XMLOutputter().outputString(xpath.selectNodes(document));
This passes a list of all child nodes into outputString, which will serialize them out in order.
Answered by Kryštof Hilar
What do you think of this? I had the same problem today on Android, and I managed to write a simple serializer:
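// Returns the markup of all children of the given node, without the node's own tags.
private String innerXml(Node node) {
    StringBuilder sb = new StringBuilder();
    NodeList children = node.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
        sb.append(serializeNode(children.item(i)));
    }
    return sb.toString();
}

// Minimal serializer: handles elements, attributes and text only.
// Special characters (&, <, >, ") are not escaped.
private String serializeNode(Node node) {
    if (node.getNodeType() == Node.TEXT_NODE) return node.getNodeValue();
    StringBuilder sb = new StringBuilder("<").append(node.getNodeName());
    NamedNodeMap attributes = node.getAttributes();
    if (attributes != null) {
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attr = attributes.item(i);
            // The leading space separates each attribute from the tag name and from the previous attribute.
            sb.append(' ').append(attr.getNodeName())
              .append("=\"").append(attr.getNodeValue()).append('"');
        }
    }
    NodeList children = node.getChildNodes();
    if (children == null || children.getLength() == 0) {
        return sb.append("/>").toString();
    }
    sb.append('>');
    for (int i = 0; i < children.getLength(); i++) {
        sb.append(serializeNode(children.item(i)));
    }
    return sb.append("</").append(node.getNodeName()).append('>').toString();
}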
Answered by Jason S
er... you could also call toString() and just chop off the beginning and end tags, either manually or using regexps.
Edit: toString() doesn't do what I expected. I pulled out the O'Reilly Java & XML book, which talks about the Load and Save module of the Java DOM.
See in particular the LSSerializer, which looks very promising. You could either call writeToString(node) and chop off the beginning and end tags, as I suggested, or try to use an LSSerializerFilter to avoid printing the top node's tags (not sure if that would work; I admit I've never used LSSerializer before).
The O'Reilly book seems to suggest doing something like this:
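import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

// Look up a DOM implementation that supports the Load and Save ("LS") feature.
DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
DOMImplementationLS lsImpl =
    (DOMImplementationLS) registry.getDOMImplementation("LS");
LSSerializer serializer = lsImpl.createLSSerializer();
String nodeString = serializer.writeToString(node);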
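The "serialize, then chop off the tags" idea could be sketched as follows. This is an illustration under assumptions, not a tested recipe from the answer: the class LsInnerXml and the helper innerXmlOf are hypothetical names, and the string slicing assumes the serializer escapes any '>' inside attribute values, so that the first '>' really ends the start tag. The "xml-declaration" parameter is switched off so the output has no <?xml ...?> header.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

public class LsInnerXml {

    // Serializes the element with LSSerializer, then strips its start and end tags.
    static String innerXml(Node element) throws Exception {
        DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
        DOMImplementationLS lsImpl = (DOMImplementationLS) registry.getDOMImplementation("LS");
        LSSerializer serializer = lsImpl.createLSSerializer();
        // Suppress the XML declaration so the output starts at the element itself.
        serializer.getDomConfig().setParameter("xml-declaration", Boolean.FALSE);
        String outer = serializer.writeToString(element);

        int start = outer.indexOf('>') + 1;  // just past the start tag (assumes no raw '>' in attributes)
        int end = outer.lastIndexOf("</");   // beginning of the end tag
        if (end < start) {
            return "";                       // self-closing element such as <td/>
        }
        return outer.substring(start, end);
    }

    // Hypothetical convenience helper: parse a string and return the root element's inner XML.
    static String innerXmlOf(String xml) {
        try {
            Node root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return innerXml(root);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(innerXmlOf("<td><b>this</b> is a <b>test</b></td>"));
    }
}
```

Index-based slicing is used here instead of a regular expression because it only has to find the first '>' and the last "</".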
Answered by Jason S
node.getTextContent();
You ought to be using JDOM or dom4j to handle nodes, if for no other reason than to handle whitespace correctly.
Answered by javapowered
To remove unnecessary tags, code like this can probably be used:
DOMConfiguration config = serializer.getDomConfig();
config.setParameter("canonical-form", true);
But it will not always work, because support for "canonical-form=true" is optional, so an implementation may reject it.
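Because support is optional, one cautious pattern is to ask the configuration first with DOMConfiguration.canSetParameter and only set the parameter when it is accepted. The sketch below assumes an LS-capable DOM implementation is available. The class CanonicalFormCheck and its helper methods are hypothetical names for this example, and the code simply falls back to default serialization when canonical form is not supported.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

public class CanonicalFormCheck {

    // Serializes the node, enabling canonical form only if the implementation supports it.
    static String serialize(Node node) throws Exception {
        DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
        DOMImplementationLS lsImpl = (DOMImplementationLS) registry.getDOMImplementation("LS");
        LSSerializer serializer = lsImpl.createLSSerializer();
        DOMConfiguration config = serializer.getDomConfig();
        // Calling setParameter on an unsupported value would throw a DOMException, so check first.
        if (config.canSetParameter("canonical-form", Boolean.TRUE)) {
            config.setParameter("canonical-form", Boolean.TRUE);
        }
        return serializer.writeToString(node);
    }

    // Hypothetical convenience helper: parse a string and serialize its root element.
    static String serializeString(String xml) {
        try {
            Node root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return serialize(root);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serializeString("<td><b>this</b> is a <b>test</b></td>"));
    }
}
```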

