Java: Remove empty nodes from XML recursively
Source: StackOverflow, http://stackoverflow.com/questions/12524727/ (content licensed under CC BY-SA 4.0; attribution belongs to the original authors)
Asked by Dheeraj Joshi
I want to delete the empty nodes from an XML element. This XML is generated by a vendor, and I have no control over how it is generated. Since the XML contains a few empty nodes, I need to delete those empty nodes recursively.
This XML is obtained from an OMElement, and I get an Element from this object using XMLUtils. Sample XML:
<A>
  <B>
    <C>
      <C1>
        <C11>something</C11>
        <C12>something</C12>
      </C1>
    </C>
    <D>
      <D1>
        <D11>
          <D111 operation="create">
            <Node>something else</Node>
          </D111>
        </D11>
      </D1>
      <D2>
        <D21>
        </D21>
      </D2>
    </D>
  </B>
</A>
Since D21 is an empty node, I want to delete D21. Since D2 is then an empty node, I want to delete D2 as well. But since D still has D1, I don't want to delete D.
Similarly, it is possible that I get:
<A>
  <B>
    <C>
    </C>
  </B>
</A>
Now, since C is empty, I want to delete C, then B, and eventually node A. I am trying to do this using the removeChild() method of Node.
But so far I have been unable to remove them recursively. Any suggestions on how to do that?
I am recursively trying to get each node and its child count, but the child count is of no help:
if (childNode.getChildNodes().getLength() == 0) {
    childNode.getParentNode().removeChild(childNode);
}
Regards
Dheeraj Joshi
Accepted answer by Adam
This works: create a recursive function that goes deep first, then removes empty nodes on the way back up the tree. This has the effect of removing both D21 and D2.
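import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Both methods go inside a class of your choosing.
public static void main(String[] args) throws Exception {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    String input = "<A><B><C><C1><C11>something</C11><C12>something</C12></C1></C><D><D1><D11><D111 operation=\"create\"><Node>something else</Node></D111></D11></D1><D2><D21></D21></D2></D></B></A>";
    Document document = builder.parse(new InputSource(new StringReader(input)));
    removeNodes(document);
    // Serialize the pruned document back to a string
    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    transformer.setOutputProperty(OutputKeys.INDENT, "yes");
    StreamResult result = new StreamResult(new StringWriter());
    transformer.transform(new DOMSource(document), result);
    System.out.println(result.getWriter().toString());
}

public static void removeNodes(Node node) {
    // Depth first: process the children before deciding about this node.
    // Caveat: getChildNodes() is a live list, so a child removed by the
    // recursive call shifts the following siblings down by one index.
    NodeList list = node.getChildNodes();
    for (int i = 0; i < list.getLength(); i++) {
        removeNodes(list.item(i));
    }
    boolean emptyElement = node.getNodeType() == Node.ELEMENT_NODE
            && node.getChildNodes().getLength() == 0;
    boolean emptyText = node.getNodeType() == Node.TEXT_NODE
            && node.getNodeValue().trim().isEmpty();
    if (emptyElement || emptyText) {
        node.getParentNode().removeChild(node);
    }
}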
Output:
<A>
<B>
<C>
<C1>
<C11>something</C11>
<C12>something</C12>
</C1>
</C>
<D>
<D1>
<D11>
<D111 operation="create">
<Node>something else</Node>
</D111>
</D11>
</D1>
</D>
</B>
</A>
Answer by fazed
I don't have enough rep to comment on @Adam's solution, but I ran into an issue where, after a node was removed, the last sibling of that node was moved to index zero, so not all empty elements were removed. The fix was to use a list to hold all of the nodes on which we want to make the recursive call.
Also, there was a bug that removed empty elements that had attributes.
Solution to both issues:
import java.util.LinkedList;
import java.util.List;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public static void removeEmptyNodes(Node node) {
    // Copy the children into a separate list first, so that removals
    // during recursion cannot shift the indexes we iterate over.
    NodeList list = node.getChildNodes();
    List<Node> nodesToRecursivelyCall = new LinkedList<>();
    for (int i = 0; i < list.getLength(); i++) {
        nodesToRecursivelyCall.add(list.item(i));
    }
    for (Node tempNode : nodesToRecursivelyCall) {
        removeEmptyNodes(tempNode);
    }
    boolean emptyElement = node.getNodeType() == Node.ELEMENT_NODE
            && node.getChildNodes().getLength() == 0;
    boolean emptyText = node.getNodeType() == Node.TEXT_NODE
            && node.getNodeValue().trim().isEmpty();
    if (emptyElement || emptyText) {
        // Keep empty elements that still carry attributes
        if (!node.hasAttributes()) {
            node.getParentNode().removeChild(node);
        }
    }
}
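The skipped siblings are most likely explained by NodeList being live: removing a child during a forward loop shortens the list, so the next sibling slides into the index that was just visited. Here is a minimal sketch of that effect; the class and method names (LiveNodeListDemo, parseRoot, removeWhileIterating, removeFromSnapshot) are made up for illustration and are not part of either answer.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class LiveNodeListDemo {
    // Hypothetical helper: parses a string and returns its root element.
    public static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    // Removes children while walking the live list forward.
    // Returns how many children survive.
    public static int removeWhileIterating(Element parent) {
        NodeList live = parent.getChildNodes();
        for (int i = 0; i < live.getLength(); i++) {
            // The list shrinks here, so the next sibling moves into index i
            parent.removeChild(live.item(i));
        }
        return parent.getChildNodes().getLength();
    }

    // Copies the children first, then removes them from the copy.
    // Returns how many children survive.
    public static int removeFromSnapshot(Element parent) {
        NodeList live = parent.getChildNodes();
        List<Node> snapshot = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            snapshot.add(live.item(i));
        }
        for (Node child : snapshot) {
            parent.removeChild(child);
        }
        return parent.getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<r><a/><b/><c/></r>";
        System.out.println(removeWhileIterating(parseRoot(xml))); // 1, <b/> was skipped
        System.out.println(removeFromSnapshot(parseRoot(xml)));   // 0
    }
}
```

Iterating the live list from the last index down to zero would be another way to avoid the skip without an extra list.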
Answer by nigi
Just work with strings:
// Requires: import java.util.regex.Pattern;
Pattern emptyValueTag = Pattern.compile("\\s*<\\w+/>");
Pattern emptyTagMultiLine = Pattern.compile("\\s*<\\w+>\\n*\\s*</\\w+>");
xml = emptyValueTag.matcher(xml).replaceAll("");
// Repeat until a pass removes nothing, so newly emptied parents are stripped too
while (xml.length() != (xml = emptyTagMultiLine.matcher(xml).replaceAll("")).length()) {
}
return xml;
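For reference, the same idea can be wrapped in a self-contained method. This is only a sketch: the class and method names (EmptyTagStripper, strip) are invented here, and the backreference \1, which requires the closing tag to match the opening tag, is a small variation on the patterns above. Like the original, it only sees tags made of word characters, so elements with attributes or namespace prefixes are left alone, and it knows nothing about comments or CDATA sections.

```java
import java.util.regex.Pattern;

public class EmptyTagStripper {
    // <tag/> preceded by optional whitespace
    private static final Pattern SELF_CLOSING = Pattern.compile("\\s*<\\w+/>");
    // <tag></tag> with only whitespace in between; \1 forces matching names
    private static final Pattern EMPTY_PAIR = Pattern.compile("\\s*<(\\w+)>\\s*</\\1>");

    public static String strip(String xml) {
        xml = SELF_CLOSING.matcher(xml).replaceAll("");
        String previous;
        do {
            // Each pass can empty a parent, so repeat until nothing changes
            previous = xml;
            xml = EMPTY_PAIR.matcher(xml).replaceAll("");
        } while (!xml.equals(previous));
        return xml;
    }

    public static void main(String[] args) {
        System.out.println(strip("<A><B><C></C></B><D>x</D></A>")); // <A><D>x</D></A>
    }
}
```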
Answer by user1516873
Call getTextContent() on the top-level element of the DOM. If the method returns an empty string or null, you can remove this node, because the node and all of its child nodes are empty. If getTextContent() returns a non-empty string, call getTextContent() on every child of the current node, and so on.
See the documentation.
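A minimal sketch of this approach follows, with some assumptions spelled out. The names TextContentPruner, isBlank, prune and parse are hypothetical. The text is trimmed before the emptiness check so that whitespace-only content counts as empty, which the answer does not state explicitly. Note that this treats an element with attributes but no text as empty, unlike fazed's answer, and that whitespace-only text nodes between elements are left in place.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class TextContentPruner {
    // Assumption: whitespace-only text counts as empty, hence trim()
    public static boolean isBlank(Element element) {
        String text = element.getTextContent();
        return text == null || text.trim().isEmpty();
    }

    // Removes every descendant element of 'element' whose text content is blank.
    public static void prune(Element element) {
        Node child = element.getFirstChild();
        while (child != null) {
            // Remember the next sibling before a possible removal detaches 'child'
            Node next = child.getNextSibling();
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                Element childElement = (Element) child;
                if (isBlank(childElement)) {
                    element.removeChild(childElement);
                } else {
                    prune(childElement);
                }
            }
            child = next;
        }
    }

    // Hypothetical helper: parses an XML string into a DOM document.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<A><B><C>x</C><D><D2><D21> </D21></D2></D></B></A>");
        Element root = doc.getDocumentElement();
        if (isBlank(root)) {
            doc.removeChild(root); // the whole tree is empty
        } else {
            prune(root);
        }
        System.out.println(doc.getElementsByTagName("D").getLength()); // 0
        System.out.println(doc.getElementsByTagName("C").getLength()); // 1
    }
}
```

Because a blank element is removed together with its whole subtree, D, D2 and D21 disappear in a single step here rather than one level at a time.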
Answer by Muenuddeen Shekh
import java.io.File;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class RemoveEmprtElement {
    public static void main(String[] args) {
        RemoveEmprtElement elementEmprtElement = new RemoveEmprtElement();
        try {
            DocumentBuilder dBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Parse the file directly; the original went through a ReadFile helper that was not shown
            Document doc = dBuilder.parse(new File("sampleXml4.xml"));
            elementEmprtElement.getEmptyNodes(doc);
            TransformerFactory tf = TransformerFactory.newInstance();
            Transformer trans = tf.newTransformer();
            StreamResult result = new StreamResult(new StringWriter());
            trans.transform(new DOMSource(doc), result);
            System.out.println(result.getWriter().toString());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Finds leaf elements via XPath, removes the empty ones, then starts over
    // so that parents emptied by a removal are picked up as well.
    private void getEmptyNodes(Document doc) {
        try {
            XPathFactory factory = XPathFactory.newInstance();
            XPath xpath = factory.newXPath();
            // Selects every element that has no child elements
            XPathExpression expr = xpath.compile("//*[not(*)]");
            Object resultNS = expr.evaluate(doc, XPathConstants.NODESET);
            NodeList nodes = (NodeList) resultNS;
            for (int i = 0; i < nodes.getLength(); i++) {
                Node node = nodes.item(i);
                boolean emptyElement = node.getNodeType() == Node.ELEMENT_NODE
                        && node.getChildNodes().getLength() == 0;
                boolean emptyText = node.getNodeType() == Node.TEXT_NODE
                        && node.getNodeValue().trim().isEmpty();
                if (emptyElement || emptyText) {
                    xmlNodeRemove(doc, findPath(node));
                    getEmptyNodes(doc);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Removes the first node matching the given XPath location.
    // Caveat: the path carries no positional index, so among same-named
    // siblings the first match is removed, which may not be the empty one.
    private void xmlNodeRemove(Document doc, String xmlNodeLocation) {
        try {
            XPathFactory factory = XPathFactory.newInstance();
            XPath xpath = factory.newXPath();
            XPathExpression expr = xpath.compile(xmlNodeLocation);
            Object resultNS = expr.evaluate(doc, XPathConstants.NODESET);
            NodeList nodes = (NodeList) resultNS;
            Node node = nodes.item(0);
            if (node != null && node.getParentNode() != null && node.getParentNode().hasChildNodes()) {
                node.getParentNode().removeChild(node);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Builds a path such as /A/B/D/D2 by walking up to the document node.
    private String findPath(Node n) {
        String path = "";
        if (n == null) {
            return path;
        } else if (n.getNodeName().equals("#document")) {
            return "";
        } else {
            path = n.getNodeName();
            path = findPath(n.getParentNode()) + "/" + path;
        }
        return path;
    }
}