Pretty-printing XML in Java 8

Question

Asked by Hungry

I have an XML file stored as a DOM Document and I would like to pretty-print it to the console, preferably without using an external library. I am aware that this question has been asked multiple times on this site, but none of the previous answers have worked for me. I am using Java 8, so perhaps this is where my code differs from previous questions? I have also tried to set the transformer manually using code found on the web, but this just caused a "not found" error.

Here is my code, which currently outputs each XML element on a new line, flush against the left edge of the console, with no indentation.

import java.io.*;
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;


public class Test {
    public Test(){
        try {
            //java.lang.System.setProperty("javax.xml.transform.TransformerFactory", "org.apache.xalan.xsltc.trax.TransformerFactoryImpl");

            DocumentBuilderFactory dbFactory;
            DocumentBuilder dBuilder;
            Document original = null;
            try {
                dbFactory = DocumentBuilderFactory.newInstance();
                dBuilder = dbFactory.newDocumentBuilder();
                original = dBuilder.parse(new InputSource(new InputStreamReader(new FileInputStream("xml Store - Copy.xml"))));
            } catch (SAXException | IOException | ParserConfigurationException e) {
                e.printStackTrace();
            }
            StringWriter stringWriter = new StringWriter();
            StreamResult xmlOutput = new StreamResult(stringWriter);
            TransformerFactory tf = TransformerFactory.newInstance();
            //tf.setAttribute("indent-number", 2);
            Transformer transformer = tf.newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            transformer.transform(new DOMSource(original), xmlOutput);
            java.lang.System.out.println(xmlOutput.getWriter().toString());
        } catch (Exception ex) {
            throw new RuntimeException("Error converting to String", ex);
        }
    }

    public static void main(String[] args){
        new Test();
    }

}

Answer 1

Accepted answer by Aldo

I guess that the problem is related to blank text nodes (i.e. text nodes containing only whitespace) in the original file. You should try to remove them programmatically just after parsing, using the following code. If you don't remove them, the Transformer is going to preserve them.

original.getDocumentElement().normalize();
XPathExpression xpath = XPathFactory.newInstance().newXPath().compile("//text()[normalize-space(.) = '']");
NodeList blankTextNodes = (NodeList) xpath.evaluate(original, XPathConstants.NODESET);

for (int i = 0; i < blankTextNodes.getLength(); i++) {
     blankTextNodes.item(i).getParentNode().removeChild(blankTextNodes.item(i));
}
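
To make the effect of that snippet easier to check in isolation, here is a minimal self-contained sketch. The class name `BlankTextNodes` and its two helper methods are hypothetical names chosen for illustration; only the XPath expression comes from the answer above. It parses a small XML string and reports how many whitespace-only text nodes were removed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original answer.
final class BlankTextNodes {

    // Parses an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Removes every text node that contains only whitespace
    // and returns how many nodes were removed.
    static int removeBlankTextNodes(Document doc) {
        try {
            NodeList blank = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//text()[normalize-space(.) = '']", doc, XPathConstants.NODESET);
            int count = blank.getLength();
            for (int i = 0; i < count; i++) {
                Node node = blank.item(i);
                node.getParentNode().removeChild(node);
            }
            return count;
        } catch (Exception e) {
            throw new IllegalStateException("Could not remove blank text nodes", e);
        }
    }

    public static void main(String[] args) {
        // The line breaks and spaces around <b> become whitespace-only text nodes.
        Document doc = parse("<a>\n  <b>x</b>\n</a>");
        System.out.println("removed: " + removeBlankTextNodes(doc));
        System.out.println("removed on second pass: " + removeBlankTextNodes(doc));
    }
}
```

Once the blank nodes are gone, the document can be handed to the Transformer from the question, which then has no preserved whitespace to work around.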

Answer 2

Answer by iCrazybest

The XML file is created like this:

new FileInputStream("xml Store - Copy.xml"); // the resulting XML file format is incorrect!

Instead, parse the content of the given input source as an XML document and return a new DOM object:

Document original = null;
...
original = dBuilder.parse("data.xml"); // input source as an XML document

Answer 3

Answer by Tom

This works on Java 8:

public static void main (String[] args) throws Exception {
    String xmlString = "<hello><from>ME</from></hello>";
    DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
    Document document = documentBuilder.parse(new InputSource(new StringReader(xmlString)));
    pretty(document, System.out, 2);
}

private static void pretty(Document document, OutputStream outputStream, int indent) throws Exception {
    TransformerFactory transformerFactory = TransformerFactory.newInstance();
    Transformer transformer = transformerFactory.newTransformer();
    transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    if (indent > 0) {
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", Integer.toString(indent));
    }
    Result result = new StreamResult(outputStream);
    Source source = new DOMSource(document);
    transformer.transform(source, result);
}

Answer 4

Answer by ThomasRS

I've written a simple class for removing whitespace in documents; it supports the command line and does not use DOM or XPath.

Edit: Come to think of it, the project also contains a pretty-printer which handles existing whitespace:

PrettyPrinter prettyPrinter = PrettyPrinterBuilder.newPrettyPrinter().ignoreWhitespace().build();

Answer 5

Answer by Stephan

In reply to Espinosa's comment, here is a solution for when "the original xml is not already (partially) indented or contain new lines".

Background

Excerpt from the article (see References below) that inspired this solution:

Based on the DOM specification, whitespaces outside the tags are perfectly valid and they are properly preserved. To remove them, we can use XPath's normalize-space to locate all the whitespace nodes and remove them first.

Java code

public static String toPrettyString(String xml, int indent) {
    try {
        // Turn xml string into a document
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new ByteArrayInputStream(xml.getBytes("utf-8"))));

        // Remove whitespaces outside tags
        document.normalize();
        XPath xPath = XPathFactory.newInstance().newXPath();
        NodeList nodeList = (NodeList) xPath.evaluate("//text()[normalize-space()='']",
                                                      document,
                                                      XPathConstants.NODESET);

        for (int i = 0; i < nodeList.getLength(); ++i) {
            Node node = nodeList.item(i);
            node.getParentNode().removeChild(node);
        }

        // Setup pretty print options
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        transformerFactory.setAttribute("indent-number", indent);
        Transformer transformer = transformerFactory.newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");

        // Return pretty print xml string
        StringWriter stringWriter = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(stringWriter));
        return stringWriter.toString();
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

Sample usage

String xml = "<root>" + //
             "\n   "  + //
             "\n<name>Coco Puff</name>" + //
             "\n        <total>10</total>    </root>";

System.out.println(toPrettyString(xml, 4));

Output

<root>
    <name>Coco Puff</name>
    <total>10</total>
</root>

References

Java: Properly Indenting XML String, published on MyShittyCode
Save new XML node to file

Answer 6

Answer by Andrew

I didn't like any of the common XML formatting solutions because they all collapse runs of more than one consecutive newline character (for some reason, removing spaces/tabs and removing newline characters are inseparable...). Here's my solution, which was actually made for XHTML but should do the job with XML as well:

public String GenerateTabs(int tabLevel) {
  char[] tabs = new char[tabLevel * 2];
  Arrays.fill(tabs, ' ');

  //Or:
  //char[] tabs = new char[tabLevel];
  //Arrays.fill(tabs, '\t');

  return new String(tabs);
}

public String FormatXHTMLCode(String code) {
  // Split on new lines.
  String[] splitLines = code.split("\n", 0);

  int tabLevel = 0;

  // Go through each line.
  for (int lineNum = 0; lineNum < splitLines.length; ++lineNum) {
    String currentLine = splitLines[lineNum];

    if (currentLine.trim().isEmpty()) {
      splitLines[lineNum] = "";
    } else if (currentLine.matches(".*<[^/!][^<>]+?(?<!/)>?")) {
      splitLines[lineNum] = GenerateTabs(tabLevel) + splitLines[lineNum];

      ++tabLevel;
    } else if (currentLine.matches(".*</[^<>]+?>")) {
      --tabLevel;

      if (tabLevel < 0) {
        tabLevel = 0;
      }

      splitLines[lineNum] = GenerateTabs(tabLevel) + splitLines[lineNum];
    } else if (currentLine.matches("[^<>]*?/>")) {
      splitLines[lineNum] = GenerateTabs(tabLevel) + splitLines[lineNum];

      --tabLevel;

      if (tabLevel < 0) {
        tabLevel = 0;
      }
    } else {
      splitLines[lineNum] = GenerateTabs(tabLevel) + splitLines[lineNum];
    }
  }

  return String.join("\n", splitLines);
}

It makes one assumption: that there are no < or > characters other than those that make up the XML/XHTML tags.

Answer 7

Answer by Valentyn Kolesnikov

Underscore-java has a static method U.formatXml(string). I am the maintainer of the project.

import com.github.underscore.lodash.U;

public class MyClass {
    public static void main(String args[]) {
        String xml = "<root>" + //
             "\n   "  + //
             "\n<name>Coco Puff</name>" + //
             "\n        <total>10</total>    </root>";

        System.out.println(U.formatXml(xml));
    }
}

Output:

<root>
   <name>Coco Puff</name>
   <total>10</total>
</root>
