How to pretty print XML from Java?

Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me), citing the original address: StackOverflow, http://stackoverflow.com/questions/139076/

Time: 2020-08-11 08:46:54  Source: igfitidea

java xml pretty-print

Asked by Steve McLeod

I have a Java String that contains XML, with no line feeds or indentations. I would like to turn it into a String with nicely formatted XML. How do I do this?

String unformattedXml = "<tag><nested>hello</nested></tag>";
String formattedXml = new [UnknownClass]().format(unformattedXml);

Note: My input is a String. My output is a String.

(Basic) mock result:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <tag>
    <nested>hello</nested>
  </tag>
</root>

Accepted answer by Steve McLeod

Now that it's 2012 and Java can do more with XML than it used to, I'd like to add an alternative to my accepted answer. This has no dependencies outside of Java 6.

import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.StringReader;

/**
 * Pretty-prints xml, supplied as a string.
 * <p/>
 * eg.
 * <code>
 * String formattedXml = new XmlFormatter().format("<tag><nested>hello</nested></tag>");
 * </code>
 */
public class XmlFormatter {

    public String format(String xml) {

        try {
            final InputSource src = new InputSource(new StringReader(xml));
            final Node document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(src).getDocumentElement();
            final Boolean keepDeclaration = Boolean.valueOf(xml.startsWith("<?xml"));

            // May need this: System.setProperty(DOMImplementationRegistry.PROPERTY, "com.sun.org.apache.xerces.internal.dom.DOMImplementationSourceImpl");

            final DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            final DOMImplementationLS impl = (DOMImplementationLS) registry.getDOMImplementation("LS");
            final LSSerializer writer = impl.createLSSerializer();

            writer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE); // Set this to true if the output needs to be beautified.
            writer.getDomConfig().setParameter("xml-declaration", keepDeclaration); // Set this to true if the declaration is needed to be outputted.

            return writer.writeToString(document);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String unformattedXml =
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?><QueryMessage\n" +
                        "        xmlns=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message\"\n" +
                        "        xmlns:query=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/query\">\n" +
                        "    <Query>\n" +
                        "        <query:CategorySchemeWhere>\n" +
                        "   \t\t\t\t\t         <query:AgencyID>ECB\n\n\n\n</query:AgencyID>\n" +
                        "        </query:CategorySchemeWhere>\n" +
                        "    </Query>\n\n\n\n\n" +
                        "</QueryMessage>";

        System.out.println(new XmlFormatter().format(unformattedXml));
    }
}

Answer by Lorenzo Boccaccia

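import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import java.io.StringWriter;

// doc is assumed to be an already parsed org.w3c.dom.Document (or any other Node)
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
// To save to a file instead, initialize the StreamResult with a File object
StreamResult result = new StreamResult(new StringWriter());
DOMSource source = new DOMSource(doc);
transformer.transform(source, result);
String xmlString = result.getWriter().toString();
System.out.println(xmlString);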

Note: Results may vary depending on the Java version. Search for workarounds specific to your platform.

Answer by Kevin Hakanson

Since you are starting with a String, you need to convert it to a DOM object (e.g. Node) before you can use the Transformer. However, if you know your XML string is valid, and you don't want to incur the memory overhead of parsing a string into a DOM and then running a transform over the DOM to get a string back, you could just do some old-fashioned character-by-character parsing. Insert a newline and spaces after every </...> sequence, and keep an indent counter (to determine the number of spaces) that you increment for every <...> and decrement for every </...> you see.

Disclaimer - I did a cut/paste/text edit of the functions below, so they may not compile as is.

public static final Element createDOM(String strXML) 
    throws ParserConfigurationException, SAXException, IOException {

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    dbf.setValidating(true);
    DocumentBuilder db = dbf.newDocumentBuilder();
    InputSource sourceXML = new InputSource(new StringReader(strXML));
    Document xmlDoc = db.parse(sourceXML);
    Element e = xmlDoc.getDocumentElement();
    e.normalize();
    return e;
}

public static final void prettyPrint(Node xml, OutputStream out)
    throws TransformerConfigurationException, TransformerFactoryConfigurationError, TransformerException {
    Transformer tf = TransformerFactory.newInstance().newTransformer();
    tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    tf.setOutputProperty(OutputKeys.INDENT, "yes");
    tf.transform(new DOMSource(xml), new StreamResult(out));
}
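
The character-by-character idea described in this answer is not shown in code there. A minimal sketch of it might look like the following. The names NaiveXmlIndenter and indent are hypothetical, and the sketch assumes well-formed input with no comments, CDATA sections, or '>' characters inside attribute values. It also trims text content, which is wrong for whitespace-sensitive documents.

```java
final class NaiveXmlIndenter {

    // What the previously emitted token was; decides whether a line break is needed.
    private enum Kind { NONE, OPEN, CLOSE, LEAF, TEXT }

    /** Re-indents xml without building a DOM. Assumes simple, well-formed input. */
    static String indent(String xml, int spaces) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        Kind last = Kind.NONE;
        int i = 0;
        while (i < xml.length()) {
            if (xml.charAt(i) == '<') {
                int end = xml.indexOf('>', i);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated tag at index " + i);
                }
                String tag = xml.substring(i, end + 1);
                i = end + 1;
                if (tag.startsWith("</")) {
                    // Closing tag: decrement the counter; stay inline after text or an opening tag.
                    depth--;
                    if (last == Kind.CLOSE || last == Kind.LEAF) {
                        newLine(out, depth, spaces);
                    }
                    last = Kind.CLOSE;
                } else if (tag.endsWith("/>") || tag.startsWith("<?") || tag.startsWith("<!")) {
                    // Self-closing tag, declaration, or processing instruction: depth unchanged.
                    if (last != Kind.NONE) {
                        newLine(out, depth, spaces);
                    }
                    last = Kind.LEAF;
                } else {
                    // Opening tag: print at the current depth, then increment the counter.
                    if (last != Kind.NONE) {
                        newLine(out, depth, spaces);
                    }
                    depth++;
                    last = Kind.OPEN;
                }
                out.append(tag);
            } else {
                int end = xml.indexOf('<', i);
                if (end < 0) {
                    end = xml.length();
                }
                // Whitespace-only runs between tags are dropped; other text is trimmed.
                String text = xml.substring(i, end).trim();
                i = end;
                if (!text.isEmpty()) {
                    out.append(text);
                    last = Kind.TEXT;
                }
            }
        }
        return out.toString();
    }

    private static void newLine(StringBuilder out, int depth, int spaces) {
        out.append('\n');
        for (int k = 0; k < Math.max(0, depth) * spaces; k++) {
            out.append(' ');
        }
    }

    public static void main(String[] args) {
        System.out.println(indent("<tag><nested>hello</nested></tag>", 2));
    }
}
```

For the question's input with an indent of 2, this puts `<tag>` on the first line, `<nested>hello</nested>` indented by two spaces on the second, and `</tag>` on the third. It does no validation, so the DOM-based functions above remain the safer choice for untrusted input.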

Answer by Steve McLeod

Here's an answer to my own question. I combined the answers from the various results to write a class that pretty prints XML.

No guarantees on how it responds with invalid XML or large documents.

package ecb.sdw.pretty;

import org.apache.xml.serialize.OutputFormat;
import org.apache.xml.serialize.XMLSerializer;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

/**
 * Pretty-prints xml, supplied as a string.
 * <p/>
 * eg.
 * <code>
 * String formattedXml = new XmlFormatter().format("<tag><nested>hello</nested></tag>");
 * </code>
 */
public class XmlFormatter {

    public XmlFormatter() {
    }

    public String format(String unformattedXml) {
        try {
            final Document document = parseXmlFile(unformattedXml);

            OutputFormat format = new OutputFormat(document);
            format.setLineWidth(65);
            format.setIndenting(true);
            format.setIndent(2);
            Writer out = new StringWriter();
            XMLSerializer serializer = new XMLSerializer(out, format);
            serializer.serialize(document);

            return out.toString();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    private Document parseXmlFile(String in) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            InputSource is = new InputSource(new StringReader(in));
            return db.parse(is);
        } catch (ParserConfigurationException e) {
            throw new RuntimeException(e);
        } catch (SAXException e) {
            throw new RuntimeException(e);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String unformattedXml =
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?><QueryMessage\n" +
                        "        xmlns=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message\"\n" +
                        "        xmlns:query=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/query\">\n" +
                        "    <Query>\n" +
                        "        <query:CategorySchemeWhere>\n" +
                        "   \t\t\t\t\t         <query:AgencyID>ECB\n\n\n\n</query:AgencyID>\n" +
                        "        </query:CategorySchemeWhere>\n" +
                        "    </Query>\n\n\n\n\n" +
                        "</QueryMessage>";

        System.out.println(new XmlFormatter().format(unformattedXml));
    }

}

Answer by anjanb

There is a very nice command-line XML utility called xmlstarlet (http://xmlstar.sourceforge.net/) that many people use and that can do a lot of things.

You could execute this program programmatically using Runtime.exec and then read in the formatted output file. It has more options and better error reporting than a few lines of Java code can provide.

Download xmlstarlet: http://sourceforge.net/project/showfiles.php?group_id=66612&package_id=64589
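
As a rough illustration of calling an external formatter from Java, here is a sketch that uses ProcessBuilder (a more convenient front end to the same mechanism as Runtime.exec) and pipes the XML through stdin and stdout instead of files. The class and method names are hypothetical. It assumes the binary is installed on the PATH under the name xmlstarlet (some distributions install it as xml), that its fo subcommand reads from standard input when no file is given, and that the document is UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

final class ExternalFormatter {

    /** Runs command, writes input to its stdin, and returns everything it prints to stdout. */
    static String pipeThrough(List<String> command, String input)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectError(ProcessBuilder.Redirect.INHERIT) // show the tool's error messages
                .start();

        // Feed stdin on a separate thread so a large document cannot block
        // while the child is waiting for us to drain its stdout.
        Thread feeder = new Thread(() -> {
            try (OutputStream stdin = process.getOutputStream()) {
                stdin.write(input.getBytes(StandardCharsets.UTF_8));
            } catch (IOException ignored) {
                // The child exited early; the exit code check below reports it.
            }
        });
        feeder.start();

        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        try (InputStream stdout = process.getInputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = stdout.read(buffer)) != -1) {
                captured.write(buffer, 0, n);
            }
        }
        feeder.join();

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException(command.get(0) + " exited with code " + exitCode);
        }
        return new String(captured.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Assumption: "xmlstarlet fo --indent-spaces 2" formats the XML it reads from stdin. */
    static String formatWithXmlStarlet(String xml) throws IOException, InterruptedException {
        return pipeThrough(Arrays.asList("xmlstarlet", "fo", "--indent-spaces", "2"), xml);
    }
}
```

A call such as ExternalFormatter.formatWithXmlStarlet(unformattedXml) would then return the formatted document, or throw an IOException if the tool is missing or rejects the input.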

Answer by mlo55

I've pretty printed in the past using the org.dom4j.io.OutputFormat.createPrettyPrint() method.

public String prettyPrint(final String xml){  

    if (StringUtils.isBlank(xml)) {
        throw new RuntimeException("xml was null or blank in prettyPrint()");
    }

    final StringWriter sw;

    try {
        final OutputFormat format = OutputFormat.createPrettyPrint();
        final org.dom4j.Document document = DocumentHelper.parseText(xml);
        sw = new StringWriter();
        final XMLWriter writer = new XMLWriter(sw, format);
        writer.write(document);
    }
    catch (Exception e) {
        throw new RuntimeException("Error pretty printing xml:\n" + xml, e);
    }
    return sw.toString();
}

Answer by StaxMan

Regarding comment that "you must first build a DOM tree": No, you need not and should not do that.

Instead, create a StreamSource (new StreamSource(new StringReader(str))) and feed that to the identity transformer mentioned. That will use a SAX parser, and the result will be much faster. Building an intermediate tree is pure overhead in this case. Otherwise the top-ranked answer is good.
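
A sketch of that suggestion, assuming "the identity transformer mentioned" is the one returned by TransformerFactory.newInstance().newTransformer(). The class name StreamingPrettyPrinter is hypothetical, and the indent-amount property is a Xalan-style key that other implementations may ignore, so the exact indentation can vary by Java version.

```java
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.StringReader;
import java.io.StringWriter;

final class StreamingPrettyPrinter {

    /** Formats xml by streaming it through an identity transform; no DOM is built by this code. */
    static String format(String xml) throws TransformerException {
        // newTransformer() with no stylesheet yields an identity transformer.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.INDENT, "yes");
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Implementation-specific key; an assumption that the runtime honours it.
        identity.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");

        StringWriter out = new StringWriter();
        identity.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(format("<tag><nested>hello</nested></tag>"));
    }
}
```

The only difference from the DOM-based variants is the source type: the String goes straight into a StreamSource, so the parse and the serialization happen in a single pass.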

Answer by Jonik

If using a 3rd party XML library is ok, you can get away with something significantly simpler than what the currently highest-voted answers suggest.

It was stated that both input and output should be Strings, so here's a utility method that does just that, implemented with the XOM library:

import nu.xom.*;
import java.io.*;

[...]

public static String format(String xml) throws ParsingException, IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    Serializer serializer = new Serializer(out);
    serializer.setIndent(4);  // or whatever you like
    serializer.write(new Builder().build(xml, ""));
    return out.toString("UTF-8");
}

I tested that it works, and the results do not depend on your JRE version or anything like that. To see how to customise the output format to your liking, take a look at the Serializer API.

This actually came out longer than I thought: some extra lines were needed because Serializer wants an OutputStream to write to. But note that there's very little code for actual XML twiddling here.

(This answer is part of my evaluation of XOM, which was suggested as one option in my question about the best Java XML library to replace dom4j. For the record, with dom4j you could achieve this with similar ease using XMLWriter and OutputFormat. Edit: ...as demonstrated in mlo55's answer.)

Answer by dfa

A simpler solution based on this answer:

public static String prettyFormat(String input, int indent) {
    try {
        Source xmlInput = new StreamSource(new StringReader(input));
        StringWriter stringWriter = new StringWriter();
        StreamResult xmlOutput = new StreamResult(stringWriter);
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        transformerFactory.setAttribute("indent-number", indent);
        Transformer transformer = transformerFactory.newTransformer(); 
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.transform(xmlInput, xmlOutput);
        return xmlOutput.getWriter().toString();
    } catch (Exception e) {
        throw new RuntimeException(e); // simple exception handling, please review it
    }
}

public static String prettyFormat(String input) {
    return prettyFormat(input, 2);
}

Test case:

prettyFormat("<root><child>aaa</child><child/></root>");

returns:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <child>aaa</child>
  <child/>
</root>

Answer by Sandeep Phukan

Hmm... I faced something like this, and it is a known bug. Just add this OutputProperty:

transformer.setOutputProperty(OutputPropertiesFactory.S_KEY_INDENT_AMOUNT, "8");

Hope this helps ...

Answer by Mark Pope

Here's a way of doing it using dom4j:

Imports:

import org.dom4j.Document;  
import org.dom4j.DocumentHelper;  
import org.dom4j.io.OutputFormat;  
import org.dom4j.io.XMLWriter;

Code:

String xml = "<your xml='here'/>";  
Document doc = DocumentHelper.parseText(xml);  
StringWriter sw = new StringWriter();  
OutputFormat format = OutputFormat.createPrettyPrint();  
XMLWriter xw = new XMLWriter(sw, format);  
xw.write(doc);  
String result = sw.toString();