Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/2853242/

Time: 2020-08-13 13:39:30  Source: igfitidea

Java saying XML Document Not Well Formed

Tags: java, xml, parsing, xmldocument, non-well-formed

Asked by Pyroclastic

Java's XML parser seems to be thinking that my XML document is not well formed following the root element. But I've validated it with several tools and they all disagree. It's probably an error in my code rather than in the document itself. I'd really appreciate any help you all could offer me.

Here is my Java method:

private void loadFromXMLFile(File f) throws ParserConfigurationException, IOException, SAXException {
    File file = f;
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db;
    Document doc = null;
    db = dbf.newDocumentBuilder();
    doc = db.parse(file);
    doc.getDocumentElement().normalize();
    String desc = "";
    String due = "";
    String comment = "";
    NodeList tasksList = doc.getElementsByTagName("task");
    for (int i = 0; i < tasksList.getLength(); i++) {
        NodeList attributes = tasksList.item(i).getChildNodes();
        for (int j = 0; i < attributes.getLength(); j++) {
            Node attribute = attributes.item(i);
            if (attribute.getNodeName() == "description") {
                desc = attribute.getTextContent();
            }
            if (attribute.getNodeName() == "due") {
                due = attribute.getTextContent();
            }
            if (attribute.getNodeName() == "comment") {
                comment = attribute.getTextContent();
            }
            tasks.add(new Task(desc, due, comment));
        }
        desc = "";
        due = "";
        comment = "";
    }
}

The following is the XML file I'm trying to load:

<?xml version="1.0"?>  
<tasklist>  
    <task>  
        <description>Task 1</description>  
        <due>Due date 1</due>  
        <comment>Comment 1</comment>  
        <completed>false</completed>  
    </task>  
    <task>  
        <description>Task 2</description>  
        <due>Due date 2</due>  
        <comment>Comment 2</comment>  
        <completed>false</completed>  
    </task>  
    <task>  
        <description>Task 3</description>  
        <due>Due date 3</due>  
        <comment>Comment 3</comment>  
        <completed>true</completed>  
    </task>  
</tasklist>

And here is the error message Java is throwing:

run:
[Fatal Error] tasks.xml:28:3: The markup in the document following the root element must be well-formed.
May 17, 2010 6:07:02 PM todolist.TodoListGUI <init>
SEVERE: null
org.xml.sax.SAXParseException: The markup in the document following the root element must be well-formed.
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:239)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:283)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:208)
        at todolist.TodoListGUI.loadFromXMLFile(TodoListGUI.java:199)
        at todolist.TodoListGUI.<init>(TodoListGUI.java:42)
        at todolist.Main.main(Main.java:25)
BUILD SUCCESSFUL (total time: 19 seconds)

For reference, TodoListGUI.java:199 is:

doc = db.parse(file);

If context is helpful to anyone here, I'm trying to write a simple GUI application to manage a todo list that can read and write to and from XML files defining the tasks.

Answered by EAMann

Try changing your XML declaration to:

<?xml version="1.0" encoding="UTF-8" ?>

Answered by laz

I think there may be something wrong with the actual file. When I copy your code but pass the XML to the parser as a string, it works fine (after fixing a couple of issues: attributes.item(i) should be attributes.item(j), and you need to break out of the loop when attribute == null).
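The experiment described above can be sketched roughly as follows. This is a minimal illustration, not the answerer's actual code: the class name `TaskListStringParse` and the small `Task` holder are hypothetical stand-ins (the question's `Task` class was not posted). The inner loop is bounded by `j`, which makes the `attribute == null` break unnecessary. Two further changes are assumptions about the intended behaviour: node names are compared by value rather than with `==`, and one `Task` is added per `<task>` element rather than one per child node.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TaskListStringParse {

    // Hypothetical stand-in for the question's Task class, which was not posted.
    public static final class Task {
        public final String desc;
        public final String due;
        public final String comment;

        public Task(String desc, String due, String comment) {
            this.desc = desc;
            this.due = due;
            this.comment = comment;
        }
    }

    // Parses the XML from a String instead of a File, taking the file on disk out of the picture.
    public static List<Task> parse(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));
        doc.getDocumentElement().normalize();

        List<Task> tasks = new ArrayList<>();
        NodeList taskNodes = doc.getElementsByTagName("task");
        for (int i = 0; i < taskNodes.getLength(); i++) {
            String desc = "";
            String due = "";
            String comment = "";
            NodeList children = taskNodes.item(i).getChildNodes();
            // Bounded by j, and indexing with j, so item(j) is never null here.
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                switch (child.getNodeName()) { // compares by value, unlike ==
                    case "description": desc = child.getTextContent(); break;
                    case "due":         due = child.getTextContent(); break;
                    case "comment":     comment = child.getTextContent(); break;
                    default:            break; // whitespace text nodes, <completed>, etc.
                }
            }
            tasks.add(new Task(desc, due, comment)); // once per <task>
        }
        return tasks;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<tasklist>\n"
                + "  <task><description>Task 1</description><due>Due date 1</due>"
                + "<comment>Comment 1</comment><completed>false</completed></task>\n"
                + "</tasklist>";
        for (Task t : parse(xml)) {
            System.out.println(t.desc + " | " + t.due + " | " + t.comment);
        }
    }
}
```

If the string version parses cleanly while the file version fails, the difference lies in the bytes of the file rather than in the markup as posted.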

In trying to reproduce your error, I get the same message if I add another <tasklist></tasklist> element. This is because the XML then no longer has a single root element (tasklist). Is this the problem you are seeing? Does the XML in tasks.xml have a single root element?
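A small sketch of that reproduction, assuming the default JAXP parser bundled with the JDK: the class name `MultipleRootsDemo` and the helper `isWellFormed` are made up for illustration. It parses one string with a single root and one with a second `<tasklist>` appended, and prints the position and message the parser reports for the failure.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class MultipleRootsDemo {

    // Returns true if the parser accepts the string, false if it reports a parse error.
    public static boolean isWellFormed(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            db.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXParseException e) {
            // Line and column point at where the parser gave up.
            System.out.println("line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber() + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String oneRoot = "<tasklist><task/></tasklist>";
        String twoRoots = oneRoot + "\n<tasklist></tasklist>";
        System.out.println("one root:  " + isWellFormed(oneRoot));
        System.out.println("two roots: " + isWellFormed(twoRoots));
    }
}
```

The line and column in the message are worth comparing against the real file: in the question the error is reported at tasks.xml:28:3, while the posted XML is only 21 lines long.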

Answered by ewg

For what it's worth, the Scala REPL successfully parsed your markup.

scala> val tree = <tasklist>
 | <task>
 | <description>Task 1</description>
 | <due>Due date 1</due>
 | <comment>Comment 1</comment>
 | <completed>false</completed>
 | </task>
 | <task>
 | <description>Task 2</description>
 | <due>Due date 2</due>
 | <comment>Comment 2</comment>
 | <completed>false</completed>
 | </task>
 | <task>
 | <description>Task 3</description>
 | <due>Due date 3</due>
 | <comment>Comment 3</comment>
 | <completed>true</completed>
 | </task>
 | </tasklist>
tree: scala.xml.Elem = 
<tasklist>
<task>
<description>Task 1</description>
<due>Due date 1</due>
<comment>Comment 1</comment>
<completed>false</completed>
</task>
<task>
<description>Task 2</description>
<due>Due date 2</due>
<comment>Comment 2</comment>
<completed>false</completed>
</task>
<task>
<description>Task 3</description>
<due>Due date 3</due>
<comment>Comment 3</comment>
<completed>true</completed>
</task>
</tasklist>

Answered by BalusC

org.xml.sax.SAXParseException: The markup in the document following the root element must be well-formed.

This particular exception indicates that there is more than one root element in the XML document. In other words, <tasklist> is not the only root element. To take your XML document as an example, picture it without the <tasklist> element and with three <task> elements at the root. That would cause this kind of exception.

Since the XML file you posted looks fine, the problem lies somewhere else. It looks like the parser is not reading the XML file you expect it to read. For quick debugging, add the following to the top of your method:

System.out.println(f.getAbsolutePath());

Locate the file in the disk file system and verify it.
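A slightly fuller version of that check might look like the sketch below. The class name `FileCheck` and the helper `describe` are hypothetical, and decoding the bytes as UTF-8 is an assumption about the file's encoding. It reports the resolved path, whether a regular file exists there, its size, and the exact content the parser would receive.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class FileCheck {

    // Summarises which file is actually being opened and what it contains.
    public static String describe(File f) throws IOException {
        StringBuilder sb = new StringBuilder();
        sb.append("path:   ").append(f.getAbsolutePath()).append('\n');
        sb.append("exists: ").append(f.isFile()).append('\n');
        if (f.isFile()) {
            byte[] bytes = Files.readAllBytes(f.toPath());
            sb.append("bytes:  ").append(bytes.length).append('\n');
            sb.append("---- content ----\n");
            // Assumption: the file is UTF-8; other encodings will display differently.
            sb.append(new String(bytes, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "tasks.xml" is the file name from the question; pass another path as the first argument.
        File f = new File(args.length > 0 ? args[0] : "tasks.xml");
        System.out.println(describe(f));
    }
}
```

A relative path such as "tasks.xml" is resolved against the process's working directory, which inside an IDE is often not the directory of the file being edited.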

Answered by BalusC

Also for what it's worth, here is what I get when I save your XML into a file called test.xml and run it through xmllint:

[jhr@Macintosh] [~]
xmllint test.xml
<?xml version="1.0"?>
<tasklist>  
    <task>  
        <description>Task 1</description>  
        <due>Due date 1</due>  
        <comment>Comment 1</comment>  
        <completed>false</completed>  
    </task>  
    <task>  
        <description>Task 2</description>  
        <due>Due date 2</due>  
        <comment>Comment 2</comment>  
        <completed>false</completed>  
    </task>  
    <task>  
        <description>Task 3</description>  
        <due>Due date 3</due>  
        <comment>Comment 3</comment>  
        <completed>true</completed>  
    </task>  
</tasklist>

It seems to be fine. Most likely you have some stray characters somewhere in your actual file that you can't see. Try viewing the actual file in an editor that shows non-printable characters. As someone else suggested, if this isn't an English UTF-8 machine, the file might contain Unicode characters that you can't see but the parser does. Either that, or you aren't loading the file you think you are. Step through with a debugger and check the actual contents of the file before it gets fed into the parser.
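When no such editor is at hand, a rough scan like the following can flag invisible characters. This is only a sketch: the class `StrayByteScan` and its method are invented for illustration, and it treats every byte outside printable ASCII, tab, CR and LF as suspicious, so legitimate non-ASCII text (accented letters encoded as UTF-8, for instance) will be flagged too.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class StrayByteScan {

    // Lists every byte that is not printable ASCII, tab, LF or CR, with its offset.
    public static List<String> findSuspicious(byte[] data) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < data.length; i++) {
            int b = data[i] & 0xff;
            boolean plain = (b >= 0x20 && b < 0x7f) || b == '\t' || b == '\n' || b == '\r';
            if (!plain) {
                hits.add("offset " + i + ": 0x" + String.format("%02X", b));
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // "tasks.xml" is the file name from the question; pass another path as the first argument.
        byte[] data = Files.readAllBytes(Paths.get(args.length > 0 ? args[0] : "tasks.xml"));
        List<String> hits = findSuspicious(data);
        if (hits.isEmpty()) {
            System.out.println("no suspicious bytes found");
        } else {
            hits.forEach(System.out::println);
        }
    }
}
```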

Answered by ZZ Coder

Are you sure that's everything in that file? The error is complaining that there is more markup after the current root element, so there must be something else after </tasklist>.

Sometimes, this error may be caused by non-printable characters. If you don't see anything, do a hexdump of the file.
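If a hexdump tool is not available, something along these lines can stand in for one. It is a sketch, with `TailHexDump` and `hexDump` as made-up names: it prints the last 64 bytes of the file, where anything following </tasklist> would show up, 16 bytes per row with the offset, the hex values, and a printable-ASCII column.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class TailHexDump {

    // Dumps data[from..end) as rows of 16 bytes: offset, hex values, printable ASCII.
    public static String hexDump(byte[] data, int from) {
        StringBuilder out = new StringBuilder();
        for (int row = from; row < data.length; row += 16) {
            out.append(String.format("%08x ", row));
            StringBuilder text = new StringBuilder();
            for (int i = row; i < row + 16; i++) {
                if (i < data.length) {
                    int b = data[i] & 0xff;
                    out.append(String.format(" %02x", b));
                    text.append(b >= 0x20 && b < 0x7f ? (char) b : '.');
                } else {
                    out.append("   "); // pad a short final row
                }
            }
            out.append("  |").append(text).append("|\n");
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // "tasks.xml" is the file name from the question; pass another path as the first argument.
        byte[] data = Files.readAllBytes(Paths.get(args.length > 0 ? args[0] : "tasks.xml"));
        int from = Math.max(0, data.length - 64); // only the tail of the file
        System.out.print(hexDump(data, from));
    }
}
```

Anything other than whitespace, comments, or processing instructions after the closing </tasklist> in that tail would be worth a closer look.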
