Copying an XML file to write a new XML file in Java

Question

Asked by This 0ne Pr0grammer

Is it possible to use one of the XML parsers in Java to read an XML document line by line and reproduce the same document in another XML file? (In my case, I want to take only the lines from Point X to Point Y in the document and copy them.) I tried a BufferedReader and BufferedWriter in a small trial run, but the file was not output properly. Below is what I did in the trial run, but it is not what I want. Does anyone have experience with this, or any thoughts or suggestions? Thank you in advance.

JAVA CODE

public class IPDriver 
{
    public static void main(String[] args) throws IOException
    {
        BufferedReader reader = new BufferedReader(new FileReader("C:/Documents and Settings/user/workspace/Intern Project/Proposals/Converted Proposals/Extracted Items/ProposalOne/word/document.xml"));
        BufferedWriter writer = new BufferedWriter(new FileWriter("C:/Documents and Settings/user/workspace/Intern Project/Proposals/Converted Proposals/Extracted Items/ProposalOne/word/tempdocument.xml"));

        String line = null;

        while ((line = reader.readLine()) != null)
        {
            writer.write(line);
        }

        // Close to unlock.
        reader.close();
        // Close to unlock and flush to disk.
        writer.close();
    }
}

Working Java code, thanks to Ted Hopp

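import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

public class IPDriver
{
    public static void main(String[] args) throws IOException
    {
        // Read and write explicitly as UTF-8 instead of the platform default encoding.
        BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream("C:/Documents and Settings/user/workspace/Intern Project/Proposals/Converted Proposals/Extracted Items/ProposalOne/word/document.xml"), "UTF-8"));
        BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream("C:/Documents and Settings/user/workspace/Intern Project/Proposals/Converted Proposals/Extracted Items/ProposalOne/word/tempdocument.xml"), "UTF-8"));

        String line = null;

        while ((line = reader.readLine()) != null)
        {
            writer.write(line);
            // readLine() strips the line terminator, so write one back.
            writer.newLine();
        }

        // Close to unlock.
        reader.close();
        // Close to unlock and flush to disk.
        writer.close();
    }
}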

Answer 1

Answered by Ted Hopp

If your code didn't copy the file over properly, my guess is that you have a character encoding problem. Since the default encoding for XML is UTF-8 and the default encoding for FileReader is the default encoding for your platform, I suggest doing this instead:

BufferedReader reader = new BufferedReader(
    new InputStreamReader(
        new FileInputStream("...input file path..."),
        "UTF-8"
    )
);
BufferedWriter writer = new BufferedWriter(
    new OutputStreamWriter(
        new FileOutputStream("...output file path..."),
        "UTF-8"
    )
);

XML parsers will give you elements (or element events), not lines. For instance, they cannot distinguish between variations in white space:

<tag attr1="val1" attr2="val2" />

versus:

<tag attr1="val1"
     attr2="val2"
     />

If your requirements include distinguishing those two cases, then an XML parser approach would not work.
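
To illustrate the point about white space, here is a minimal sketch that feeds both forms of the tag to the JDK's built-in StAX parser and compares what comes back. The class name `TagWhitespaceDemo` and the helper `describeFirstElement` are made up for this example; only the `javax.xml.stream` calls are standard API.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class TagWhitespaceDemo {

    // Hypothetical helper: returns the name of the first element plus its
    // attributes (sorted by name), which is all the parser reports about the tag.
    public static String describeFirstElement(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    Map<String, String> attrs = new TreeMap<>();
                    for (int i = 0; i < reader.getAttributeCount(); i++) {
                        attrs.put(reader.getAttributeLocalName(i), reader.getAttributeValue(i));
                    }
                    return reader.getLocalName() + " " + attrs;
                }
            }
            return "";
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String oneLine = "<tag attr1=\"val1\" attr2=\"val2\" />";
        String multiLine = "<tag attr1=\"val1\"\n     attr2=\"val2\"\n     />";
        // Both inputs yield the same element name and attributes; the line
        // breaks inside the tag are not visible through the parser.
        System.out.println(describeFirstElement(oneLine).equals(describeFirstElement(multiLine)));
    }
}
```

The two descriptions compare equal, so a copy rebuilt from parser events cannot tell which layout the source used.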

Answer 2

Answered by StaxMan

If you just want a copy, do not make the rookie mistake of using a Reader; copy using InputStream/OutputStream instead. And even with Readers, why would you read line by line? Just read buffer-fulls of characters.

So why avoid Reader? Because it adds the overhead of decoding bytes to characters (and requires a Writer to encode the characters back to bytes), which is of no value to you. It can also introduce problems if you make another common mistake, not specifying the encoding for the Reader or Writer: they will then use the platform default encoding, which may or may not be the encoding of the file you are reading.
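
A byte-level copy along those lines might look like the sketch below. The class name `ByteCopy`, the method `copy`, and the 8192-byte buffer size are choices made for this example, not something from the question. Because no decoding takes place, the output is byte-for-byte identical to the input whatever encoding the XML uses.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ByteCopy {

    // Copies source to target as raw bytes, one buffer-full at a time.
    // The target file is created, or truncated if it already exists.
    public static void copy(Path source, Path target) throws IOException {
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(target)) {
            byte[] buffer = new byte[8192]; // arbitrary buffer size
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java ByteCopy <input file> <output file>");
            return;
        }
        copy(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

On Java 7 and later, `Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING)` should do the same job in a single call.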

Answer 3

Answered by sudocode

You could easily link a reader and writer with StAX. Using that API, you could also easily create a filter to extract just the portions of the document you want.
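
As a rough sketch of that idea, the code below connects an `XMLEventReader` to an `XMLEventWriter` and forwards only the events that belong to the first element with a given name. The class `StaxExtract`, the method `extractElement`, and the element name `keep` are hypothetical, and "Point X to Point Y" is assumed here to mean one element and everything inside it. One caveat: if the element relies on namespace prefixes declared on an ancestor (as the `w:` prefix in a Word `document.xml` does), those declarations are not copied by this sketch and would need separate handling.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

public class StaxExtract {

    // Copies the first element named elementName, with its whole subtree,
    // from in to out. Everything outside that element is skipped.
    public static void extractElement(InputStream in, OutputStream out, String elementName)
            throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out, "UTF-8");
        int depth = 0; // greater than 0 while inside the wanted element
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (depth == 0) {
                if (event.isStartElement()
                        && event.asStartElement().getName().getLocalPart().equals(elementName)) {
                    depth = 1;
                    writer.add(event);
                }
            } else {
                if (event.isStartElement()) {
                    depth++;
                } else if (event.isEndElement()) {
                    depth--;
                }
                writer.add(event);
                if (depth == 0) {
                    break; // the wanted element has just been closed
                }
            }
        }
        writer.flush();
        writer.close();
        reader.close();
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root><a>1</a><keep id=\"x\"><b>text</b></keep><c/></root>";
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        extractElement(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), out, "keep");
        // Prints only the <keep> element and its contents.
        System.out.println(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Passing streams rather than a Reader and Writer leaves encoding detection to the parser, which fits the advice in the previous answer.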
