Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me), citing the original StackOverflow URL: http://stackoverflow.com/questions/2494549/

Time: 2020-10-29 21:28:21  Source: igfitidea

Is there any Java library (maybe POI?) that can merge docx files?

java docx

Asked by Roman

I need to write a Java application that can merge docx files. Any suggestions?

Accepted answer by BalusC

The following Java APIs are available to handle OpenXML MS Word documents with Java:

There was one more, but I don't recall the name anymore.

As to your functional requirement: merging two documents so that the result matches what the end user would expect is technically tricky. Most APIs don't offer it directly. You'll need to extract the desired information from both documents and then build one new document from that information yourself.
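
To make the "extract the desired information" step concrete, here is a minimal sketch that uses only the JDK and none of the libraries discussed on this page. It assumes the main document part is stored at `word/document.xml` (the usual location, though strictly speaking the package relationships decide it) and that plain paragraph text is all you want; formatting, images, styles, and numbering are ignored, and paragraphs nested inside text boxes may show up twice. The class name `DocxParagraphReader` and its method names are made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads the plain text of each paragraph from a docx stream.
final class DocxParagraphReader {
    // WordprocessingML main namespace (the "w:" prefix in document.xml).
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // A docx file is a zip archive; look for the main part and parse it.
    // Assumption: the main part lives at word/document.xml.
    public static List<String> readParagraphs(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            for (ZipEntry entry; (entry = zip.getNextEntry()) != null; ) {
                if (entry.getName().equals("word/document.xml")) {
                    return paragraphsOf(zip.readAllBytes()); // reads to the end of this entry
                }
            }
        }
        throw new IOException("word/document.xml not found");
    }

    // Collects the text of every w:t element inside each w:p element.
    public static List<String> paragraphsOf(byte[] documentXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document dom = factory.newDocumentBuilder().parse(new ByteArrayInputStream(documentXml));

        List<String> result = new ArrayList<>();
        NodeList paragraphs = dom.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            NodeList texts = ((Element) paragraphs.item(i)).getElementsByTagNameNS(W_NS, "t");
            StringBuilder text = new StringBuilder();
            for (int j = 0; j < texts.getLength(); j++) {
                text.append(texts.item(j).getTextContent());
            }
            result.add(text.toString());
        }
        return result;
    }
}
```

Calling `readParagraphs` on each input and concatenating the two lists gives the raw material for a new document; writing that material back out as a valid docx is the part where a library such as POI or docx4j earns its keep.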

Answer by atott

With POI my solution is:

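import java.io.InputStream;
import java.io.OutputStream;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.xmlbeans.XmlOptions;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTBody;

// Appends the body of the second document to the first and writes the result to dest.
public static void merge(InputStream src1, InputStream src2, OutputStream dest) throws Exception {
    OPCPackage src1Package = OPCPackage.open(src1);
    OPCPackage src2Package = OPCPackage.open(src2);
    XWPFDocument src1Document = new XWPFDocument(src1Package);
    CTBody src1Body = src1Document.getDocument().getBody();
    XWPFDocument src2Document = new XWPFDocument(src2Package);
    CTBody src2Body = src2Document.getDocument().getBody();
    appendBody(src1Body, src2Body);
    src1Document.write(dest);
}

// Splices the child XML of 'append' in just before the closing tag of 'src'.
private static void appendBody(CTBody src, CTBody append) throws Exception {
    XmlOptions optionsOuter = new XmlOptions();
    optionsOuter.setSaveOuter();
    String appendString = append.xmlText(optionsOuter);
    String srcString = src.xmlText();
    // Opening tag, inner content, and closing tag of the serialized target body.
    String prefix = srcString.substring(0, srcString.indexOf(">") + 1);
    String mainPart = srcString.substring(srcString.indexOf(">") + 1, srcString.lastIndexOf("<"));
    String suffix = srcString.substring(srcString.lastIndexOf("<"));
    // Inner content of the body being appended (its outer tags are dropped).
    String addPart = appendString.substring(appendString.indexOf(">") + 1, appendString.lastIndexOf("<"));
    CTBody makeBody = CTBody.Factory.parse(prefix + mainPart + addPart + suffix);
    src.set(makeBody);
}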
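
The string handling in `appendBody` can be hard to follow, so here is a small stand-alone sketch of the same splice on two hand-written XML strings. It needs no POI on the classpath; the class name `BodySpliceDemo` and the `splice` method are invented for the illustration, and the sample XML is simplified (real body XML carries namespace declarations and usually a trailing `w:sectPr` element, which this naive splice does not treat specially).

```java
// Hypothetical demo of the substring splice used by appendBody above.
final class BodySpliceDemo {
    // Returns targetXml with the inner content of appendXml inserted
    // just before targetXml's final closing tag.
    public static String splice(String targetXml, String appendXml) {
        int targetClose = targetXml.lastIndexOf("<");
        String inner = appendXml.substring(appendXml.indexOf(">") + 1, appendXml.lastIndexOf("<"));
        return targetXml.substring(0, targetClose) + inner + targetXml.substring(targetClose);
    }

    public static void main(String[] args) {
        String first = "<w:body><w:p>one</w:p></w:body>";
        String second = "<w:body><w:p>two</w:p></w:body>";
        // prints <w:body><w:p>one</w:p><w:p>two</w:p></w:body>
        System.out.println(splice(first, second));
    }
}
```

Because the splice relies on the first `>` and the last `<`, it assumes both strings have a distinct opening and closing tag; a self-closing empty body such as `<w:body/>` would need separate handling.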

With Docx4j my solution is:

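import java.io.InputStream;
import java.io.OutputStream;
import org.apache.commons.io.IOUtils;
import org.docx4j.jaxb.Context;
import org.docx4j.openpackaging.contenttype.ContentType;
import org.docx4j.openpackaging.io.SaveToZipFile;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.PartName;
import org.docx4j.openpackaging.parts.WordprocessingML.AlternativeFormatInputPart;
import org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart;
import org.docx4j.relationships.Relationship;
import org.docx4j.wml.CTAltChunk;

public class MergeDocx {
    // Counter used to give each embedded document a unique part name.
    private static long chunkCounter = 0;
    private static final String CONTENT_TYPE = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

    // Embeds the second document into the first as an altChunk and saves the result.
    public void mergeDocx(InputStream s1, InputStream s2, OutputStream os) throws Exception {
        WordprocessingMLPackage target = WordprocessingMLPackage.load(s1);
        insertDocx(target.getMainDocumentPart(), IOUtils.toByteArray(s2));
        SaveToZipFile saver = new SaveToZipFile(target);
        saver.save(os);
    }

    private static void insertDocx(MainDocumentPart main, byte[] bytes) throws Exception {
        // Store the raw docx bytes as a new part inside the target package.
        AlternativeFormatInputPart afiPart = new AlternativeFormatInputPart(new PartName("/part" + (chunkCounter++) + ".docx"));
        afiPart.setContentType(new ContentType(CONTENT_TYPE));
        afiPart.setBinaryData(bytes);
        Relationship altChunkRel = main.addTargetPart(afiPart);

        // Reference that part from the main document body via an altChunk element.
        CTAltChunk altChunk = Context.getWmlObjectFactory().createCTAltChunk();
        altChunk.setId(altChunkRel.getId());

        main.addObject(altChunk);
    }
}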

Answer by Supun Sameera

The Aspose API is the best I have seen so far for merging Word doc or docx files, but it is neither free nor open source. If you need a free and open-source tool, there are a couple of APIs you can choose from; you can find a review of them here:

http://www.esupu.com/open-source-office-document-java-api-review/

Answer by Matt Ball

It sure looks like POI can work with docx files. Are you trying to figure out how to merge them?

How to extract plain text from a DOCX file using the new OOXML support in Apache POI 3.5?
