Java: Convert HTML to DOCX

Note: This page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original question: http://stackoverflow.com/questions/26297668/

Date: 2020-11-02 09:40:12  Source: igfitidea

Convert HTML to DOCX

Tags: java, docx4j

Asked by MrWayne

My question is very specific, and I hope that someone has already done this conversion from HTML to DOCX.

To do this, I took some sample code from GitHub and tried it in my local Eclipse setup.

import java.io.File;
import java.io.FileNotFoundException;

import javax.xml.bind.JAXBException;

import org.docx4j.convert.in.xhtml.XHTMLImporterImpl;
import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.docx4j.openpackaging.exceptions.InvalidFormatException;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.WordprocessingML.NumberingDefinitionsPart;

public class HtmlToDocConvert {

    /**
     * @param args
     * @throws FileNotFoundException
     * @throws JAXBException
     * @throws Docx4JException
     */
    public static void main(String[] args) throws FileNotFoundException,
            JAXBException, Docx4JException {
        // TODO Auto-generated method stub

        // File file = new File("C:\\TestWordToHtml\\html\\Test.html");

        // Backslashes must be escaped inside a Java string literal.
        String inputfilepath = "C:\\TestWordToHtml\\html\\Test.html";

        try {

            WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage
                    .createPackage();

            NumberingDefinitionsPart ndp = new NumberingDefinitionsPart();
            wordMLPackage.getMainDocumentPart().addTargetPart(ndp);
            ndp.unmarshalDefaultNumbering();

            XHTMLImporterImpl xHTMLImporter = new XHTMLImporterImpl(
                    wordMLPackage);
            xHTMLImporter.setHyperlinkStyle("Hyperlink");
            wordMLPackage.getMainDocumentPart().getContent().addAll(
                    xHTMLImporter.convert(new File(inputfilepath), null));

            File output = new java.io.File(System.getProperty("user.dir")
                    + "/html_output.docx");
            wordMLPackage.save(output);
            System.out.println("done");

            System.out.println("file path where it is stored is" + " "
                    + output.getAbsolutePath());

        }

        catch (InvalidFormatException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

    }

}

The code above fails with the following error:

Exception in thread "main" java.lang.NoSuchMethodError: org.docx4j.org.xhtmlrenderer.docx.DocxRenderer.<init>(Ljava/lang/String;)V
    at org.docx4j.convert.in.xhtml.XHTMLImporterImpl.getRenderer(XHTMLImporterImpl.java:252)
    at org.docx4j.convert.in.xhtml.XHTMLImporterImpl.convert(XHTMLImporterImpl.java:466)
    at HtmlToDocConvert.main(HtmlToDocConvert.java:41)

The JARs in my project are as follows:

docx4j-3.2.1.jar
docx4j-ImportXHTML-3.2.1.jar
slf4j-api-1.7.7.jar
slf4j-log4j12-1.7.7.jar
xhtmlrenderer-1.0.0.jar
log4j.jar

I unpacked the xhtmlrenderer jar to look at the DocxRenderer class and saw that it had no init method. I have spent close to half a day trying to figure this out, and I am not sure whether this is the correct way to do the conversion, or whether it is even possible.
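The descriptor `(Ljava/lang/String;)V` in the error appears to refer to a constructor that takes a single `String`. Rather than unpacking the jar by hand, the same check can be sketched with plain reflection. This is a minimal sketch, not part of the original question: the class name `ConstructorProbe` and its two helper methods are hypothetical, and it only reports what is on the classpath of the JVM it runs in.

```java
import java.security.CodeSource;

class ConstructorProbe {

    /** Returns true if the named class declares a constructor with exactly these parameter types. */
    public static boolean hasConstructor(String className, Class<?>... parameterTypes) {
        try {
            // initialize=false: we only need metadata, so static initializers are not run.
            Class<?> type = Class.forName(className, false, ConstructorProbe.class.getClassLoader());
            type.getDeclaredConstructor(parameterTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    /** Returns the jar or directory the class was loaded from, or a placeholder if unavailable. */
    public static String loadedFrom(String className) {
        try {
            Class<?> type = Class.forName(className, false, ConstructorProbe.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            // JDK classes typically have no code source.
            return source == null ? "(no code source)" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(class not found)";
        }
    }

    public static void main(String[] args) {
        // Class name taken from the stack trace in the question.
        String renderer = "org.docx4j.org.xhtmlrenderer.docx.DocxRenderer";
        System.out.println("loaded from: " + loadedFrom(renderer));
        System.out.println("has DocxRenderer(String): " + hasConstructor(renderer, String.class));
    }
}
```

Run with the same classpath as the failing program, this would show which jar supplies `DocxRenderer` and whether that copy has the constructor the importer is trying to call.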

If someone has done this, could they send me the correct xhtmlrenderer.jar file, or any other dependency needed to achieve this simple task?

Thanks in advance.

Regards, Bhanu

Answered by Alex Tape

This is not the complete example, is it? Just take a look at ConvertInXHTMLFile.java from the docx4j examples.

IMHO you are missing basic parts of the procedure. Furthermore, this topic has already been discussed:

Convert html to doc in java
How to convert HTML to a Microsoft Word document?
Convert HTML to Microsoft Word Document in Java
How to convert HTML to .docx using docx4j?

Answered by nanosoft

Check the code here. The API used is docx4j-ImportXHTML. The code is simple to follow: just pass your XHTML to the API as shown in the code, and it will do the rest.

Answered by Mehdi Roostaeian

I had the same problem. Replace your xhtmlrenderer-1.0.0 jar file with version 3.0.0. This is the Maven Repository link.
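After swapping the jar, it may be worth confirming that the old copy is no longer on the classpath, since a leftover jar could still supply the older class. The following is a rough sketch under that assumption, using only the standard library; the class name `ClasspathDuplicates` and the method `locationsOf` are hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ClasspathDuplicates {

    /** Lists every location visible to this class loader that contains the given class file. */
    public static List<String> locationsOf(String className) throws IOException {
        // A class is stored as a resource named after its fully qualified name.
        String resource = className.replace('.', '/') + ".class";
        ClassLoader loader = ClasspathDuplicates.class.getClassLoader();
        List<String> locations = new ArrayList<>();
        for (URL url : Collections.list(loader.getResources(resource))) {
            locations.add(url.toString());
        }
        return locations;
    }

    public static void main(String[] args) throws IOException {
        // Class name taken from the stack trace in the question.
        List<String> found = locationsOf("org.docx4j.org.xhtmlrenderer.docx.DocxRenderer");
        System.out.println(found.size() + " location(s) provide DocxRenderer:");
        for (String location : found) {
            System.out.println("  " + location);
        }
    }
}
```

If more than one location is listed, two versions of the renderer are probably on the classpath at once, and removing the older jar from the Eclipse build path would be the next thing to try.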