How to convert doc, docx files to PDF in Java programmatically
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/15636516/
Asked by user2211381
I am able to generate a PDF from a docx file using docx4j, but I need to convert a doc file to PDF, including images and tables. Is there any way to convert doc to docx in Java, or doc to PDF directly?
Answer by JasonPlutext
docx4j contains org.docx4j.convert.in.Doc, which uses POI to read the .doc, but it is a proof of concept, not production ready code. Last I checked, there were limits to POI's HWPF parsing of a binary .doc.
Further to mqchen's comment, you can use LibreOffice or OpenOffice to convert doc to docx. But if you are going to use LibreOffice or OpenOffice, you may as well use it to convert both .doc and .docx directly to PDF. Google 'jodconverter'.
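As a rough sketch of the LibreOffice route described above, the conversion can also be driven from Java by calling LibreOffice's own command line in headless mode, without jodconverter. This assumes a LibreOffice install whose `soffice` executable is on the PATH; the class and method names (`SofficeConverter`, `command`, `convert`, `pdfName`) are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Sketch: convert .doc or .docx to PDF by shelling out to LibreOffice.
// Assumption: the `soffice` executable is on the PATH.
final class SofficeConverter {

    // Builds the headless command line. LibreOffice writes the PDF into
    // outDir, named after the input file with a .pdf extension.
    static List<String> command(Path input, Path outDir) {
        return Arrays.asList(
                "soffice", "--headless",
                "--convert-to", "pdf",
                "--outdir", outDir.toString(),
                input.toString());
    }

    // Name of the PDF that corresponds to the given input file.
    static String pdfName(Path input) {
        String name = input.getFileName().toString();
        int dot = name.lastIndexOf('.');
        return (dot < 0 ? name : name.substring(0, dot)) + ".pdf";
    }

    // Runs the conversion and returns the expected path of the generated PDF.
    static Path convert(Path input, Path outDir) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(input, outDir))
                .inheritIO()
                .start();
        int exit = process.waitFor();
        if (exit != 0) {
            throw new IOException("soffice exited with code " + exit);
        }
        return outDir.resolve(pdfName(input));
    }
}
```

A call such as `SofficeConverter.convert(Paths.get("report.doc"), Paths.get("out"))` would then be expected to leave `out/report.pdf` behind. Starting a new process per file is slow for batches, which is the problem jodconverter addresses by keeping an office instance running.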
Answer by hd1
Cribbing off the POI unit tests, I came up with this to extract the text from a Word document:
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;

// Returns the plain text of a binary .doc file using POI's HWPF.
// A .doc is an OLE2 file, not a zip archive, so it is read directly.
public String getText(String document) {
    try (InputStream is = new FileInputStream(document)) {
        HWPFDocument doc = new HWPFDocument(is);
        WordExtractor extractor = new WordExtractor(doc);
        // Text only: images and table layout are not carried over.
        return extractor.getText();
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}
And then, cribbing off the PDFBox user's guide for creation:
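import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.edit.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDType1Font;

// PDFBox 1.8 API; 2.x moved PDPageContentStream to org.apache.pdfbox.pdmodel
// and renamed the two text calls below to newLineAtOffset and showText.
PDDocument pdDoc = new PDDocument();
PDPage page = new PDPage();
pdDoc.addPage(page);
PDFont font = PDType1Font.HELVETICA_BOLD;
// The content stream is opened on pdDoc, the document that owns the page.
PDPageContentStream contentStream = new PDPageContentStream(pdDoc, page);
contentStream.beginText();
contentStream.setFont(font, 12);
contentStream.moveTextPositionByAmount(100, 700);
// documentPath is the path of the .doc file to convert.
contentStream.drawString(getText(documentPath));
contentStream.endText();
contentStream.close();
pdDoc.save("foo.pdf");
pdDoc.close();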
I do hope that points you in the right direction, if not sorts you entirely.
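One gap in the approach above is that the extracted text arrives as a single long string, while a PDF text call draws one line and does no wrapping. A minimal sketch of a helper that splits the text into lines first is shown here. `TextWrapper` and `wrap` are hypothetical names, and the limit is a character count, which only approximates the rendered width of a proportional font such as Helvetica.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: break extracted text into lines short enough to draw one by one.
final class TextWrapper {

    // Splits text into lines of at most maxChars characters, breaking at
    // whitespace. Existing line breaks are kept, and a word longer than
    // maxChars is placed on a line of its own rather than being cut.
    static List<String> wrap(String text, int maxChars) {
        List<String> lines = new ArrayList<>();
        for (String paragraph : text.split("\\r?\\n|\\r")) {
            StringBuilder line = new StringBuilder();
            for (String word : paragraph.trim().split("\\s+")) {
                // Start a new line when the next word would not fit.
                if (line.length() > 0 && line.length() + 1 + word.length() > maxChars) {
                    lines.add(line.toString());
                    line.setLength(0);
                }
                if (line.length() > 0) {
                    line.append(' ');
                }
                line.append(word);
            }
            lines.add(line.toString());
        }
        return lines;
    }
}
```

Each returned line can then be drawn with its own text call, moving the text position down by the line height in between and starting a new page when the bottom margin is reached.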
Answer by Jabir
You can use jWordConvert for this.
jWordConvert is a Java library that can read and render Word documents natively to convert to PDF, to convert to images, or to print the documents automatically.
Details can be found at the following link: http://www.qoppa.com/wordconvert/