Java: Trying to make a simple PDF document with Apache POI

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/51330192/

Time: 2020-08-11 00:04:41  Source: igfitidea

Tags: java, apache, apache-poi

Asked by Alex Kornhauser

I see the internet is full of people complaining about Apache's PDF products, but I cannot find my particular use case. I am trying to do a simple Hello World with Apache POI. Right now my code is as follows:

public ByteArrayOutputStream export() throws IOException {
    //Blank Document
    XWPFDocument document = new XWPFDocument();

    //Buffer that will receive the converted PDF bytes
    ByteArrayOutputStream out = new ByteArrayOutputStream();

    //create table
    XWPFTable table = document.createTable();
    XWPFStyles styles = document.createStyles();
    styles.setSpellingLanguage("English");
    //create first row
    XWPFTableRow tableRowOne = table.getRow(0);
    tableRowOne.getCell(0).setText("col one, row one");
    tableRowOne.addNewTableCell().setText("col two, row one");
    tableRowOne.addNewTableCell().setText("col three, row one");

    //create second row
    XWPFTableRow tableRowTwo = table.createRow();
    tableRowTwo.getCell(0).setText("col one, row two");
    tableRowTwo.getCell(1).setText("col two, row two");
    tableRowTwo.getCell(2).setText("col three, row two");

    //create third row
    XWPFTableRow tableRowThree = table.createRow();
    tableRowThree.getCell(0).setText("col one, row three");
    tableRowThree.getCell(1).setText("col two, row three");
    tableRowThree.getCell(2).setText("col three, row three");

    PdfOptions options = PdfOptions.create();
    PdfConverter.getInstance().convert(document, out, options);
    out.close();
    return out;
}

and the code that calls this is:

    public ResponseEntity<Resource> convertToPDFPost(@ApiParam(value = "DTOs passed from the FE" ,required=true )  @Valid @RequestBody ExportEnvelopeDTO exportDtos) {

        if (exportDtos.getProdExportDTOs() != null) {
            try {
                FileOutputStream out = new FileOutputStream("/Users/kornhaus/Desktop/test.pdf");
                out.write(exporter.export().toByteArray());
                out.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
            return new ResponseEntity<Resource>(responseFile, responseHeaders, HttpStatus.OK);
        }

        return new ResponseEntity<Resource>(HttpStatus.INTERNAL_SERVER_ERROR);
    }

}

On this line here: out.write(exporter.export().toByteArray()); the code throws an exception:

org.apache.poi.xwpf.converter.core.XWPFConverterException: java.io.IOException: Unable to parse xml bean

I have no clue what is causing this or where to even look for this kind of documentation. I have been coding for over a decade and have never had such difficulty with what should be a simple Java library. Any help would be great.

Answered by Axel Richter

The main problem is that PdfOptions and PdfConverter are not part of the apache poi project. They are developed by opensagres, and the first versions were badly named org.apache.poi.xwpf.converter.pdf.PdfOptions and org.apache.poi.xwpf.converter.pdf.PdfConverter. Those old classes have not been updated since 2014 and require version 3.9 of apache poi.

But the same developers provide fr.opensagres.poi.xwpf.converter.pdf, which is much more current and works with the latest stable release, apache poi 3.17. So we should use this.

But since even those newer PdfOptions and PdfConverter are not part of the apache poi project, apache poi does not test them with its releases. As a result, the default *.docx documents created by apache poi lack some content that PdfConverter needs.

  1. There must be a styles document, even if it is empty.

  2. There must be section properties for the page having at least the page size set.

  3. Tables must have a table grid set.
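The page size in item 2 and the grid column widths in item 3 are both given in twips (twentieths of a point, 1/1440 of an inch). As a minimal sketch of that arithmetic only, the following plain-Java helper converts inches and points to the BigInteger values those setters expect. The class name TwipsSketch and the methods inchesToTwips and pointsToTwips are hypothetical names chosen for illustration; they are not part of apache poi or the opensagres converter.

```java
import java.math.BigInteger;

// Hypothetical helper for illustration; not part of apache poi.
class TwipsSketch {

    // 1 point = 20 twips and 1 inch = 72 points, so 1 inch = 1440 twips.
    static final int TWIPS_PER_POINT = 20;
    static final int TWIPS_PER_INCH = 1440;

    // Converts a length in inches to twips, rounded to the nearest twip.
    static BigInteger inchesToTwips(double inches) {
        return BigInteger.valueOf(Math.round(inches * TWIPS_PER_INCH));
    }

    // Converts a length in points to twips, rounded to the nearest twip.
    static BigInteger pointsToTwips(double points) {
        return BigInteger.valueOf(Math.round(points * TWIPS_PER_POINT));
    }

    public static void main(String[] args) {
        System.out.println(inchesToTwips(8.5)); // US Letter width:  12240
        System.out.println(inchesToTwips(11));  // US Letter height: 15840
        System.out.println(inchesToTwips(2));   // 2-inch grid column: 2880
        System.out.println(pointsToTwips(612)); // 612 pt = 8.5 in:  12240
    }
}
```

The results could then be passed to calls such as CTPageSz.setW or the grid column's setW in place of hand-computed literals.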

To fulfill these requirements we must add some extra code to our program. Unfortunately this then needs the full jar of all of the schemas, ooxml-schemas-1.3.jar, as mentioned in Faq-N10025.

And because we need to change the underlying low-level objects, the document must be written so that the underlying objects are committed. Otherwise the XWPFDocument which we hand over to the PdfConverter will be incomplete.

Example:

import java.io.*;
import java.math.BigInteger;

//needed jars: fr.opensagres.poi.xwpf.converter.core-2.0.1.jar, 
//             fr.opensagres.poi.xwpf.converter.pdf-2.0.1.jar,
//             fr.opensagres.xdocreport.itext.extension-2.0.1.jar,
//             itext-2.1.7.jar                                  
import fr.opensagres.poi.xwpf.converter.pdf.PdfOptions;
import fr.opensagres.poi.xwpf.converter.pdf.PdfConverter;

//needed jars: apache poi and its dependencies
//             and additionally: ooxml-schemas-1.3.jar 
import org.apache.poi.xwpf.usermodel.*;
import org.apache.poi.util.Units;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;

public class XWPFToPDFConverterSampleMin {

 public static void main(String[] args) throws Exception {

  XWPFDocument document = new XWPFDocument();

  // there must be a styles document, even if it is empty
  XWPFStyles styles = document.createStyles();

  // there must be section properties for the page having at least the page size set
  CTSectPr sectPr = document.getDocument().getBody().addNewSectPr();
  CTPageSz pageSz = sectPr.addNewPgSz();
  pageSz.setW(BigInteger.valueOf(12240)); //12240 Twips = 12240/20 = 612 pt = 612/72 = 8.5"
  pageSz.setH(BigInteger.valueOf(15840)); //15840 Twips = 15840/20 = 792 pt = 792/72 = 11"

  // filling the body
  XWPFParagraph paragraph = document.createParagraph();

  //create table
  XWPFTable table = document.createTable();

  //create first row
  XWPFTableRow tableRowOne = table.getRow(0);
  tableRowOne.getCell(0).setText("col one, row one");
  tableRowOne.addNewTableCell().setText("col two, row one");
  tableRowOne.addNewTableCell().setText("col three, row one");

  //create CTTblGrid for this table with widths of the 3 columns. 
  //necessary for Libreoffice/Openoffice and PdfConverter to accept the column widths.
  //values are in unit twentieths of a point (1/1440 of an inch)
  //first column = 2 inches width
  table.getCTTbl().addNewTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
  //other columns (2 in this case) also each 2 inches width
  for (int col = 1 ; col < 3; col++) {
   table.getCTTbl().getTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
  }

  //create second row
  XWPFTableRow tableRowTwo = table.createRow();
  tableRowTwo.getCell(0).setText("col one, row two");
  tableRowTwo.getCell(1).setText("col two, row two");
  tableRowTwo.getCell(2).setText("col three, row two");

  //create third row
  XWPFTableRow tableRowThree = table.createRow();
  tableRowThree.getCell(0).setText("col one, row three");
  tableRowThree.getCell(1).setText("col two, row three");
  tableRowThree.getCell(2).setText("col three, row three");

  paragraph = document.createParagraph();

  //trying picture
  XWPFRun run = paragraph.createRun();
  run.setText("The picture in line: ");
  InputStream in = new FileInputStream("samplePict.jpeg");
  run.addPicture(in, Document.PICTURE_TYPE_JPEG, "samplePict.jpeg", Units.toEMU(100), Units.toEMU(30));
  in.close();  
  run.setText(" text after the picture.");

  paragraph = document.createParagraph();

  //document must be written so underlying objects will be committed
  ByteArrayOutputStream out = new ByteArrayOutputStream();
  document.write(out);
  document.close();

  document = new XWPFDocument(new ByteArrayInputStream(out.toByteArray()));
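  PdfOptions options = PdfOptions.create();
  PdfConverter converter = (PdfConverter)PdfConverter.getInstance();
  //try-with-resources closes the PDF output stream even if the conversion fails
  try (OutputStream pdfOut = new FileOutputStream("XWPFToPDFConverterSampleMin.pdf")) {
   converter.convert(document, pdfOut, options);
  }

  document.close();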

 }
}


Using XDocReport

Another way would be to use the newest version of opensagres/xdocreport, as described in Converter only with ConverterRegistry:

import java.io.*;
import java.math.BigInteger;

//needed jars: xdocreport-2.0.1.jar, 
//             odfdom-java-0.8.7.jar,
//             itext-2.1.7.jar  
import fr.opensagres.xdocreport.converter.Options;
import fr.opensagres.xdocreport.converter.IConverter;
import fr.opensagres.xdocreport.converter.ConverterRegistry;
import fr.opensagres.xdocreport.converter.ConverterTypeTo;
import fr.opensagres.xdocreport.core.document.DocumentKind;

//needed jars: apache poi and its dependencies
//             and additionally: ooxml-schemas-1.3.jar 
import org.apache.poi.xwpf.usermodel.*;
import org.apache.poi.util.Units;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;

public class XWPFToPDFXDocReport {

 public static void main(String[] args) throws Exception {

  XWPFDocument document = new XWPFDocument();

  // there must be a styles document, even if it is empty
  XWPFStyles styles = document.createStyles();

  // there must be section properties for the page having at least the page size set
  CTSectPr sectPr = document.getDocument().getBody().addNewSectPr();
  CTPageSz pageSz = sectPr.addNewPgSz();
  pageSz.setW(BigInteger.valueOf(12240)); //12240 Twips = 12240/20 = 612 pt = 612/72 = 8.5"
  pageSz.setH(BigInteger.valueOf(15840)); //15840 Twips = 15840/20 = 792 pt = 792/72 = 11"

  // filling the body
  XWPFParagraph paragraph = document.createParagraph();

  //create table
  XWPFTable table = document.createTable();

  //create first row
  XWPFTableRow tableRowOne = table.getRow(0);
  tableRowOne.getCell(0).setText("col one, row one");
  tableRowOne.addNewTableCell().setText("col two, row one");
  tableRowOne.addNewTableCell().setText("col three, row one");

  //create CTTblGrid for this table with widths of the 3 columns. 
  //necessary for Libreoffice/Openoffice and PdfConverter to accept the column widths.
  //values are in unit twentieths of a point (1/1440 of an inch)
  //first column = 2 inches width
  table.getCTTbl().addNewTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
  //other columns (2 in this case) also each 2 inches width
  for (int col = 1 ; col < 3; col++) {
   table.getCTTbl().getTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
  }

  //create second row
  XWPFTableRow tableRowTwo = table.createRow();
  tableRowTwo.getCell(0).setText("col one, row two");
  tableRowTwo.getCell(1).setText("col two, row two");
  tableRowTwo.getCell(2).setText("col three, row two");

  //create third row
  XWPFTableRow tableRowThree = table.createRow();
  tableRowThree.getCell(0).setText("col one, row three");
  tableRowThree.getCell(1).setText("col two, row three");
  tableRowThree.getCell(2).setText("col three, row three");

  paragraph = document.createParagraph();

  //trying picture
  XWPFRun run = paragraph.createRun();
  run.setText("The picture in line: ");
  InputStream in = new FileInputStream("samplePict.jpeg");
  run.addPicture(in, Document.PICTURE_TYPE_JPEG, "samplePict.jpeg", Units.toEMU(100), Units.toEMU(30));
  in.close();  
  run.setText(" text after the picture.");

  paragraph = document.createParagraph();

  //document must be written so underlying objects will be committed
  ByteArrayOutputStream out = new ByteArrayOutputStream();
  document.write(out);
  document.close();

  // 1) Create DOCX-to-PDF options to select the right converter from the registry
  Options options = Options.getFrom(DocumentKind.DOCX).to(ConverterTypeTo.PDF);

  // 2) Get the converter from the registry
  IConverter converter = ConverterRegistry.getRegistry().getConverter(options);

  // 3) Convert DOCX 2 PDF
  InputStream docxin= new ByteArrayInputStream(out.toByteArray());
  OutputStream pdfout = new FileOutputStream(new File("XWPFToPDFXDocReport.pdf"));
  converter.convert(docxin, pdfout, options);

  docxin.close();       
  pdfout.close();       

 }
}


October 2018: This code works using apache poi 3.17. It cannot work using apache poi 4.0.0 due to changes in apache poi which have not yet been taken into account in fr.opensagres.poi.xwpf.converter or in fr.opensagres.xdocreport.converter.

February 2019: Works for me now using the newest apache poi version 4.0.1 and the newest version 2.0.2 of fr.opensagres.poi.xwpf.converter.core and its companion jars.
