Note: This page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors. Original question on Stack Overflow: http://stackoverflow.com/questions/29371129/

Date: 2020-11-02 15:07:51  Source: igfitidea

Java pdfBox: Fill out pdf form, append it to pddocument, and repeat

Tags: java, pdf, pdfbox, pdf-form

Asked by Andrew

I have created a PDF form and I'm trying to use PDFBox to fill in the form and print the document. I got it working well for one-page print jobs, but I had to modify it to handle multiple pages. Basically, it's a form with basic info at the top and a list of contents. If the contents are larger than what the form has room for, I have to make it a multi-page document. I end up with a document with a nice page one, but all the remaining pages are the blank template. What am I doing wrong?

PDDocument finalDoc = new PDDocument();
File template = new File("path/to/template.pdf");

//Declare basic info to be put on every page
String name = "John Smith";
String phoneNum = "555-555-5555";
//Get list of contents for each page
List<List<Map<String, String>>> pageContents = methodThatReturnsMyInfo();

for (List<Map<String, String>> content : pageContents) {
    PDDocument doc = PDDocument.load(template); // load(...) is a static method
    PDDocumentCatalog docCatalog = doc.getDocumentCatalog();
    PDAcroForm acroForm = docCatalog.getAcroForm();

    acroForm.getField("name").setValue(name);
    acroForm.getField("phoneNum").setValue(phoneNum);

    for (int i=0; i<content.size(); i++) {
        acroForm.getField("qty"+i).setValue(content.get(i).get("qty"));
        acroForm.getField("desc"+i).setValue(content.get(i).get("desc"));
    }

    List<PDPage> pages = docCatalog.getAllPages();
    finalDoc.addPage(pages.get(0));
}

//Then prints/saves finalDoc
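
The question does not show methodThatReturnsMyInfo(). As a rough sketch only, this is one way rows could be split into page-sized chunks when the list is longer than the form has room for. The class name PageSplitter, the method split, and the rows-per-page value are hypothetical and not part of the original code; the sketch uses plain Java collections and does not depend on PDFBox.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PageSplitter {

    // Splits rows into consecutive chunks of at most rowsPerPage entries.
    // (Hypothetical helper; the original methodThatReturnsMyInfo() is not shown.)
    public static <T> List<List<T>> split(List<T> rows, int rowsPerPage) {
        if (rowsPerPage <= 0) {
            throw new IllegalArgumentException("rowsPerPage must be positive");
        }
        List<List<T>> pages = new ArrayList<>();
        for (int start = 0; start < rows.size(); start += rowsPerPage) {
            int end = Math.min(start + rowsPerPage, rows.size());
            // Copy the sub-list so each page is independent of the source list.
            pages.add(new ArrayList<>(rows.subList(start, end)));
        }
        return pages;
    }

    public static void main(String[] args) {
        // Five rows with room for two rows per page give three pages.
        List<List<String>> pages = split(Arrays.asList("a", "b", "c", "d", "e"), 2);
        System.out.println(pages); // prints [[a, b], [c, d], [e]]
    }
}
```

Each inner list would then correspond to one copy of the template, with the row index inside the chunk selecting the qty and desc fields.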

Answered by mkl

There are two major issues in your code:

  • The AcroForm element of a PDF is a document-level object. You only copy the filled-in template page into finalDoc. Thus, the form fields are added to finalDoc only as annotations of their respective page, but they are not added to the AcroForm of finalDoc.

    This is not apparent in Adobe Reader but form filling services often identify available fields from the document level AcroForm entry and don't search the pages for additional form fields.

  • The actual show stopper: You add fields with identical names to the PDF. But PDF forms are document-wide entities, i.e. there can be only a single field entity with a given name in a PDF. (This field entity may have multiple visualizations, aka widgets, but this requires you to construct a single field object with multiple kid widgets. Furthermore, these widgets are expected to display the same value, which is not what you want...)

    Thus, you have to rename the fields uniquely before adding them to the finalDoc.

Here is a simplified example which works on a template with only one field, "SampleField":

byte[] template = generateSimpleTemplate();
Files.write(new File(RESULT_FOLDER,  "template.pdf").toPath(), template);

try (   PDDocument finalDoc = new PDDocument(); )
{
    List<PDField> fields = new ArrayList<PDField>();
    int i = 0;

    for (String value : new String[]{"eins", "zwei"})
    {
        PDDocument doc = PDDocument.load(new ByteArrayInputStream(template));
        PDDocumentCatalog docCatalog = doc.getDocumentCatalog();
        PDAcroForm acroForm = docCatalog.getAcroForm();
        PDField field = acroForm.getField("SampleField");
        field.setValue(value);
        field.setPartialName("SampleField" + i++);
        List<PDPage> pages = docCatalog.getAllPages();
        finalDoc.addPage(pages.get(0));
        fields.add(field);
    }

    PDAcroForm finalForm = new PDAcroForm(finalDoc);
    finalDoc.getDocumentCatalog().setAcroForm(finalForm);
    finalForm.setFields(fields);

    finalDoc.save(new File(RESULT_FOLDER, "form-two-templates.pdf"));
}

As you can see, all fields are renamed before they are added to finalForm:

field.setPartialName("SampleField" + i++);

and they are collected in the list fields, which is finally added to the finalForm AcroForm:

    fields.add(field);
}
...
finalForm.setFields(fields);
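
The example above renames a single field. The template in the question has several fields per page (name, phoneNum, qty0, desc0, and so on), so every field on every page would need its own name. The following is a minimal sketch of one possible naming scheme, written in plain Java without PDFBox. The class FieldRenamer, its methods, and the "_p" plus page index suffix are assumptions made for illustration, not part of the original answer.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class FieldRenamer {

    // Assumed naming scheme: template field name + "_p" + zero-based page index.
    public static String uniqueName(String templateName, int pageIndex) {
        return templateName + "_p" + pageIndex;
    }

    // Maps every template field name to the name it would get on the given page.
    public static Map<String, String> renamesForPage(List<String> templateNames, int pageIndex) {
        Map<String, String> renames = new LinkedHashMap<>();
        for (String name : templateNames) {
            renames.put(name, uniqueName(name, pageIndex));
        }
        return renames;
    }

    public static void main(String[] args) {
        // Field names taken from the question; a real template may have more rows.
        List<String> templateNames = Arrays.asList("name", "phoneNum", "qty0", "desc0");
        int pageCount = 3;
        Set<String> allNames = new HashSet<>();
        for (int page = 0; page < pageCount; page++) {
            Map<String, String> renames = renamesForPage(templateNames, page);
            System.out.println("page " + page + ": " + renames);
            allNames.addAll(renames.values());
        }
        // True when every field on every page ended up with a distinct name.
        System.out.println(allNames.size() == templateNames.size() * pageCount);
    }
}
```

With PDFBox, each new name would be passed to field.setPartialName(...) after the value is set and before the field is added to the fields list, in the same way the answer's code handles "SampleField".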