Python: How to append PDF pages using PyPDF2
Original question: http://stackoverflow.com/questions/22795091/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverflow
How to append PDF pages using PyPDF2
Asked by Valentin Melnikov
Does anybody have experience merging two pages of a PDF file into one using the Python library PyPDF2?
When I try page1.mergePage(page2), the result is page2 overlaid on page1. How can I make it add page2 to the bottom of page1?
Answer by user3482598
The code posted at the following link accomplishes your objective:
Using PyPDF2 to merge files into multiple output files
I believe the trick is:
merger.append(input)
Answer by Emile Bergeron
While searching the web for a Python PDF merging solution, I noticed a general misconception about merging versus appending.
Most people call the appending action a merge, but it's not. What you're describing in your question is really the intended use of mergePage, which should be called applyPageOnTopOfAnother, but that's a little long. What you are (were) looking for is really appending two files/pages into a new file.
Appending PDF files
Use the PdfFileMerger class and its append method, which the documentation describes as:
Identical to the merge() method, but assumes you want to concatenate all pages onto the end of the file instead of specifying a position.
Here's one way to do it, taken from "pypdf Merging multiple pdf files into one pdf":
from PyPDF2 import PdfFileMerger, PdfFileReader
# ...
merger = PdfFileMerger()
# open() replaces the Python 2-only file() builtin
merger.append(PdfFileReader(open(filename1, 'rb')))
merger.append(PdfFileReader(open(filename2, 'rb')))
merger.write("document-output.pdf")
merger.close()
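The difference between append and merge quoted above can be illustrated with a minimal sketch. It assumes the PyPDF2 1.x method names, where merge(position, fileobj) inserts a document before the given page index and append(fileobj) adds it at the end. The helper names build_with_cover and demo and the file names are hypothetical, used only for illustration.

```python
def build_with_cover(merger, body_path, cover_path):
    """Queue the body, then insert the cover in front of it.

    `merger` is expected to behave like PyPDF2's PdfFileMerger
    (assumption: PyPDF2 1.x method names `append` and `merge`).
    """
    # append() always concatenates onto the end of what is already queued.
    merger.append(body_path)
    # merge() takes an explicit position; 0 means "before the first page".
    merger.merge(0, cover_path)
    return merger


def demo():
    # Hypothetical usage; PyPDF2 is imported here so the helper above
    # does not depend on the library being installed.
    from PyPDF2 import PdfFileMerger

    merger = build_with_cover(PdfFileMerger(), "body.pdf", "cover.pdf")
    merger.write("document-output.pdf")
    merger.close()
```

Calling demo() with real files should produce an output that starts with the cover pages, although the body was queued first.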
Appending specific PDF pages
To append specific pages from different PDF files, use the PdfFileWriter class with its addPage method, which the documentation describes as:
Adds a page to this PDF file. The page is usually acquired from a PdfFileReader instance.
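from PyPDF2 import PdfFileReader, PdfFileWriter
# ...
# open() replaces the Python 2-only file() builtin
file1 = PdfFileReader(open(filename1, "rb"))
file2 = PdfFileReader(open(filename2, "rb"))
output = PdfFileWriter()
output.addPage(file1.getPage(specificPageIndex))
output.addPage(file2.getPage(specificPageIndex))
# The input files must stay open until the output has been written
with open("document-output.pdf", "wb") as outputStream:
    output.write(outputStream)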
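When more than one page per file is needed, the same addPage pattern can be wrapped in a small helper. This is only a sketch: add_selected_pages and demo are hypothetical names, and the bounds check relies on getNumPages, assuming the PyPDF2 1.x reader API.

```python
def add_selected_pages(writer, reader, page_indexes):
    """Copy the pages at the given 0-based indexes from reader to writer."""
    total = reader.getNumPages()
    for index in page_indexes:
        # Fail early with a clear message instead of a library-level error.
        if not 0 <= index < total:
            raise IndexError(
                "page index %d out of range (document has %d pages)" % (index, total)
            )
        writer.addPage(reader.getPage(index))
    return writer


def demo(filename1, filename2):
    # Hypothetical usage; imported here so the helper above stays importable
    # without PyPDF2 installed.
    from PyPDF2 import PdfFileReader, PdfFileWriter

    with open(filename1, "rb") as f1, open(filename2, "rb") as f2:
        writer = PdfFileWriter()
        add_selected_pages(writer, PdfFileReader(f1), [0, 2])
        add_selected_pages(writer, PdfFileReader(f2), [1])
        # Write while the inputs are still open: pages are read lazily.
        with open("document-output.pdf", "wb") as out:
            writer.write(out)
```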
Merging two pages into one page
Using mergePage, which the documentation describes as:
Merges the content streams of two pages into one. Resource references (i.e. fonts) are maintained from both pages. The mediabox/cropbox/etc of this page are not altered. The parameter page's content stream will be added to the end of this page's content stream, meaning that it will be drawn after, or "on top" of this page.
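from PyPDF2 import PdfFileReader, PdfFileWriter
# ...
# open() replaces the Python 2-only file() builtin
file1 = PdfFileReader(open(filename1, "rb"))
file2 = PdfFileReader(open(filename2, "rb"))
output = PdfFileWriter()
page = file1.getPage(specificPageIndex)
# Draws the page from file2 on top of the page from file1 (an overlay)
page.mergePage(file2.getPage(specificPageIndex))
output.addPage(page)
with open("document-output.pdf", "wb") as outputStream:
    output.write(outputStream)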
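Since mergePage only overlays, placing one page below the other (what the question asks for) needs a taller page and a vertical shift. The following is a sketch of one possible approach, not something taken from the answers. It assumes PyPDF2 1.x, where PageObject.createBlankPage and mergeTranslatedPage are available, and it assumes unrotated pages whose media boxes start at (0, 0). The names stacked_layout and stack_pages are hypothetical.

```python
def stacked_layout(top_size, bottom_size):
    """Return (width, height, top_offset, bottom_offset) for a stacked page.

    Sizes are (width, height) tuples in PDF points. PDF coordinates start at
    the bottom-left corner, so the bottom page stays at the origin and the
    top page is shifted up by the bottom page's height.
    """
    top_w, top_h = top_size
    bottom_w, bottom_h = bottom_size
    width = max(top_w, bottom_w)
    height = top_h + bottom_h
    return width, height, (0, bottom_h), (0, 0)


def stack_pages(top_page, bottom_page):
    """Return a new page with bottom_page drawn below top_page."""
    # Assumption: PyPDF2 1.x module layout and method names.
    from PyPDF2.pdf import PageObject

    top_size = (float(top_page.mediaBox.getWidth()),
                float(top_page.mediaBox.getHeight()))
    bottom_size = (float(bottom_page.mediaBox.getWidth()),
                   float(bottom_page.mediaBox.getHeight()))
    width, height, top_offset, bottom_offset = stacked_layout(top_size, bottom_size)

    # Start from an empty page large enough to hold both inputs.
    combined = PageObject.createBlankPage(width=width, height=height)
    combined.mergeTranslatedPage(bottom_page, bottom_offset[0], bottom_offset[1])
    combined.mergeTranslatedPage(top_page, top_offset[0], top_offset[1])
    return combined
```

The page returned by stack_pages can then be passed to PdfFileWriter.addPage like any other page. Keeping the geometry in stacked_layout separate makes it easy to check the offsets without opening a PDF.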
Answer by Patrick Maupin
The pdfrw library can do this. There is a 4up example in the examples directory that places 4 input pages on every output page, and a booklet example that takes 8.5x11 input and creates 11x17 output. Disclaimer: I am the pdfrw author.
Answer by The Aelfinn
If the two PDFs do not exist on your local machine and are instead normally accessed or downloaded via a URL (e.g. http://foo/bar.pdf and http://bar/foo.pdf), we can fetch both PDFs from their remote locations and merge them in memory in one fell swoop.
This eliminates the assumed step of first downloading the PDFs to disk, and generalizes the solution beyond the simple case of both PDFs existing on disk to any HTTP-accessible PDF.
The example:
import io
import requests
from flask import make_response  # only needed for Option 1
from PyPDF2 import PdfFileMerger, PdfFileReader
pdf_content_1 = requests.get('http://foo/bar.pdf').content
pdf_content_2 = requests.get('http://bar/foo.pdf').content
# Wrap the downloaded bytes in in-memory file-like buffers.
# (StringIO.StringIO().write(...) returned None rather than a buffer,
# and PDF data is binary, so io.BytesIO is used instead.)
pdf_buffer_1 = io.BytesIO(pdf_content_1)
pdf_buffer_2 = io.BytesIO(pdf_content_2)
pdf_merged_buffer = io.BytesIO()
merger = PdfFileMerger()
merger.append(PdfFileReader(pdf_buffer_1))
merger.append(PdfFileReader(pdf_buffer_2))
merger.write(pdf_merged_buffer)
# Option 1:
# Return the content of the buffer in an HTTP response (Flask example below).
# These lines belong inside a Flask view function.
response = make_response(pdf_merged_buffer.getvalue())
# Set headers so the web browser knows to render the result as a PDF
response.headers['Content-Type'] = 'application/pdf'
response.headers['Content-Disposition'] = \
    'attachment; filename=%s.pdf' % 'Merged PDF'
# ...then `return response` from the view function
# Option 2: Write to disk (binary mode, since the buffer holds bytes)
with open("merged_pdf.pdf", "wb") as fp:
    fp.write(pdf_merged_buffer.getvalue())

