php pdftk: splitting a PDF into multiple pages increases the total size

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/19991144/

Date: 2020-08-25 20:19:55  Source: igfitidea

pdftk split pdf with multiple pages but total size grew

php pdf split pdftk

Asked by Simone M

With PHP, I have to split a single multi-page PDF file into many PDF files, one page per file. I use pdftk and it works fine, but the PDF created for each page is very large. My original PDF is 7MB (70 pages), and the files created by splitting it with pdftk add up to over 70MB.

Does anyone know if there is an option to set in pdftk to get smaller output files?

Answered by pobrelkey

You could always specify the compress option, for example:

pdftk input.pdf burst output output_%02d.pdf compress

Note that pdftk just copies the content of your PDF files from the inputs into the outputs, and can't do very much to optimize away bloat. So if your input PDFs are large or complicated, your output PDFs will be too. Also note that any fonts embedded in the document may end up being duplicated in each output document, taking up more space.

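To check whether compress actually makes a difference on a particular file, the combined size of the burst output can be compared with the size of the original. This is only a minimal sketch, assuming a POSIX shell; total_bytes is a hypothetical helper name introduced here, not part of pdftk, and the file names in the example are the ones from the command above.

```shell
#!/bin/sh
# total_bytes FILE...: print the combined size in bytes of the given files.
# (Hypothetical helper for illustration; not a pdftk feature.)
total_bytes() {
    total=0
    for f in "$@"; do
        [ -f "$f" ] || continue      # skip names that do not exist
        size=$(wc -c < "$f")         # size of one file in bytes
        total=$((total + size))
    done
    echo "$total"
}

# Example, assuming pdftk is installed and input.pdf exists:
#   pdftk input.pdf burst output output_%02d.pdf compress
#   echo "original: $(total_bytes input.pdf) bytes"
#   echo "pieces:   $(total_bytes output_*.pdf) bytes"
```

If the two totals are still far apart with compress enabled, the growth most likely comes from shared resources such as fonts being copied into every page file, as described above.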

Answered by Kungfu_panda

You may use pdftk and try:

pdftk source.pdf cat 1-100 output try1.pdf
pdftk source.pdf cat 101-end output try2.pdf
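The same cat operation can also select a single page, so a one-page-per-file split could be scripted as a loop. The sketch below only prints the commands rather than running them; print_page_commands and the page_NN.pdf naming are hypothetical choices for illustration, and the page count is assumed to be known (for example from the NumberOfPages line of pdftk's dump_data output).

```shell
#!/bin/sh
# print_page_commands SRC PAGES: print (do not run) one pdftk command per page.
# Hypothetical helper; assumes SRC contains no spaces or shell metacharacters.
print_page_commands() {
    src=$1
    pages=$2
    n=1
    while [ "$n" -le "$pages" ]; do
        # "cat N" selects page N only; the output name is zero-padded to 2 digits
        printf 'pdftk %s cat %d output page_%02d.pdf\n' "$src" "$n" "$n"
        n=$((n + 1))
    done
}

# Example: review the commands first, then pipe them to sh to execute them.
#   print_page_commands source.pdf 70
#   print_page_commands source.pdf 70 | sh
```

Printing first makes it easy to inspect what would run. Whether per-page cat output ends up smaller than burst output depends on the file, so it is worth measuring both.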

Answered by johnwhitington

When splitting PDF files, it's sometimes hard to avoid including in every output file information that only some of the pages require.

cpdf tries hard to avoid this, so you can try it and see what happens. You might find it's no better than pdftk on your file, but it should be.

Disclosure: I am the author of cpdf.

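For anyone who wants to try this, a one-page-per-file split with cpdf might look like the sketch below. It assumes the -split operation and the %%% page-number placeholder as described in the cpdf manual; split_with_cpdf and the page%%%.pdf naming are hypothetical and not taken from the answer.

```shell
#!/bin/sh
# split_with_cpdf SRC OUTDIR: split SRC into one-page PDFs inside OUTDIR.
# Hypothetical wrapper; returns 127 when cpdf is not on the PATH.
split_with_cpdf() {
    src=$1
    outdir=$2
    command -v cpdf >/dev/null 2>&1 || {
        echo "cpdf not found on PATH" >&2
        return 127
    }
    mkdir -p "$outdir" || return 1
    # Each % in the output name stands for one digit of the page number
    cpdf -split "$src" -o "$outdir/page%%%.pdf"
}

# Example, assuming cpdf is installed and input.pdf exists:
#   split_with_cpdf input.pdf pages
```

Comparing the combined size of the resulting files against the pdftk output for the same input shows whether cpdf avoids the duplication on that particular file.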

Answered by Hans-J.

I had a similar problem, though it does not apply 1:1 to the question. Anyway, somebody might find it useful:

  1. I had a very big PDF file, original.pdf, of more than 240MB. It was almost impossible to use. I printed it to PDF with evince and removed any scaling in the printer setup. This generated a file, new.pdf, of around 102MB! Obviously all the embedded fonts, bookmarks and so on were removed.
  2. To get the bookmarks back, I used cpdf to extract the bookmarks from the original PDF document and applied them to the new one. The resulting document, result.pdf, is easy to navigate and very quick in any PDF viewer.

Reference: cpdf to extract and apply bookmarks: http://www.coherentpdf.com/cpdfmanual/node38.html

cpdf -list-bookmarks original.pdf > bookmarks.txt
cpdf -add-bookmarks bookmarks.txt new.pdf -o result.pdf