How to find and replace text in an existing PDF file with PDFTK (or another command-line application)

Notice: This page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/9871585/

Tags: bash, pdf, pdftk

Asked by Roger

I have on each page of my PDF document a line with this string:

%REPLACE%

I'd like to find this string and replace it with another one.

Does anyone know how to do this with some command line application such as PDFTK?

This person gave me an important clue; however, I'd like something more direct.

Thanks.

Answered by Dingo

You can try to modify the content of your PDF as follows:

  1. Uncompress the text streams of the PDF

    pdftk file.pdf output uncompressed.pdf uncompress
    
  2. Use sed to replace your text with another string

    sed -e "s/ORIGINALSTRING/NEWSTRING/g" <uncompressed.pdf >modified.pdf
    
  3. If this attempt was successful, re-compress the PDF with pdftk

    pdftk modified.pdf output recompressed.pdf compress
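
The three steps above can be wrapped in one small shell function. This is only a sketch under some assumptions: the function name `replace_pdf_text` and its argument order are made up for illustration, the search and replacement strings are assumed to contain no `/` or regular-expression metacharacters, and `LC_ALL=C` is set so that `grep` and `sed` treat the PDF as raw bytes. It also checks that the string is actually present after uncompressing, since that is the point where this approach most often fails.

```shell
#!/usr/bin/env bash
# Sketch only: replace_pdf_text is a hypothetical helper, not part of pdftk.
# Usage: replace_pdf_text INPUT.pdf OUTPUT.pdf OLDSTRING NEWSTRING
# Assumes OLDSTRING and NEWSTRING contain no "/" and no regex metacharacters.

replace_pdf_text() {
    local input="$1"
    local output="$2"
    local old="$3"
    local new="$4"
    local workdir

    workdir=$(mktemp -d) || return 1

    # Step 1: uncompress the streams so the page text becomes searchable.
    if ! pdftk "$input" output "$workdir/uncompressed.pdf" uncompress; then
        rm -rf "$workdir"
        return 1
    fi

    # Stop early if the string is not stored literally in the file
    # (this is what typically happens with subsetted fonts).
    if ! LC_ALL=C grep -a -q -e "$old" "$workdir/uncompressed.pdf"; then
        echo "string not found in uncompressed PDF: $old" >&2
        rm -rf "$workdir"
        return 2
    fi

    # Step 2: replace the text, treating the file as raw bytes.
    LC_ALL=C sed -e "s/$old/$new/g" \
        <"$workdir/uncompressed.pdf" >"$workdir/modified.pdf"

    # Step 3: re-compress the modified file into the final output.
    pdftk "$workdir/modified.pdf" output "$output" compress
    local status=$?

    rm -rf "$workdir"
    return "$status"
}

# Example call (file names taken from the steps above):
#   replace_pdf_text file.pdf recompressed.pdf '%REPLACE%' 'New text'
```

A return code of 2 means the string was not found as literal text, so nothing was written and it is worth inspecting the uncompressed file by hand.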
    

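Note: This approach does not work every time, mainly because of font subsetting. With a subsetted font, the text in the content stream is often not stored as plain literal characters, and the font contains only the glyphs the document already uses, so the search string may not match or the new text may not render.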
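
To get a rough idea beforehand of whether font subsetting might get in the way, it can help to list the fonts in the file. The sketch below assumes the `pdffonts` tool from poppler-utils is installed (it is not part of pdftk), and the helper name `list_subset_fonts` is made up for illustration. It relies on the PDF convention that a subsetted font's name starts with a six-letter uppercase tag followed by `+`, such as `ABCDEF+Helvetica`.

```shell
#!/usr/bin/env bash
# Sketch only: list_subset_fonts is a hypothetical helper.
# Assumes pdffonts (poppler-utils) is available on the PATH.

list_subset_fonts() {
    # Keep only the lines whose font name carries a subset tag:
    # six uppercase letters followed by "+".
    pdffonts "$1" | LC_ALL=C grep -E '^[A-Z]{6}\+'
}

# Example call:
#   list_subset_fonts file.pdf || echo "no subsetted fonts listed"
```

If this prints any fonts, the sed replacement may still work, but it is more likely that the string is not stored literally or that the new text needs glyphs the embedded font does not contain.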