bash: merge pdf files with numerical sort

Note: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/23643274/

Time: 2020-09-18 10:26:04  Source: igfitidea

Merge pdf files with numerical sort

linux bash sorting pdf numeric

Asked by max

I am trying to write a bash script to merge all pdf files of a directory into one single pdf file. The command pdfunite *.pdf output.pdf successfully achieves this, but it merges the input documents in lexicographic order:

1.pdf
10.pdf
11.pdf
2.pdf
3.pdf
4.pdf
5.pdf
6.pdf
7.pdf
8.pdf
9.pdf

while I'd like the documents to be merged in numerical order:

1.pdf
2.pdf
3.pdf
4.pdf
5.pdf
6.pdf
7.pdf
8.pdf
9.pdf
10.pdf
11.pdf

I guess a command mixing ls -v or sort -n and pdfunite would do the trick, but I don't know how to combine them. Any idea on how I could merge pdf files with a numerical sort?

Answered by ymonad

You can embed the output of a command using $(), so you can do the following:

$ pdfunite $(ls -v *.pdf) output.pdf

or


$ pdfunite $(ls *.pdf | sort -n) output.pdf

However, note that this does not work when a filename contains special characters such as whitespace.

In that case you can do the following:

ls -v *.pdf | bash -c 'IFS=$'"'"'\n'"'"' read -d "" -ra x;pdfunite "${x[@]}" output.pdf'

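Although it looks a little complicated, it's just a combination of existing techniques: reading newline-separated lines into an array with read, passing the array as separately quoted arguments, and escaping single quotes inside a single-quoted string.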

Note that you cannot use xargs since pdfunite requires the input pdfs in the middle of its arguments. I avoided using readarray since it is not supported in older bash versions, but you can use it instead of IFS=.. read -ra .. if you have a newer bash.
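
For newer bash, the readarray variant mentioned above might look like the following. This is only a sketch under a few assumptions: bash 4 or later, file names that start with their number and contain no newlines, and a throwaway directory (demo_dir, a name made up for this example) filled with empty dummy files. It prints the resulting order rather than calling pdfunite.

```shell
#!/usr/bin/env bash
# Sketch: collect pdf names in numeric order into an array, one name per line.
# Assumes bash >= 4 (readarray) and names that start with a number.

demo_dir=$(mktemp -d)            # hypothetical scratch directory for the demo
cd "$demo_dir" || exit 1
touch 1.pdf 2.pdf 10.pdf "3 with space.pdf"

# printf emits one name per line without parsing ls; sort -n orders 2 before 10.
readarray -t pdfs < <(printf '%s\n' *.pdf | sort -n)

# Show the order; "${pdfs[@]}" keeps each name as a single argument.
printf '%s\n' "${pdfs[@]}"

# The real merge would then be:
#   pdfunite "${pdfs[@]}" output.pdf
```

Because the array is expanded as "${pdfs[@]}", a name such as "3 with space.pdf" reaches pdfunite as one argument instead of being split at the spaces.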

Answered by infoclogged

Do it in multiple steps. I am assuming you have files from 1 to 99.


 pdfunite $(find ./ -regex ".*[^0-9][0-9][^0-9].*"  | sort) out1.pdf
 pdfunite out1.pdf $(find ./ -regex ".*[^0-9]1[0-9][^0-9].*"  | sort) out2.pdf
 pdfunite out2.pdf $(find ./ -regex ".*[^0-9]2[0-9][^0-9].*"  | sort) out3.pdf

and so on.


The final file will consist of all your pdfs in numerical order.

!!! Be sure to specify the output file (out1.pdf etc.); otherwise pdfunite will overwrite the last input file !!!

Edit: Sorry, I was missing the [^0-9] in each regex. I corrected it in the commands above.
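
The "and so on" steps can be generated rather than typed out. The sketch below is a dry run that only prints the commands for files 1 to 99, following the same regex pattern as above; print_merge_steps is a hypothetical helper name, not part of pdfunite.

```shell
#!/usr/bin/env bash
# Dry run: print the stepwise pdfunite commands for files 1.pdf .. 99.pdf.
# print_merge_steps is a hypothetical helper name, not part of pdfunite.
print_merge_steps() {
  local prev="" tens step
  for tens in "" 1 2 3 4 5 6 7 8 9; do
    step=$(( ${tens:-0} + 1 ))
    # ${prev:+$prev } adds the previous output (plus a space) only after step 1.
    printf 'pdfunite %s$(find ./ -regex ".*[^0-9]%s[0-9][^0-9].*" | sort) out%d.pdf\n' \
      "${prev:+$prev }" "$tens" "$step"
    prev="out${step}.pdf"
  done
}

print_merge_steps
```

Review the printed commands before running them. Intermediate files such as out1.pdf also match these patterns, so rerunning the steps in the same directory would pick them up as inputs.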