
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/5891866/

Date: 2020-08-04 00:49:14  Source: igfitidea

Find files and tar them (with spaces)

Tags: linux, find, backup, tar

Asked by Caleb Kester

Alright, so simple problem here. I'm working on a simple backup script. It works fine except when file names have spaces in them. This is how I'm finding files and adding them to a tar archive:

find . -type f | xargs tar -czvf backup.tar.gz 

The problem is when a file has a space in its name, because tar thinks that it's a folder. Basically, is there a way I can add quotes around the results from find? Or a different way to fix this?

Accepted answer by Steve Kehlet

Use this:

find . -type f -print0 | tar -czvf backup.tar.gz --null -T -

It will:

  • deal with files with spaces, newlines, leading dashes, and other funniness
  • handle an unlimited number of files
  • not repeatedly overwrite your backup.tar.gz, as using tar -c with xargs will do when you have a large number of files
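The points above can be tried out safely in a scratch directory. The following is only a sketch, assuming GNU find and GNU tar; the directory layout and file names are made up for illustration.

```shell
#!/bin/sh
# Sketch: archive files with awkward names via a NUL-delimited file list.
# Assumes GNU find and GNU tar; the scratch directory and names are hypothetical.
set -eu

workdir=$(mktemp -d)
mkdir -p "$workdir/src/sub dir"
printf 'a\n' > "$workdir/src/plain.txt"
printf 'b\n' > "$workdir/src/with space.txt"
printf 'c\n' > "$workdir/src/sub dir/-leading dash.txt"

# Keep the archive outside the tree being scanned so it cannot include itself.
archive="$workdir/backup.tar.gz"

# -print0 ends each name with a NUL byte; --null tells tar that the list it
# reads from stdin (-T -) is NUL-delimited rather than newline-delimited.
(cd "$workdir/src" && find . -type f -print0 | tar -czf "$archive" --null -T -)

# List the archive contents to check that every file made it in.
listing=$(tar -tzf "$archive")
printf '%s\n' "$listing"

rm -rf "$workdir"
```

Because tar is started exactly once and reads the names from its standard input, there is no argument-length limit to hit and no second tar run to overwrite the archive.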

Also see:

Answer by Warren P

Why not:

tar czvf backup.tar.gz *

Sure it's clever to use find and then xargs, but you're doing it the hard way.

Update: Porges has commented with a find option that I think is a better answer than mine, or the other one: find -print0 ... | xargs -0 ....
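One caveat with the find -print0 | xargs -0 form: xargs may run tar more than once when the list is long, and with tar -c each run would replace the archive written by the previous one. A possible workaround, sketched here under the assumption of GNU tar and GNU xargs, is to append to an uncompressed archive with -r and compress afterwards. The -n 2 is only there to force several tar runs in a small demo, and the file names are hypothetical.

```shell
#!/bin/sh
# Sketch: NUL-delimited names through xargs, appending instead of overwriting.
# Assumes GNU find, xargs and tar; names and paths are hypothetical.
set -eu

workdir=$(mktemp -d)
mkdir "$workdir/src"
for n in 1 2 3 4 5; do
    printf '%s\n' "$n" > "$workdir/src/file $n.txt"
done

archive="$workdir/backup.tar"

# -n 2 makes xargs start tar several times, imitating what happens on its own
# once the argument list gets too long. -r appends to the archive; it only
# works on an uncompressed archive, which is why -z is not used here.
(cd "$workdir/src" && find . -type f -print0 | xargs -0 -n 2 tar -rf "$archive")

# Compress once, after every batch has been appended.
gzip "$archive"

count=$(tar -tzf "$archive.gz" | wc -l)
echo "files archived: $count"

rm -rf "$workdir"
```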

Answer by gsteff

Try running:

    find . -type f | xargs -d "\n" tar -czvf backup.tar.gz 

Answer by errorprone

There could be another way to achieve what you want. Basically,

  1. Use the find command to output the paths of whatever files you're looking for. Redirect stdout to a filename of your choosing.
  2. Then run tar with the -T option, which lets it take a list of file locations (the one you just created with find!)

    find . -name "*.whatever" > yourListOfFiles
    tar -cvf yourfile.tar -T yourListOfFiles
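A newline-delimited list like this already copes with spaces in names, since tar reads one name per line. If names might also contain newlines, the same two steps can be done with a NUL-delimited list. A minimal sketch, assuming GNU find and GNU tar, with hypothetical file names in a scratch directory:

```shell
#!/bin/sh
# Sketch: the two-step "list file, then tar -T" approach with NUL delimiters.
# Assumes GNU find and GNU tar; all names here are hypothetical.
set -eu

workdir=$(mktemp -d)
mkdir "$workdir/src"
printf '1\n' > "$workdir/src/report one.whatever"
printf '2\n' > "$workdir/src/report two.whatever"
printf '3\n' > "$workdir/src/ignored.txt"

list="$workdir/yourListOfFiles"
archive="$workdir/yourfile.tar"

cd "$workdir/src"

# Step 1: write the matching paths to the list file, NUL-terminated.
find . -name '*.whatever' -type f -print0 > "$list"

# Step 2: have tar read the list; --null matches the -print0 format.
tar -cf "$archive" --null -T "$list"

names=$(tar -tf "$archive")
printf '%s\n' "$names"

cd / && rm -rf "$workdir"
```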
    

Answer by Nux

The best solution seems to be to create a file list and then archive the files, because you can use other sources and do something else with the list.

For example, this allows using the list to calculate the size of the files being archived:

#!/bin/sh

backupFileName="backup-big-$(date +"%Y%m%d-%H%M")"
backupRoot="/var/www"
backupOutPath=""

archivePath=$backupOutPath$backupFileName.tar.gz
listOfFilesPath=$backupOutPath$backupFileName.filelist

#
# Make a list of files/directories to archive
#
echo "" > $listOfFilesPath
echo "${backupRoot}/uploads" >> $listOfFilesPath
echo "${backupRoot}/extra/user/data" >> $listOfFilesPath
find "${backupRoot}/drupal_root/sites/" -name "files" -type d >> $listOfFilesPath

#
# Size calculation
#
sizeForProgress=`
cat $listOfFilesPath | while read nextFile;do
    if [ ! -z "$nextFile" ]; then
        du -sb "$nextFile"
    fi
done | awk '{size+=$1} END {print size}'
`

#
# Archive with progress
#
## simple with dump of all files currently archived
#tar -czvf $archivePath -T $listOfFilesPath
## progress bar
sizeForShow=$(($sizeForProgress/1024/1024))
printf '\nRunning backup [source files are %s MiB]\n\n' "$sizeForShow"
tar -cPp -T $listOfFilesPath | pv -s $sizeForProgress | gzip > $archivePath

Answer by Kalibur x

If you have multiple files or directories and you want to compress each of them into its own *.gz file, you can do this. The -type f and -atime tests are optional.

find -name "httpd-log*.txt" -type f -mtime +1 -exec tar -vzcf {}.gz {} \;

This will compress

httpd-log01.txt
httpd-log02.txt

to

httpd-log01.txt.gz
httpd-log02.txt.gz
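Note that the files produced by the tar command above are gzip-compressed tar archives that merely carry a .gz name. If plain gzip files are what is wanted, one option is to call gzip directly from -exec. This is only a sketch: it assumes GNU touch (for the -d date syntax) and a gzip recent enough to support -k, and it matches only the backdated file because -mtime +1 skips anything modified within the last two days.

```shell
#!/bin/sh
# Sketch: compress each matching log into its own plain .gz file.
# Assumes GNU touch and gzip with -k (keep original); names are hypothetical.
set -eu

workdir=$(mktemp -d)
cd "$workdir"
printf 'old\n' > 'httpd-log01.txt'
printf 'new\n' > 'httpd-log02.txt'

# Backdate one log so that -mtime +1 matches it.
touch -d '3 days ago' 'httpd-log01.txt'

# {} is passed to gzip as a single argument, so spaces in names are safe.
find . -name 'httpd-log*.txt' -type f -mtime +1 -exec gzip -k {} \;

result=$(ls)
content=$(gzip -dc 'httpd-log01.txt.gz')
printf '%s\n' "$result"

cd / && rm -rf "$workdir"
```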

Answer by Frank Eggink

Why not give something like this a try: tar cvf scala.tar `find src -name '*.scala'`

Answer by tommy.carstensen

Another solution as seen here:

find var/log/ -iname "anaconda.*" -exec tar -cvzf file.tar.gz {} +

Answer by user3472383

Would add a comment to @Steve Kehlet's post but need 50 rep (RIP).

For anyone that has found this post through numerous googling, I found a way to not only find specific files given a time range, but also NOT include the relative paths OR whitespaces that would cause tarring errors. (THANK YOU SO MUCH STEVE.)

find . -name "*.pdf" -type f -mtime 0 -printf "%f\0" | tar -czvf /dir/zip.tar.gz --null -T -
  1. . is the directory to search, relative to the current one

  2. -name "*.pdf": look for PDFs (or any file type)

  3. -type f: the type to look for is a file

  4. -mtime 0: look for files modified in the last 24 hours

  5. -printf "%f\0": regular -print0 OR -printf "%f" did NOT work for me. From the man pages:

This quoting is performed in the same way as for GNU ls. This is not the same quoting mechanism as the one used for -ls and -fls. If you are able to decide what format to use for the output of find then it is normally better to use '\0' as a terminator than to use newline, as file names can contain white space and newline characters.

  6. -czvf: create archive, filter the archive through gzip, verbosely list files processed, archive name
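One thing to keep in mind with %f: it prints only the file name, with leading directories removed, so tar can only find the files if they sit directly in the directory where the pipeline runs. The sketch below therefore adds -maxdepth 1, which is an assumption on my part and not in the command above; it also assumes GNU find and GNU tar, and the file names and output path are hypothetical.

```shell
#!/bin/sh
# Sketch: archive today's PDFs by bare file name, NUL-delimited.
# Assumes GNU find (-printf, -maxdepth) and GNU tar; names are hypothetical.
set -eu

workdir=$(mktemp -d)
mkdir "$workdir/docs"
printf 'x' > "$workdir/docs/annual report.pdf"
printf 'y' > "$workdir/docs/notes.txt"

cd "$workdir/docs"

# %f is the name without leading directories; \0 terminates each name with
# a NUL byte so that --null -T - can split the list reliably.
find . -maxdepth 1 -name '*.pdf' -type f -mtime 0 -printf '%f\0' |
    tar -czf "$workdir/zip.tar.gz" --null -T -

contents=$(tar -tzf "$workdir/zip.tar.gz")
printf '%s\n' "$contents"

cd / && rm -rf "$workdir"
```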

Edit 2019-08-14: I would like to add that I was also able to use essentially the same command as in my comment, just using tar itself:

tar -czvf /archiveDir/test.tar.gz --newer-mtime=0 --ignore-failed-read *.pdf

Needed --ignore-failed-read in case there were no new PDFs for today.