Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original address and author information, and attribute the content to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/25785/

Time: 2020-09-09 17:34:32  Source: igfitidea

Delete all but the most recent X files in bash

bash unix scripting

Asked by Matt Sheppard

Is there a simple way, in a pretty standard UNIX environment with bash, to run a command to delete all but the most recent X files from a directory?

To give a bit more of a concrete example, imagine some cron job writing out a file (say, a log file or a tar-ed up backup) to a directory every hour. I'd like a way to have another cron job running which would remove the oldest files in that directory until there are less than, say, 5.

And just to be clear: if there's only one file present, it should never be deleted.

Accepted answer by mklement0

The problems with the existing answers:

  • inability to handle filenames with embedded spaces or newlines.
    • in the case of solutions that invoke rm directly on an unquoted command substitution (rm `...`), there's an added risk of unintended globbing.
  • inability to distinguish between files and directories (i.e., if directories happened to be among the 5 most recently modified filesystem items, you'd effectively retain fewer than 5 files, and applying rm to directories will fail).

wnoise's answer addresses these issues, but the solution is GNU-specific (and quite complex).

Here's a pragmatic, POSIX-compliant solution that comes with only one caveat: it cannot handle filenames with embedded newlines, but I don't consider that a real-world concern for most people.

For the record, here's the explanation for why it's generally not a good idea to parse ls output: http://mywiki.wooledge.org/ParsingLs

ls -tp | grep -v '/$' | tail -n +6 | xargs -I {} rm -- {}

The above is inefficient, because xargs has to invoke rm once for each filename.
Your platform's xargs may allow you to solve this problem:

If you have GNU xargs, use -d '\n', which makes xargs consider each input line a separate argument, yet passes as many arguments as will fit on a command line at once:

ls -tp | grep -v '/$' | tail -n +6 | xargs -d '\n' -r rm --

-r (--no-run-if-empty) ensures that rm is not invoked if there's no input.

If you have BSD xargs (including on macOS), you can use -0 to handle NUL-separated input, after first translating newlines to NUL (0x0) chars, which also passes (typically) all filenames at once (this will also work with GNU xargs):

ls -tp | grep -v '/$' | tail -n +6 | tr '\n' '\0' | xargs -0 rm --
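
To see what the NUL translation buys, here is a small, harmless sketch that is not part of the original answer. It feeds two made-up names containing spaces to printf instead of rm, once through plain xargs and once through tr plus xargs -0; the variable names split and whole are only for illustration.

```shell
# Plain xargs splits its input on any whitespace, so "a b" becomes two arguments.
split=$(printf 'a b\nc d\n' | xargs printf '<%s>\n')

# With newlines translated to NULs and xargs -0, each input line stays one argument.
whole=$(printf 'a b\nc d\n' | tr '\n' '\0' | xargs -0 printf '<%s>\n')

printf '%s\n' "$split" "---" "$whole"
```

The first form yields four separate arguments, the second only two, which is why the NUL-separated variant is safe for filenames with embedded spaces.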

Explanation:

  • ls -tp prints the names of filesystem items sorted by how recently they were modified, in descending order (most recently modified items first) (-t), with directories printed with a trailing / to mark them as such (-p).
  • grep -v '/$' then weeds out directories from the resulting listing, by omitting (-v) lines that have a trailing / (/$).
    • Caveat: Since a symlink that points to a directory is technically not itself a directory, such symlinks will not be excluded.
  • tail -n +6 skips the first 5 entries in the listing, in effect returning all but the 5 most recently modified files, if any.
    Note that in order to exclude N files, N+1 must be passed to tail -n +.
  • xargs -I {} rm -- {} (and its variations) then invokes rm on all these files; if there are no matches at all, xargs won't do anything.
    • xargs -I {} rm -- {} defines placeholder {} that represents each input line as a whole, so rm is then invoked once for each input line, but with filenames with embedded spaces handled correctly.
    • -- in all cases ensures that any filenames that happen to start with - aren't mistaken for options by rm.
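
The steps above can be wrapped in a small helper. The following is only a sketch under stated assumptions, not code from the answer: the function name keep_newest and its two arguments (directory, number of files to keep) are hypothetical, and it deletes via a read loop rather than xargs so that it stays within POSIX sh. Like the pipeline it wraps, it does not support filenames with embedded newlines.

```shell
# Hypothetical helper: keep the N most recently modified regular files in a
# directory and delete the rest. Not safe for filenames containing newlines.
keep_newest() {
  dir=$1
  keep=$2
  (
    cd "$dir" || exit 1
    # tail -n +K starts output at line K, so skipping N entries needs K = N + 1.
    ls -tp | grep -v '/$' | tail -n +"$((keep + 1))" |
      while IFS= read -r f; do
        rm -- "$f"   # -- protects against names that start with -
      done
  )
}
```

A call such as keep_newest /path/to/backups 5 (the path is a placeholder) would leave the five newest files and any subdirectories untouched.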


A variation on the original problem, in case the matching files need to be processed individually or collected in a shell array:

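# One by one, in a shell loop (POSIX-compliant):
ls -tp | grep -v '/$' | tail -n +6 | while IFS= read -r f; do echo "$f"; done

# One by one, but using a Bash process substitution (<(...)),
# so that the variables inside the `while` loop remain in scope:
while IFS= read -r f; do echo "$f"; done < <(ls -tp | grep -v '/$' | tail -n +6)

# Collecting the matches in a Bash *array*:
IFS=$'\n' read -d '' -ra files  < <(ls -tp | grep -v '/$' | tail -n +6)
printf '%s\n' "${files[@]}" # print array elements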

Answer by Espo

Remove all but 5 (or whatever number) of the most recent files in a directory.

rm `ls -t | awk 'NR>5'`

Answer by thelsdj

(ls -t|head -n 5;ls)|sort|uniq -u|xargs rm

This version supports names with spaces:

(ls -t|head -n 5;ls)|sort|uniq -u|sed -e 's,.*,"&",g'|xargs rm

Answer by Fabien

Simpler variant of thelsdj's answer:

ls -tr | head -n -5 | xargs --no-run-if-empty rm

ls -tr displays all the files, oldest first (-t newest first, -r reverse).

head -n -5 displays all but the last 5 lines (i.e., all but the 5 newest files).

xargs rm calls rm for each selected file.
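
Before piping such a list into rm, it can help to preview it. The following dry run is a sketch that is not part of the original answer; it assumes GNU head (negative counts for -n are a GNU extension), and the temporary directory, file names f1 to f7 and the variable doomed are made up for the demonstration.

```shell
# Build a scratch directory with seven files whose mtimes are one minute apart.
demo=$(mktemp -d)
for i in 1 2 3 4 5 6 7; do touch -t "20200101010$i" "$demo/f$i"; done

# Oldest first; head -n -5 drops the last five lines (the five newest files),
# leaving only the names that would be handed to rm.
doomed=$(cd "$demo" && ls -tr | head -n -5)
printf '%s\n' "$doomed"   # prints f1 and f2, one per line

rm -rf "$demo"
```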

Answer by wnoise

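find . -maxdepth 1 -type f -printf '%T@ %p\0' | sort -r -z -n | awk 'BEGIN { RS="\0"; ORS="\0"; FS="" } NR > 5 { sub("^[0-9]*(.[0-9]*)? ", ""); print }' | xargs -0 rm -f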

Requires GNU find for -printf, and GNU sort for -z, and GNU awk for "\0", and GNU xargs for -0, but handles files with embedded newlines or spaces.
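
For comparison, the same NUL-delimited idea can be sketched without awk. This is not the author's command: it swaps awk for tail -z and cut -z, which assumes a reasonably recent GNU coreutils in addition to GNU find, sort and xargs. The function name prune_nul and its arguments (directory, number of files to keep) are hypothetical.

```shell
# Hypothetical GNU-only helper: delete all but the N newest regular files in a
# directory, using NUL-terminated records so that spaces and newlines in
# filenames are handled.
prune_nul() {
  dir=$1
  keep=$2
  find "$dir" -maxdepth 1 -type f -printf '%T@ %p\0' |  # "<mtime> <path>" records
    sort -z -n -r |                                     # newest first
    tail -z -n +"$((keep + 1))" |                       # skip the N newest
    cut -z -d ' ' -f 2- |                               # strip the timestamp field
    xargs -0 -r rm -f --
}
```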

Answer by wnoise

All these answers fail when there are directories in the current directory. Here's something that works:

find . -maxdepth 1 -type f | xargs -x ls -t | awk 'NR>5' | xargs -L1 rm

This:

  1. works when there are directories in the current directory

  2. tries to remove each file even if the previous one couldn't be removed (due to permissions, etc.)

  3. fails safe when the number of files in the current directory is excessive and xargs would normally screw you over (the -x)

  4. doesn't cater for spaces in filenames (perhaps you're using the wrong OS?)

Answer by Mark

ls -tQ | tail -n+4 | xargs rm

List filenames by modification time, quoting each filename. Exclude first 3 (3 most recent). Remove remaining.

EDIT after helpful comment from mklement0 (thanks!): corrected -n+3 argument, and note this will not work as expected if filenames contain newlines and/or the directory contains subdirectories.

Answer by Ian Kelling

Ignoring newlines is ignoring security and good coding. wnoise had the only good answer. Here is a variation on his that puts the filenames in an array $x:

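while IFS= read -rd ''; do 
    x+=("${REPLY#* }"); 
done < <(find . -maxdepth 1 -printf '%T@ %p\0' | sort -r -z -n )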

Answer by Mark Harrison

If the filenames don't have spaces, this will work:

ls -C1 -t| awk 'NR>5'|xargs rm

If the filenames do have spaces, something like:

ls -C1 -t | awk 'NR>5' | sed -e "s/^/rm '/" -e "s/$/'/" | sh

Basic logic:

  • get a listing of the files in time order, one column
  • get all but the first 5 (n=5 for this example)
  • first version: send those to rm
  • second version: generate a script that will remove them properly

Answer by lolesque

With zsh

Assuming you don't care about present directories and you will not have more than 999 files (choose a bigger number if you want, or create a while loop).

[ 6 -le `ls *(.)|wc -l` ] && rm *(.om[6,999])

In *(.om[6,999]), the . means files, the o means sort order up, the m means by date of modification (put a for access time or c for inode change), and the [6,999] chooses a range of files, so it doesn't rm the first 5.