Notice: this page is derived from a popular StackOverflow question (originally presented as a Chinese-English parallel translation) and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/25577074/

Date: 2020-09-18 11:15:31  Source: igfitidea

Iterate through list of filenames in order they were created in bash

bash

Asked by jaypal singh

Parsing the output of ls to iterate through a list of files is bad. So how should I go about iterating through a list of files in the order in which they were first created? I browsed several questions here on SO and they all seem to parse ls.

The embedded link suggests:

Things get more difficult if you wanted some specific sorting that only ls can do, such as ordering by mtime. If you want the oldest or newest file in a directory, don't use ls -t | head -1 -- read Bash FAQ 99 instead. If you truly need a list of all the files in a directory in order by mtime so that you can process them in sequence, switch to perl, and have your perl program do its own directory opening and sorting. Then do the processing in the perl program, or -- worst case scenario -- have the perl program spit out the filenames with NUL delimiters.

Even better, put the modification time in the filename, in YYYYMMDD format, so that glob order is also mtime order. Then you don't need ls or perl or anything. (The vast majority of cases where people want the oldest or newest file in a directory can be solved just by doing this.)

Does that mean there is no native way of doing it in bash? I don't have the liberty to modify the filenames to include the time in them. I need to schedule a script in cron that would run every 5 minutes, generate an array containing all the files in a particular directory ordered by their creation time, perform some actions on the filenames, and move them to another location.

The following worked but only because I don't have funny filenames. The files are created by a server so it will never have special characters, spaces, newlines etc.

files=( $(ls -1tr) ) 

I can write a perl script that would do what I need, but I would appreciate it if someone could suggest the right way to do it in bash. A portable option would be great, but a solution using the latest GNU utilities will not be a problem either.

Accepted answer by user123444555621

sorthelper=();
for file in *; do
    # We need something that can easily be sorted.
    # Here, we use "<date><filename>".
    # Note that this works with any special characters in filenames

    # Mac OS X only (use this line instead of the Linux one below):
    # sorthelper+=("$(stat -n -f "%Sm%N" -t "%Y%m%d%H%M%S" -- "$file")");
    # Linux only: 10-digit epoch + 4 spaces = 14 characters
    sorthelper+=("$(stat --printf "%Y    %n" -- "$file")");
done;

sorted=();
while IFS= read -r -d $'\0' elem; do
    # this strips away the first 14 characters (<date>)
    sorted+=("${elem:14}");
done < <(printf '%s\0' "${sorthelper[@]}" | sort -z)

for file in "${sorted[@]}"; do
    # do your stuff...
    echo "$file";
done;

Other than sort and stat, all commands are actual native Bash commands (builtins)*. If you really want, you can implement your own sort using Bash builtins only, but I see no way of getting rid of stat.

The important parts are read -d $'\0', printf '%s\0' and sort -z. All these commands are used with their null-delimiter options, which means that any filename can be processed safely. Also, the use of double quotes in "$file" and "${anarray[@]}" is essential.

*Many people feel that the GNU tools are somehow part of Bash, but technically they're not. So, stat and sort are just as non-native as perl.
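
To see the null-delimited round trip cope with awkward names, here is a minimal sketch. It assumes GNU stat, GNU touch and GNU sort; demo_dir and the three file names are invented for the demonstration, and the "<epoch> <name>" record layout is just one possible choice.

```bash
#!/usr/bin/env bash
# Sketch: build "<epoch> <name>" records, sort them NUL-delimited,
# then strip the "<epoch> " prefix again.
# Assumptions: GNU stat/touch/sort; demo_dir and the file names are made up.
demo_dir=$(mktemp -d)
touch -d '2020-01-03 00:00:00' "$demo_dir/plain"
touch -d '2020-01-01 00:00:00' "$demo_dir/with space"
touch -d '2020-01-02 00:00:00' "$demo_dir/with"$'\n'"newline"

sorted=()
while IFS= read -r -d '' record; do
    sorted+=("${record#* }")          # drop everything up to the first space
done < <(
    cd "$demo_dir" || exit
    for f in *; do
        printf '%s %s\0' "$(stat -c '%Y' -- "$f")" "$f"
    done | sort -z -n                 # numeric sort on the leading epoch
)

printf '<%s>\n' "${sorted[@]}"        # oldest first
rm -rf -- "$demo_dir"
```

Because every record ends in a NUL byte rather than a newline, the name containing a newline stays in one piece.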

Answer by anubhava

You can try using the stat command piped to sort:

stat -c '%Y %n' * | sort -t ' ' -nk1 | cut -d ' ' -f2-

Update: To deal with filenames containing newlines, we can use the %N format in stat, and instead of cut we can use awk like this:

LANG=C stat -c '%Y^A%N' * | sort -t '^A' -nk1 | awk -F '^A' '{print substr($2,2,length($2)-2)}'

  1. Use of LANG=C is needed to make sure stat uses only single quotes when quoting file names.
  2. ^A is the Control-A character, typed by pressing Ctrl-V followed by Ctrl-A.

Answer by David C. Rankin

Notwithstanding all of the cautions and warnings against using ls to parse a directory, we have all found ourselves in this situation. If you do find yourself needing sorted directory input, then about the cleanest use of ls to feed your loop is ls -opts | while read -r name; do... This will handle spaces in filenames, etc., without requiring a reset of IFS, due to the nature of read itself. Example:

ls -1rt | while read -r fname; do  # where '1' is ONE not little 'L'

So do look for cleaner solutions avoiding ls, but if push comes to shove, ls -opts can be used sparingly without the sky falling or dragons plucking your eyes out.

Let me add the disclaimer to keep everyone happy. If you like newlines inside your filenames, then do not use ls to populate a loop. If you do not have newlines inside your filenames, there are no other adverse side-effects.

Contra: TLDP Bash Howto Intro:

    #!/bin/bash
    for i in $( ls ); do
        echo item: $i
    done

It appears that SO users do not know what the use of contra means -- please look it up before downvoting.

Answer by Burhan Khalid

How about a solution with GNU find + sed + sort?

As long as there are no newlines in the file name, this should work:

find . -type f -printf '%T@ %p\n' | sort -k 1nr | sed 's/^[^ ]* //'
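
If newlines in file names are a concern, the same find-and-sort idea can be sketched with NUL-delimited records. This is only an illustration under stated assumptions: it needs GNU find (for -printf) and GNU sort (for -z), it sorts oldest first rather than newest first, and demo_dir and the file names are invented.

```bash
#!/usr/bin/env bash
# Sketch: NUL-delimited variant of the find | sort idea.
# Assumptions: GNU find (-printf) and GNU sort (-z); demo_dir and names are made up.
demo_dir=$(mktemp -d)
touch -d '2021-05-01 12:00:00' "$demo_dir/first"
touch -d '2021-05-02 12:00:00' "$demo_dir/second"$'\n'"line"
touch -d '2021-05-03 12:00:00' "$demo_dir/third file"

files=()
while IFS= read -r -d '' record; do
    files+=("${record#*$'\t'}")       # strip "<mtime><TAB>" from the front
done < <(find "$demo_dir" -maxdepth 1 -type f -printf '%T@\t%p\0' | sort -z -n)

printf '%q\n' "${files[@]##*/}"       # show base names, shell-quoted
rm -rf -- "$demo_dir"
```

The tab separator is safe here because the timestamp printed by %T@ never contains one, so only the first tab is removed.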

Answer by Sam Varshavchik

Each file has three timestamps:

  1. Access time: the file was opened and read. Also known as atime.
  2. Modification time: the file was written to. Also known as mtime.
  3. Inode modification time: the file's status was changed, for example a new hard link was created or an existing one removed, the file's permissions were changed with chmod, or a few other things. Also known as ctime.

None of these represents the time the file was created; that information is not saved anywhere. At file creation time, all three timestamps are initialized, and then each one gets updated appropriately when the file is read or written to, when its permissions are changed with chmod, or when a hard link is created or destroyed.

So, you can't really list the files according to their file creation time, because the file creation time isn't saved anywhere. The closest match would be the inode modification time.

See the descriptions of the -t, -u, -c, and -r options in the ls(1) man page for more information on how to list files in atime, mtime, or ctime order.
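
The three timestamps can be inspected directly. The following is a small sketch assuming GNU stat (whose %X, %Y and %Z formats print atime, mtime and ctime as epoch seconds) and GNU touch; demo_file is a throwaway name, and BSD/macOS stat uses a different syntax.

```bash
#!/usr/bin/env bash
# Sketch: read atime, mtime and ctime of one file with GNU stat.
# Assumptions: GNU stat and GNU touch; demo_file is a temporary file.
demo_file=$(mktemp)
touch -d '2001-01-01 00:00:00' "$demo_file"   # sets atime and mtime to the past
chmod 600 "$demo_file"                         # a status change: updates ctime, not mtime

read -r atime mtime ctime < <(stat -c '%X %Y %Z' -- "$demo_file")
echo "atime=$atime mtime=$mtime ctime=$ctime"
rm -f -- "$demo_file"
```

Here atime and mtime both carry the back-dated value, while ctime reflects the moment the inode was last changed.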

Answer by chepner

It may be a little more work to ensure it is installed (it may already be, though), but using zsh instead of bash for this script makes a lot of sense. The filename globbing capabilities are much richer, while still using a sh-like language.

files=( *(oc) )

will create an array whose entries are all the file names in the current directory, but sorted by change time. (Use a capital O instead to reverse the sort order.) This will include directories, but you can limit the match to regular files (similar to the -type f predicate to find):

files=( *(.oc) )

find is needed far less often in zsh scripts, because most of its uses are covered by the various glob flags and qualifiers available.

Answer by whoan

I've just found a way to do it with bash and ls (GNU).
Suppose you want to iterate through the filenames sorted by modification time (-t):

while read -r fname; do
    fname=${fname:1:((${#fname}-2))} # remove the leading and trailing "
    fname=${fname//\\\"/\"}          # remove the \ before any embedded "
    fname=$(echo -e "$fname")        # interpret the escaped characters
    file "$fname"                    # replace (YOU) `file` with anything
done < <(ls -At --quoting-style=c)

Explanation

Given some filenames with special characters, this is the ls output:

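$ ls -A
 filename with spaces   .hidden_filename  filename?with_a_tab  filename?with_a_newline  filename_"with_double_quotes"

$ ls -At --quoting-style=c
".hidden_filename"  " filename with spaces "  "filename_\"with_double_quotes\""  "filename\nwith_a_newline"  "filename\twith_a_tab"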

So you have to process each filename a little to get the actual one. Recalling:

${fname:1:((${#fname}-2))} # remove the leading and trailing "
# ".hidden_filename" -> .hidden_filename
${fname//\\\"/\"}          # remove the \ before any embedded "
# filename_\"with_double_quotes\" -> filename_"with_double_quotes"
$(echo -e "$fname")        # interpret the escaped characters
# filename\twith_a_tab -> filename     with_a_tab

Example

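$ ./script.sh
.hidden_filename: empty
 filename with spaces : empty
filename_"with_double_quotes": empty
filename
with_a_newline: empty
filename    with_a_tab: empty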

As seen, file (or the command you want) interprets each filename well.
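
The decoding steps can also be collected into a function, which makes them easy to try on a single quoted name. This is a sketch only: unquote_c is a hypothetical helper name (it is not part of ls or bash), and it swaps echo -e for printf '%b', which expands the common backslash escapes in a similar way.

```bash
#!/usr/bin/env bash
# Sketch: decode one name as printed by `ls --quoting-style=c`.
# unquote_c is a hypothetical helper, not a standard command.
unquote_c() {
    local name=$1
    name=${name:1:${#name}-2}     # drop the surrounding double quotes
    name=${name//\\\"/\"}         # turn \" back into "
    printf '%b' "$name"           # expand escapes such as \t and \n
}

# Command substitution strips trailing newlines, so a name that ends
# in a newline would need different handling.
decoded_tab=$(unquote_c '"filename\twith_a_tab"')
decoded_quote=$(unquote_c '"filename_\"with_double_quotes\""')
printf '%s\n' "$decoded_tab" "$decoded_quote"
```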

Answer by John B

Here's a way using stat with an associative array.

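n=0
declare -A arr
for file in *; do
    # modified=$(stat -f "%m" "$file") # For use with BSD/OS X
    modified=$(stat -c "%Y" "$file") # For use with GNU/Linux
    # Ensure stat timestamp is unique: if the key is already taken, add a suffix
    if [[ -n ${arr[$modified]+set} ]]; then
        modified=${modified}.$n
        ((n++))
    fi
    arr[$modified]="$file"
done
files=()
for index in $(IFS=$'\n'; echo "${!arr[*]}" | sort -n); do
    files+=("${arr[$index]}")
done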

Since sort sorts lines, $(IFS=$'\n'; echo "${!arr[*]}" | sort -n) ensures the indices of the associative array get sorted by setting the field separator in the subshell to a newline.

The quoting at arr[$modified]="${file}" and files+=("${arr[$index]}") ensures that file names with quirks like a newline are preserved.
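
Files that share a timestamp are the tricky case for a timestamp-keyed array. One possible way around it, sketched below, is to append a zero-padded counter to every key so that keys never collide and sort -n still orders by time. The by_time array, the key layout, demo_dir and the file names are all assumptions made for this illustration; it also assumes GNU stat, GNU touch and bash 4 or later.

```bash
#!/usr/bin/env bash
# Sketch: associative array keyed by "<mtime>.<counter>" so equal mtimes cannot collide.
# Assumptions: bash 4+, GNU stat/touch; by_time, demo_dir and the names are made up.
demo_dir=$(mktemp -d)
touch -d '2020-01-02 00:00:00' "$demo_dir/a same time" "$demo_dir/b same time"
touch -d '2020-01-01 00:00:00' "$demo_dir/c older"

declare -A by_time=()
n=0
for file in "$demo_dir"/*; do
    key=$(printf '%s.%06d' "$(stat -c '%Y' -- "$file")" "$n")
    by_time[$key]=$file
    n=$((n + 1))
done

files=()
while IFS= read -r key; do
    files+=("${by_time[$key]}")
done < <(printf '%s\n' "${!by_time[@]}" | sort -n)   # keys are numeric, one per line

printf '%s\n' "${files[@]##*/}"
rm -rf -- "$demo_dir"
```

Only the keys pass through sort, so the file names themselves are never split on whitespace or newlines.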