Iterate through list of filenames in order they were created in bash
Original source: http://stackoverflow.com/questions/25577074/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Asked by jaypal singh
Parsing the output of `ls` to iterate through a list of files is bad. So how should I go about iterating through a list of files in the order in which they were first created? I browsed several questions here on SO, and they all seem to be parsing `ls`.
The embedded link suggests:
Things get more difficult if you wanted some specific sorting that only `ls` can do, such as ordering by `mtime`. If you want the oldest or newest file in a directory, don't use `ls -t | head -1` -- read Bash FAQ 99 instead. If you truly need a list of all the files in a directory in order by mtime so that you can process them in sequence, switch to perl, and have your perl program do its own directory opening and sorting. Then do the processing in the perl program, or -- worst case scenario -- have the perl program spit out the filenames with NUL delimiters.
Even better, put the modification time in the filename, in YYYYMMDD format, so that glob order is also mtime order. Then you don't need ls or perl or anything. (The vast majority of cases where people want the oldest or newest file in a directory can be solved just by doing this.)
Does that mean there is no native way of doing it in `bash`? I don't have the liberty to modify the filenames to include the time in them. I need to schedule a script in `cron` that would run every 5 minutes, generate an array containing all the files in a particular directory ordered by their creation time, perform some actions on the filenames, and move them to another location.
The following worked but only because I don't have funny filenames. The files are created by a server so it will never have special characters, spaces, newlines etc.
files=( $(ls -1tr) )
I can write a `perl` script that would do what I need, but I would appreciate it if someone could suggest the right way to do it in `bash`. A portable option would be great, but a solution using the latest GNU utilities would not be a problem either.
Accepted answer by user123444555621
    sorthelper=()
    for file in *; do
        # We need something that can easily be sorted.
        # Here, we use "<date><filename>", where <date> is exactly 14 characters wide.
        # Note that this works with any special characters in filenames.
        # Use only one of the two stat lines, depending on the platform:
        # sorthelper+=("$(stat -n -f "%Sm%N" -t "%Y%m%d%H%M%S" -- "$file")")  # Mac OS X only
        sorthelper+=("$(stat --printf "%Y    %n" -- "$file")")  # Linux only: 10-digit epoch + 4 spaces
    done
    sorted=()
    while IFS= read -r -d $'\0' elem; do
        # this strips away the first 14 characters (<date>)
        sorted+=("${elem:14}")
    done < <(printf '%s\0' "${sorthelper[@]}" | sort -z)
    for file in "${sorted[@]}"; do
        # do your stuff...
        echo "$file"
    done
Other than `sort` and `stat`, all commands are actual native Bash commands (builtins)*. If you really want, you can implement your own `sort` using Bash builtins only, but I see no way of getting rid of `stat`.
The important parts are `read -d $'\0'`, `printf '%s\0'` and `sort -z`. All these commands are used with their null-delimiter options, which means that any filename can be processed safely. Also, the use of double quotes in `"$file"` and `"${anarray[@]}"` is essential.
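To see the null-delimiter round trip in isolation, here is a minimal sketch. It assumes bash and a `sort` that supports `-z` (GNU coreutils does); the array names `names` and `sorted_names` and the sample strings are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: sort arbitrary strings safely by passing them NUL-delimited.
# `names` and `sorted_names` are hypothetical variable names.
names=($'b\nwith newline' 'a with spaces' 'c*star')

sorted_names=()
# IFS= keeps leading/trailing whitespace, -r keeps backslashes,
# -d '' makes read stop at a NUL byte instead of a newline.
while IFS= read -r -d '' name; do
    sorted_names+=("$name")
done < <(printf '%s\0' "${names[@]}" | LC_ALL=C sort -z)

printf '%d entries\n' "${#sorted_names[@]}"
printf '<%s>\n' "${sorted_names[@]}"
```

The entry containing a newline stays a single array element, which would not be the case with a newline-delimited pipeline.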
*Many people feel that the GNU tools are somehow part of Bash, but technically they're not. So, `stat` and `sort` are just as non-native as `perl`.
Answer by anubhava
You can try using the `stat` command piped with `sort`:
    stat -c '%Y %n' * | sort -t ' ' -nk1 | cut -d ' ' -f2-
Update: To deal with filenames containing newlines, we can use the `%N` format in `stat`, and instead of `cut` we can use `awk`, like this:
    LANG=C stat -c '%Y^A%N' * | sort -t '^A' -nk1 | awk -F '^A' '{print substr($2,2,length($2)-2)}'
- Use of `LANG=C` is needed to make sure `stat` uses only single quotes when quoting file names.
- `^A` is the Control-A character, typed by pressing Ctrl+V followed by Ctrl+A.
Answer by David C. Rankin
With all of the cautions and warnings against using `ls` to parse a directory notwithstanding, we have all found ourselves in this situation. If you do find yourself needing sorted directory input, then about the cleanest use of `ls` to feed your loop is `ls -opts | while read -r name; do ...`. This will handle spaces in filenames, etc., without requiring a reset of `IFS`, due to the nature of `read` itself. Example:
    ls -1rt | while read -r fname; do  # where '1' is ONE not little 'L'
So do look for cleaner solutions avoiding `ls`, but if push comes to shove, `ls -opts` can be used sparingly without the sky falling or dragons plucking your eyes out.
Let me add the disclaimer to keep everyone happy. If you like newlines inside your filenames -- then do not use `ls` to populate a loop. If you do not have newlines inside your filenames, there are no other adverse side-effects.
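One detail that may be worth checking in your own setup: with the default `IFS`, `read -r name` still trims leading and trailing whitespace from each line, while `IFS= read -r name` keeps the line as it is. A small sketch, assuming bash (for the here-string) and using a made-up sample name:

```shell
#!/usr/bin/env bash
# Sketch: compare plain `read -r` with `IFS= read -r` on a name that has
# leading and trailing spaces. The sample name is invented for illustration.
sample='  padded name  '

read -r trimmed <<<"$sample"        # default IFS: edge whitespace is stripped
IFS= read -r intact <<<"$sample"    # empty IFS: the line is kept verbatim

printf 'plain read: <%s>\n' "$trimmed"
printf 'IFS= read:  <%s>\n' "$intact"
```

For names that only contain interior spaces, both forms behave the same.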
Contra: TLDP Bash Howto Intro:
    #!/bin/bash
    for i in $( ls ); do
        echo item: $i
    done
It appears that SO users do not know what the use of contra means -- please look it up before downvoting.
Answer by Burhan Khalid
    find . -type f -printf '%T@ %p\n' | sort -k 1nr | sed 's/^[^ ]* //'
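If newlines in names are a concern, a `find`-based pipeline can be kept NUL-delimited from end to end. The following is only a sketch, assuming GNU `find` (for `-printf`), GNU `sort` (for `-z`) and bash; `list_by_mtime` and `sorted_files` are hypothetical names, and unlike a reverse numeric sort it lists the oldest file first.

```shell
#!/usr/bin/env bash
# Sketch, assuming GNU find and GNU sort. list_by_mtime and sorted_files are
# hypothetical names chosen for this example.
list_by_mtime() {
    local dir=$1 record tab=$'\t'
    sorted_files=()
    while IFS= read -r -d '' record; do
        # Each record is "<epoch.fraction><TAB><path>"; drop up to the first TAB.
        sorted_files+=("${record#*"$tab"}")
    done < <(find "$dir" -mindepth 1 -maxdepth 1 -type f -printf '%T@\t%p\0' |
             LC_ALL=C sort -z -n)
}

# Usage: list regular files of the current directory, oldest first.
list_by_mtime .
for f in "${sorted_files[@]}"; do
    printf '%s\n' "$f"
done
```

Only the first TAB is treated as a separator, so names that themselves contain tabs or newlines should come through unchanged.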
Answer by Sam Varshavchik
Each file has three timestamps:
- Access time: the file was opened and read. Also known as atime.
- Modification time: the file was written to. Also known as mtime.
- Inode modification time: the file's status was changed, such as the file had a new hard link created, or an existing one removed; or if the file's permissions were chmod-ed, or a few other things. Also known as ctime.
None of these represents the time the file was created; that information is not saved anywhere. At file creation time, all three timestamps are initialized, and then each one gets updated appropriately, when the file is read, or written to, or when a file's permissions are chmoded, or a hard link created or destroyed.
So, you can't really list the files according to their file creation time, because the file creation time isn't saved anywhere. The closest match would be the inode modification time.
See the descriptions of the `-t`, `-u`, `-c`, and `-r` options in the ls(1) man page for more information on how to list files in atime, mtime, or ctime order.
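The three timestamps can be inspected directly. A rough sketch, assuming GNU `stat` and GNU `touch` (BSD/OS X use different options); `show_times` is a hypothetical helper:

```shell
#!/usr/bin/env bash
# Sketch, assuming GNU stat and GNU touch. show_times is a hypothetical helper.
show_times() {
    # %X = atime, %Y = mtime, %Z = ctime, all as seconds since the epoch
    stat -c 'atime=%X mtime=%Y ctime=%Z' -- "$1"
}

demo=$(mktemp)
touch -d '2000-01-01 00:00:00 UTC' "$demo"   # sets atime and mtime only
show_times "$demo"
rm -f "$demo"
```

Here atime and mtime show the date given to `touch`, while ctime shows when the inode was last changed, which is the time the script ran.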
Answer by chepner
It may be a little more work to ensure it is installed (it may already be, though), but using `zsh` instead of `bash` for this script makes a lot of sense. The filename globbing capabilities are much richer, while still using a `sh`-like language.
    files=( *(oc) )
will create an array whose entries are all the file names in the current directory, but sorted by change time. (Use a capital `O` instead to reverse the sort order.) This will include directories, but you can limit the match to regular files (similar to the `-type f` predicate to `find`):
    files=( *(oc.) )
`find` is needed far less often in `zsh` scripts, because most of its uses are covered by the various glob flags and qualifiers available.
Answer by whoan
I've just found a way to do it with `bash` and `ls` (GNU).
Suppose you want to iterate through the filenames sorted by modification time (`-t`):
    while read -r fname; do
        fname=${fname:1:((${#fname}-2))}  # remove the leading and trailing "
        fname=${fname//\\\"/\"}           # remove the \ before any embedded "
        fname=$(echo -e "$fname")         # interpret the escaped characters
        file "$fname"                     # replace (YOU) `file` with anything
    done < <(ls -At --quoting-style=c)
Explanation
Given some filenames with special characters, this is the `ls` output:
    $ ls -A
    filename with spaces .hidden_filename filename?with_a_tab filename?with_a_newline filename_"with_double_quotes"
    $ ls -At --quoting-style=c
    ".hidden_filename" " filename with spaces " "filename_\"with_double_quotes\"" "filename\nwith_a_newline" "filename\twith_a_tab"
So you have to process each filename a little to get the actual one. Recalling:
    ${fname:1:((${#fname}-2))}  # remove the leading and trailing "
    # ".hidden_filename" -> .hidden_filename
    ${fname//\\\"/\"}           # remove the \ before any embedded "
    # filename_\"with_double_quotes\" -> filename_"with_double_quotes"
    $(echo -e "$fname")         # interpret the escaped characters
    # filename\twith_a_tab -> filename with_a_tab
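Example
    $ ./script.sh
    .hidden_filename: empty
    filename with spaces : empty
    filename_"with_double_quotes": empty
    filename
    with_a_newline: empty
    filename with_a_tab: empty
As seen, `file` (or the command you want) interprets each filename well.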
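Answer by John B
Here's a way using `stat` with an associative array.
    n=0
    declare -A arr
    for file in *; do
        # modified=$(stat -f "%m" -- "$file")  # For use with BSD/OS X
        modified=$(stat -c "%Y" -- "$file")    # For use with GNU/Linux
        # Ensure stat timestamp is unique: if this key is already taken, add a suffix
        if [[ -n ${arr[$modified]+set} ]]; then
            modified=${modified}.$n
            ((n++))
        fi
        arr[$modified]="$file"
    done
    files=()
    for index in $(IFS=$'\n'; echo "${!arr[*]}" | sort -n); do
        files+=("${arr[$index]}")
    done
Since `sort` sorts lines, `$(IFS=$'\n'; echo "${!arr[*]}" | sort -n)` sets the field separator in the subshell to a newline, so the indices of the associative array are printed one per line and can be sorted.
The quoting at `arr[$modified]="${file}"` and `files+=("${arr[$index]}")` ensures that file names with caveats like a newline are preserved.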