Linux: Iterate over a list of files with spaces

Question

Asked by gregseth

I want to iterate over a list of files. This list is the result of a find command, so I came up with:

getlist() {
  for f in $(find . -iname "foo*")
  do
    echo "File found: $f"
    # do something useful
  done
}

It's fine except if a file has spaces in its name:

$ ls
foo_bar_baz.txt
foo bar baz.txt

$ getlist
File found: foo_bar_baz.txt
File found: foo
File found: bar
File found: baz.txt

What can I do to avoid the split on spaces?

Answer 1

Accepted answer by martin clayton

You could replace the word-based iteration with a line-based one:

find . -iname "foo*" | while read f
do
    # ... loop body
done
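
A slightly hardened form of the loop above, as a sketch rather than the answer's own code: IFS= keeps leading and trailing whitespace in each name, and -r stops read from interpreting backslashes. Names that contain newlines are still split. The list_foo function, the temporary directory and the file names are assumptions made up for the demo.

```shell
#!/usr/bin/env bash
# Hypothetical demo directory and file names, for illustration only.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
touch "$demo/foo_bar_baz.txt" "$demo/foo bar baz.txt" "$demo/foo with trailing space "

list_foo() {
    # IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
    find "$1" -iname 'foo*' | while IFS= read -r f
    do
        printf 'File found: [%s]\n' "$f"
    done
}

list_foo "$demo"
```

The brackets in the output make a trailing space in a name visible.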

Answer 2

Answered by Torp

find . -name "fo*" -print0 | xargs -0 ls -l

See man xargs.

Answer 3

Answered by Karoly Horvath

find . -iname "foo*" -print0 | xargs -L1 -0 echo "File found:"

Answer 4

Answered by sorpigal

There are several workable ways to accomplish this.

If you wanted to stick closely to your original version it could be done this way:

getlist() {
        IFS=$'\n'
        for file in $(find . -iname 'foo*') ; do
                printf 'File found: %s\n' "$file"
        done
}

This will still fail if file names have literal newlines in them, but spaces will not break it.
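
A variant of the function above, sketched under a few assumptions: local confines the IFS change to the function, and set -f turns off globbing so that a result containing * or ? is not expanded a second time. The optional directory argument and the demo files are additions for illustration, and set +f assumes globbing was enabled before the call.

```shell
#!/usr/bin/env bash
getlist() {
    local IFS=$'\n'   # newline-only splitting, restored when the function returns
    local file
    set -f            # turn off globbing so a result such as 'foo*' stays literal
    for file in $(find "${1:-.}" -iname 'foo*'); do
        printf 'File found: %s\n' "$file"
    done
    set +f            # assumption: globbing was enabled before the call
}

# Hypothetical demo files.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
touch "$demo/foo_bar_baz.txt" "$demo/foo bar baz.txt"
getlist "$demo"
```

Like the original, this still splits names that contain newlines.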

However, messing with IFS isn't necessary. Here's my preferred way to do this:

getlist() {
    while IFS= read -d $'\0' -r file ; do
            printf 'File found: %s\n' "$file"
    done < <(find . -iname 'foo*' -print0)
}

If you find the < <(command) syntax unfamiliar, you should read about process substitution. The advantage of this over for file in $(find ...) is that files with spaces, newlines and other characters are correctly handled. This works because find with -print0 will use a null (aka \0) as the terminator for each file name and, unlike newline, null is not a legal character in a file name.

The advantage to this over the nearly-equivalent version

getlist() {
        find . -iname 'foo*' -print0 | while read -d $'\0' -r file ; do
                printf 'File found: %s\n' "$file"
        done
}

is that any variable assignment in the body of the while loop is preserved. That is, if you pipe to while as above, then the body of the while is in a subshell, which may not be what you want.
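
The subshell difference can be seen with a counter. This is a minimal sketch: the temporary directory, the file names and the two counter variables are made up for the demo, and -d '' is used as an equivalent spelling of the null delimiter in bash.

```shell
#!/usr/bin/env bash
# Hypothetical demo files.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
touch "$demo/foo one" "$demo/foo two"

piped=0
find "$demo" -iname 'foo*' -print0 | while IFS= read -r -d '' file; do
    piped=$((piped + 1))               # runs in a subshell; the increment is lost
done

substituted=0
while IFS= read -r -d '' file; do
    substituted=$((substituted + 1))   # runs in the current shell
done < <(find "$demo" -iname 'foo*' -print0)

printf 'pipe: %d, process substitution: %d\n' "$piped" "$substituted"
```

With bash's default settings (the lastpipe option off) this prints "pipe: 0, process substitution: 2".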

The advantage of the process substitution version over find ... -print0 | xargs -0 is minimal: the xargs version is fine if all you need is to print a line or perform a single operation on the file, but if you need to perform multiple steps the loop version is easier.
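
If several steps are needed and xargs is still preferred, one possible approach is to hand the names to an inline sh script. This is a sketch, not part of the original answer; the report function, the demo files and the second step (wc -c) are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical demo files.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
touch "$demo/foo one" "$demo/foo two"

report() {
    # -0 reads null-terminated names; the inline script receives them as "$@".
    # The word after the script ("sh") becomes $0 of the inline shell.
    find "$1" -iname 'foo*' -print0 | xargs -0 sh -c '
        for file in "$@"; do
            printf "File found: %s\n" "$file"
            wc -c < "$file" > /dev/null    # a second step on the same file
        done
    ' sh
}

report "$demo"
```

Variables set inside the inline script are not visible to the calling shell, so the while loop remains the simpler choice when state has to survive.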

EDIT: Here's a nice test script so you can get an idea of the difference between different attempts at solving this problem:

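#!/usr/bin/env bash

dir=/tmp/getlist.test/
mkdir -p "$dir"
cd "$dir"

touch       'file not starting foo' foo foobar barfoo 'foo with spaces'\
    'foo with'$'\n'newline 'foo with trailing whitespace      '

# while with process substitution, null terminated, empty IFS
getlist0() {
    while IFS= read -d $'\0' -r file ; do
            printf 'File found: '"'%s'"'\n' "$file"
    done < <(find . -iname 'foo*' -print0)
}

# while with process substitution, null terminated, default IFS
getlist1() {
    while read -d $'\0' -r file ; do
            printf 'File found: '"'%s'"'\n' "$file"
    done < <(find . -iname 'foo*' -print0)
}

# pipe to while, newline terminated
getlist2() {
    find . -iname 'foo*' | while read -r file ; do
            printf 'File found: '"'%s'"'\n' "$file"
    done
}

# pipe to while, null terminated
getlist3() {
    find . -iname 'foo*' -print0 | while read -d $'\0' -r file ; do
            printf 'File found: '"'%s'"'\n' "$file"
    done
}

# for loop over subshell results, newline terminated, default IFS
getlist4() {
    for file in "$(find . -iname 'foo*')" ; do
            printf 'File found: '"'%s'"'\n' "$file"
    done
}

# for loop over subshell results, newline terminated, newline IFS
getlist5() {
    IFS=$'\n'
    for file in $(find . -iname 'foo*') ; do
            printf 'File found: '"'%s'"'\n' "$file"
    done
}


# see how they run
for n in {0..5} ; do
    printf '\n\ngetlist%d:\n' $n
    eval getlist$n
done

rm -rf "$dir"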

Answer 5

Answered by marchelbling

There is also a very simple solution: rely on bash globbing.

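$ mkdir test
$ cd test
$ touch "stupid file1"
$ touch "stupid file2"
$ touch "stupid   file 3"
$ ls
stupid   file 3  stupid file1     stupid file2
$ for file in *; do echo "file: '${file}'"; done
file: 'stupid   file 3'
file: 'stupid file1'
file: 'stupid file2'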

Note that I am not sure this behavior is the default one but I don't see any special setting in my shopt so I would go and say that it should be "safe" (tested on osx and ubuntu).

Answer 6

Answered by chepner

Since you aren't doing any other type of filtering with find, you can use the following as of bash 4.0:

shopt -s globstar
getlist() {
    for f in **/foo*
    do
        echo "File found: $f"
        # do something useful
    done
}

The **/ will match zero or more directories, so the full pattern will match foo* in the current directory or any subdirectories.
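
A sketch of the same idea with nullglob added, so that the loop body does not run on the literal pattern when nothing matches. It needs bash 4.0 or later for globstar. The demo tree is made up for illustration, and note that, unlike find -iname, the glob is case-sensitive unless nocaseglob is also set.

```shell
#!/usr/bin/env bash
shopt -s globstar nullglob   # nullglob: an unmatched pattern expands to nothing
getlist() {
    local f
    for f in **/foo*
    do
        printf 'File found: %s\n' "$f"
    done
}

# Hypothetical demo tree.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/sub dir"
touch "$demo/foo top" "$demo/sub dir/foo nested" "$demo/bar"
cd "$demo" && getlist
```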

Answer 7

Answered by Steve

In some cases, if you just need to copy or move a list of files, you could pipe that list to awk as well.
Important: note the \"" "\" around the field $0 (in short, your files: one line of the list = one file).

find . -iname "foo*" | awk '{print "mv \""$0"\" ./MyDir2" | "sh" }'
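
For comparison, here is a sketch of an alternative that is not from this answer: letting find run mv itself. Each name reaches mv as a single argument, so no quoting layer is needed and names containing double quotes are handled too. The source and destination directories and the file names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical source and destination directories.
src=$(mktemp -d)
dest=$(mktemp -d)
trap 'rm -rf "$src" "$dest"' EXIT
touch "$src/foo bar.txt" "$src/foo \"quoted\".txt"

# find passes each name to mv as one argument; nothing is re-parsed by a shell.
find "$src" -iname 'foo*' -exec mv {} "$dest"/ \;
ls -1 "$dest"
```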

Answer 8

Answered by terafl0ps

I really like for loops and array iteration, so I figure I will add this answer to the mix...

I also liked marchelbling's stupid file example. :)

$ mkdir test
$ cd test
$ touch "stupid file1"
$ touch "stupid file2"
$ touch "stupid   file 3"

Inside the test directory:

readarray -t arr <<< "`ls -A1`"

This adds each file listing line into a bash array named arr with any trailing newline removed.

Let's say we want to give these files better names...

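for i in ${!arr[@]}
do
    newname=`echo "${arr[$i]}" | sed 's/stupid/smarter/; s/  */_/g'`
    mv "${arr[$i]}" "$newname"
done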

${!arr[@]} expands to 0 1 2, so "${arr[$i]}" is the i-th element of the array. The quotes around the variables are important to preserve the spaces.

The result is three renamed files:

$ ls -1
smarter_file1
smarter_file2
smarter_file_3
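
The array can also be filled straight from a glob, which avoids parsing ls output and so also copes with names that contain newlines. This is a sketch under the same setup; the temporary directory stands in for the test directory, and the sed expression is the one used in this answer.

```shell
#!/usr/bin/env bash
# Hypothetical test directory, mirroring the example in this answer.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
cd "$demo" || exit 1
touch "stupid file1" "stupid file2" "stupid   file 3"

arr=(*)    # one array element per file name, however many spaces it contains
for f in "${arr[@]}"
do
    newname=$(printf '%s\n' "$f" | sed 's/stupid/smarter/; s/  */_/g')
    mv -- "$f" "$newname"
done
ls -1
```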

Answer 9

Answered by Andy Foster

Ok - my first post on Stack Overflow!

Though my problems with this have always been in csh, not bash, the solution I present will, I'm sure, work in both. The issue is with the shell's interpretation of what "ls" returns. We can remove "ls" from the problem by simply using the shell expansion of the * wildcard, but this gives a "no match" error if there are no files in the current (or specified) folder. To get around this we simply extend the expansion to include dot-files, thus: * .* (this will always yield results, since the files . and .. will always be present). So in csh we can use this construct ...

foreach file (* .*)
   echo $file
end

If you want to filter out the standard dot-files then that is easy enough ...

foreach file (* .*)
   if ("$file" == .) continue
   if ("$file" == ..) continue
   echo $file
end

The code in the first post on this thread would be written thus:-

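getlist() {
  # iterate over the glob directly; wrapping it in $( ) would try to run the first file as a command
  for f in * .*
  do
    echo "File found: $f"
    # do something useful
  done
}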

Hope this helps!

Answer 10

Answered by naught101

find has an -exec argument that loops over the find results and executes an arbitrary command. For example:

find . -iname "foo*" -exec echo "File found: {}" \;

Here {} represents the found files, and wrapping it in "" allows for the resultant shell command to deal with spaces in the file name.

In many cases you can replace that last \; (which starts a new command) with a \+, which will put multiple files in the one command (not necessarily all of them at once though, see man find for more details).
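
A sketch of the + form combined with an inline sh script, for cases where more than one command per file is wanted. The found function and the demo files are assumptions for illustration; the requirement that {} come directly before the + is from the POSIX description of find.

```shell
#!/usr/bin/env bash
# Hypothetical demo files.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
touch "$demo/foo one" "$demo/foo two"

found() {
    # With +, {} must be the last argument before the +; find appends as many
    # names as fit, and the inline script loops over them as "$@".
    find "$1" -iname 'foo*' -exec sh -c '
        for file in "$@"; do
            printf "File found: %s\n" "$file"
        done
    ' sh {} +
}

found "$demo"
```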
