Bash script to compare files

Notice: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/41412937/

Time: 2020-09-18 15:34:42  Source: igfitidea

Bash script to compare files

bash, sorting

Asked by Astrum

I have a folder with a ton of old photos with many duplicates. Sorting it by hand would take ages, so I wanted to use the opportunity to use bash.


Right now I have the code:


#!/bin/bash

directory="~/Desktop/Test/*"
for file in ${directory};
do
    for filex in ${directory}:
    do
        if [ $( diff {$file} {$filex} ) == 0 ]
        then
            mv ${filex} ~/Desktop
            break
        fi
    done
done 

And I get these errors:


diff: {~/Desktop/Test/*}: No such file or directory
diff: {~/Desktop/Test/*:}: No such file or directory
File_compare: line 8: [: ==: unary operator expected

I've tried modifying working code I've found online, but it always seems to spit out some error like this. I'm guessing it's a problem with the nested for loop?


Also, why does it seem there are different ways to call variables? I've seen examples that use ${file}, "$file", and "${file}".


Answered by Jonathan Leffler

You have the {} in the wrong places:


if [ $( diff {$file} {$filex} ) == 0 ]

They should be at:


if [ $( diff ${file} ${filex} ) == 0 ]

(though the braces are optional now), but you should allow for spaces in the file names:


if [ $( diff "${file}" "${filex}" ) == 0 ]

Now it simply doesn't work properly because when diff finds no differences, it generates no output (and you get errors because the == operator doesn't expect nothing on its left side). You could sort of fix it by double quoting the value from $(…) (if [ "$( diff … )" == "" ]), but you should simply and directly test the exit status of diff:


if diff "${file}" "${filex}"
then : no difference
else : there is a difference
fi

and maybe for comparing images you should be using cmp (in silent mode) rather than diff:


if cmp -s "$file" "$filex"
then : no difference
else : there is a difference
fi
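
To see why comparing the output of diff against 0 fails while testing its exit status works, here is a minimal sketch. The scratch directory and the names a, b, output, and verdict are made up for illustration; they are not part of the original script.

```bash
#!/bin/bash
# Scratch directory (an assumption for the demo) so no real files are touched.
tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/a"
printf 'hello\n' > "$tmp/b"

# For identical files diff prints nothing, so $( ... ) expands to an empty
# string and an unquoted [ $( ... ) == 0 ] is left with no left-hand operand.
output=$(diff "$tmp/a" "$tmp/b")
echo "diff output: '${output}'"

# The exit status carries the answer, so use the command itself as the test.
if diff "$tmp/a" "$tmp/b" >/dev/null; then
    verdict="no difference"
else
    verdict="there is a difference"
fi
echo "$verdict"

rm -rf "$tmp"
```

The redirect to /dev/null only hides the listing that diff would print for differing files; cmp -s needs no redirect because it is already silent.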

Answered by Gordon Davisson

In addition to the problems Jonathan Leffler pointed out:


directory="~/Desktop/Test/*"
for file in ${directory};

~ and * won't get expanded inside double-quotes; the * will get expanded when you use the variable without quotes, but since the ~ won't, it's looking for files under a directory actually named "~" (not your home directory), so it won't find any matches. Also, as Jonathan pointed out, using variables (like ${directory}) without double-quotes will run you into trouble with filenames that contain spaces or some other metacharacters. The better way to do this is to not put the wildcard in the variable, but to use it when you reference the variable, with the variable in double-quotes and the * outside them:


directory=~/"Desktop/Test"
for file in "${directory}"/*;
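
A small sketch of these expansion rules follows. count_args and the file names are hypothetical, and a temporary directory stands in for ~/Desktop/Test so the demo does not depend on your own folders.

```bash
#!/bin/bash
# count_args is a made-up helper: it prints how many arguments it received.
count_args() { echo "$#"; }

quoted_dir="~/Desktop/Test"     # tilde inside quotes stays a literal "~"
expanded_dir=~/"Desktop/Test"   # tilde outside quotes becomes the home directory
echo "quoted:   $quoted_dir"
echo "expanded: $expanded_dir"

# A scratch directory stands in for the photo folder (assumption for the demo).
tmp=$(mktemp -d)
touch "$tmp/old photo.jpg" "$tmp/scan.jpg"

# Variable in double quotes, * outside them: one loop pass per file,
# even when a name contains a space.
passes=0
for file in "$tmp"/*; do
    passes=$((passes + 1))
done
echo "loop passes: $passes"

# Unquoted, a name with a space is split into two words; quoted, it stays one.
name="old photo.jpg"
unquoted_count=$(count_args $name)
quoted_count=$(count_args "$name")
echo "unquoted: $unquoted_count argument(s), quoted: $quoted_count"

rm -rf "$tmp"
```

As for the spellings in the question, "$file" and "${file}" behave the same. The braces only matter when the variable is directly followed by characters that could be part of a name, as in "${file}_old".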

Oh, and another note: when using mv in a script it's a good idea to use mv -i to avoid accidentally overwriting another file with the same name.
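
mv -i asks for confirmation on the terminal, which suits a script you run by hand. For unattended runs, one alternative (an assumption here, not something the answer proposes) is to check the target before moving. safe_move is a hypothetical helper, sketched under that assumption:

```bash
#!/bin/bash
# safe_move SRC DEST_DIR (hypothetical helper): move SRC into DEST_DIR,
# but refuse if a file with the same name is already there.
safe_move() {
    local src="$1" dest_dir="$2" target
    target="$dest_dir/$(basename "$src")"
    if [ -e "$target" ]; then
        echo "skipping $src: $target already exists" >&2
        return 1
    fi
    mv "$src" "$dest_dir"/
}

# Demo in a scratch directory: the destination already holds photo.jpg.
tmp=$(mktemp -d)
mkdir "$tmp/src" "$tmp/dest"
echo "new" > "$tmp/src/photo.jpg"
echo "old" > "$tmp/dest/photo.jpg"

if safe_move "$tmp/src/photo.jpg" "$tmp/dest"; then
    outcome="moved"
else
    outcome="kept"
fi
kept_content=$(cat "$tmp/dest/photo.jpg")
echo "$outcome, destination still contains: $kept_content"

rm -rf "$tmp"
```

The check and the move are two separate steps, so this is only suitable when nothing else writes to the destination at the same time.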


And: use shellcheck.net to sanity-check your code and point out common mistakes.


Answered by codeforester

If you are simply interested in knowing if two files differ, cmp is the best option. Its advantages are:


  1. It works for text as well as binary files, unlike diff, which is designed for text files

  2. It stops after finding the first difference, and hence it is very efficient
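
As a quick check of the first point, the sketch below runs cmp -s on a few raw bytes, including a NUL byte, standing in for image data. The file names and byte values are invented for the demo.

```bash
#!/bin/bash
# Scratch directory with three tiny binary files (made-up contents).
tmp=$(mktemp -d)
printf '\377\330\000\001' > "$tmp/a.bin"
printf '\377\330\000\001' > "$tmp/copy.bin"    # byte-for-byte copy of a.bin
printf '\377\330\000\002' > "$tmp/other.bin"   # last byte differs

# cmp -s prints nothing; only its exit status says whether the files match.
if cmp -s "$tmp/a.bin" "$tmp/copy.bin"; then same_copy=yes; else same_copy=no; fi
if cmp -s "$tmp/a.bin" "$tmp/other.bin"; then same_other=yes; else same_other=no; fi

echo "copy.bin identical to a.bin:  $same_copy"
echo "other.bin identical to a.bin: $same_other"

rm -rf "$tmp"
```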


So, your code could be written as:


if cmp -s "$file" "$filex"; then
  # files are identical, so $filex is a duplicate...
  mv "$filex" ~/Desktop

  # any other logic here
fi

Hope this helps. I didn't understand what you are trying to do with your loops and hence didn't write the full code.
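
For reference, one way the loops could be filled in is sketched below. It is an assumption about what the question intends (keep the first file of each set of identical files and move the later copies away), and move_duplicates, its arguments, and the demo file names are all hypothetical.

```bash
#!/bin/bash
# move_duplicates SRC_DIR DEST_DIR (hypothetical helper): keep the first file
# of each set of identical files in SRC_DIR and move later copies to DEST_DIR.
move_duplicates() {
    local src_dir="$1" dest_dir="$2" file filex
    mkdir -p "$dest_dir"
    for file in "$src_dir"/*; do
        [ -f "$file" ] || continue              # skip directories and files already moved
        for filex in "$src_dir"/*; do
            [ -f "$filex" ] || continue
            [ "$file" = "$filex" ] && continue  # never compare a file with itself
            if cmp -s "$file" "$filex"; then
                echo "$filex duplicates $file"
                mv -i "$filex" "$dest_dir"/
            fi
        done
    done
}

# Demo in a scratch directory: "b copy.jpg" has the same content as a.jpg.
tmp=$(mktemp -d)
printf 'A\n' > "$tmp/a.jpg"
printf 'A\n' > "$tmp/b copy.jpg"
printf 'B\n' > "$tmp/c.jpg"

move_duplicates "$tmp" "$tmp/dupes"
moved_files=$(ls "$tmp/dupes")
left_files=$(ls "$tmp")
echo "moved: $moved_files"

rm -rf "$tmp"
```

Every file is compared with every other file, so the run time grows with the square of the number of files. That is usually acceptable for a single folder of photos.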


Answered by Bertrand Martel

You can use diff "$file" "$filex" &>/dev/null and get the last command result with $?:


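#!/bin/bash

SEARCH_DIR="."
DEST_DIR="./result"

mkdir -p "$DEST_DIR"

# Compare every regular file in SEARCH_DIR against every other one.
# Globs are used instead of parsing ls, so names with spaces are safe.
for file in "$SEARCH_DIR"/*
do
    for filex in "$SEARCH_DIR"/*
    do
        # Skip directories, files already moved away, and self-comparison.
        if [ -f "$filex" ] && [ -f "$file" ] && [ "$filex" != "$file" ]
        then

            diff "$file" "$filex" &>/dev/null

            # diff exits with 0 when the files are identical.
            if [ "$?" == 0 ]
            then
                echo "$filex is a duplicate. Moving to $DEST_DIR"
                mv "$filex" "$DEST_DIR"
            fi
        fi
    done
done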

Note that you can also use the fslint or fdupes utilities to find duplicates.
