Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/36920307/
md5 all files in a directory tree
Asked by Bleakley
I have a directory with a structure like so:
.
├── Test.txt
├── Test1
│   ├── Test1.txt
│   ├── Test1_copy.txt
│   └── Test1a
│       ├── Test1a.txt
│       └── Test1a_copy.txt
└── Test2
    ├── Test2.txt
    ├── Test2_copy.txt
    └── Test2a
        ├── Test2a.txt
        └── Test2a_copy.txt
I would like to create a bash script that makes an MD5 checksum of every file in this directory. I want to be able to type the script name in the CLI, followed by the path to the directory I want to hash, and have it work. I'm sure there are many ways to accomplish this. Currently I have:
#!/bin/bash
for file in "$1" ; do
    md5 >> "${1}__checksums.md5"
done
This just hangs and does not work. Perhaps I should use find?
One caveat - the directories I want to hash will have files with different extensions and may not always have this exact same tree structure. I want something that will work in these different situations, as well.
Answered by TeWu
Using md5deep
md5deep -r path/to/dir > sums.md5
Using find and md5sum
find relative/path/to/dir -type f -exec md5sum {} + > sums.md5
Be aware that when you check your MD5 sums with md5sum -c sums.md5, you need to run it from the same directory from which you generated the sums.md5 file. This is because find outputs paths relative to your current location, and those paths are what end up in sums.md5.
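To illustrate the point about relative paths, here is a minimal sketch, assuming GNU find and md5sum are installed. The temporary directory and the names demo, a.txt and sub/b.txt are made up for the example.

```shell
#!/bin/bash
# Build a small throwaway tree (all names here are hypothetical).
workdir=$(mktemp -d)
mkdir -p "$workdir/demo/sub"
echo "hello" > "$workdir/demo/a.txt"
echo "world" > "$workdir/demo/sub/b.txt"

# Generate the sums from $workdir, so the recorded paths look like demo/a.txt.
cd "$workdir" || exit 1
find demo -type f -exec md5sum {} + > sums.md5

# Checking from the same directory resolves the relative paths.
md5sum -c sums.md5 && same_dir_status=ok

# Checking from another directory cannot find demo/a.txt, so the check fails.
(cd / && md5sum -c "$workdir/sums.md5" > /dev/null 2>&1) || other_dir_status=failed

echo "same directory: $same_dir_status, other directory: $other_dir_status"
```

The second check fails only because the paths stored in sums.md5 no longer resolve, not because any file changed.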
If this is a problem, you can make relative/path/to/dir absolute (e.g. by putting $PWD/ in front of your path). This way you can check sums.md5 from any location. The disadvantage is that sums.md5 now contains absolute paths, which makes it bigger.
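The absolute-path variant can be sketched the same way, again assuming GNU find and md5sum. The temporary directory and the names demo and a.txt are hypothetical.

```shell
#!/bin/bash
# Throwaway tree with a single file (names are hypothetical).
workdir=$(mktemp -d)
mkdir -p "$workdir/demo"
echo "hello" > "$workdir/demo/a.txt"

cd "$workdir" || exit 1
# Prefixing $PWD/ makes find print absolute paths into sums.md5.
find "$PWD/demo" -type f -exec md5sum {} + > sums.md5

# The check now works from any directory, here from /.
(cd / && md5sum -c "$workdir/sums.md5") && abs_status=ok
```

The trade-off is that the sums file stops being portable: moving the tree to another location invalidates every recorded path.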
Fully featured function using find and md5sum
You can put this function in your .bashrc file (located in your $HOME directory):
function md5sums {
  if [ "$#" -lt 1 ]; then
    echo -e "At least one parameter is expected\n" \
            "Usage: md5sums [OPTIONS] dir"
  else
    local OUTPUT="checksums.md5"
    local CHECK=false
    local MD5SUM_OPTIONS=""

    # Every argument except the last one is treated as an option
    while [[ $# -gt 1 ]]; do
      local key="$1"
      case $key in
        -c|--check)
          CHECK=true
          ;;
        -o|--output)
          OUTPUT="$2"
          shift
          ;;
        *)
          MD5SUM_OPTIONS="$MD5SUM_OPTIONS $1"
          ;;
      esac
      shift
    done
    local DIR="$1"

    if [ -d "$DIR" ]; then  # if $DIR directory exists
      cd "$DIR"  # change to $DIR directory
      if [ "$CHECK" = true ]; then  # if -c or --check option specified
        md5sum --check $MD5SUM_OPTIONS "$OUTPUT"  # check MD5 sums in $OUTPUT file
      else
        # Calculate MD5 sums for files in current directory and subdirectories,
        # excluding the $OUTPUT file, and save the result in $OUTPUT
        find . -type f ! -name "$OUTPUT" -exec md5sum $MD5SUM_OPTIONS {} + > "$OUTPUT"
      fi
      cd - > /dev/null  # change to previous directory
    else
      cd "$DIR"  # if $DIR doesn't exist, change to it to generate localized error message
    fi
  fi
}
After you run source ~/.bashrc, you can use md5sums like a normal command:
md5sums path/to/dir
will generate a checksums.md5 file in the path/to/dir directory, containing MD5 sums of all files in this directory and its subdirectories. Use:
md5sums -c path/to/dir
to check sums from the path/to/dir/checksums.md5 file.
Note that path/to/dir can be relative or absolute; md5sums will work fine either way. The resulting checksums.md5 file always contains paths relative to path/to/dir.
You can use a file name other than the default checksums.md5 by supplying the -o or --output option. All options other than -c, --check, -o and --output are passed to md5sum.
The first half of the md5sums function definition is responsible for parsing options. See this answer for more information about it. The second half contains explanatory comments.
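The option-parsing half can be looked at in isolation. The following is a minimal sketch of the same loop; parse_demo is a hypothetical helper that only prints what it parsed and is not part of the answer's function.

```shell
#!/bin/bash
# parse_demo is a hypothetical helper used only to show the parsing loop.
parse_demo() {
    local output="checksums.md5"
    local check=false
    local extra=""
    # Every argument except the last is an option; the last one is the directory.
    while [ "$#" -gt 1 ]; do
        case "$1" in
            -c|--check)  check=true ;;
            -o|--output) output="$2"; shift ;;  # consume the option's value too
            *)           extra="$extra $1" ;;    # unknown options pass through
        esac
        shift
    done
    echo "check=$check output=$output extra=${extra# } dir=$1"
}

parse_demo -o sums.txt --quiet some/dir
# prints: check=false output=sums.txt extra=--quiet dir=some/dir
```

Because the loop stops when one argument is left, the directory must always come last on the command line.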
Answered by taskalman
How about:
find /path/you/need -type f -exec md5sum {} \; > checksums.md5
Update #1: Improved the command based on @twalberg's recommendation to handle white space in file names.
Update #2: Improved based on @jil's suggestion, removing the unnecessary xargs call and using the -exec option of find instead.
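The two updates can be compared side by side. This is a rough sketch, assuming GNU find, xargs and md5sum; the temporary directory and file names are made up for the example.

```shell
#!/bin/bash
# Throwaway directory containing one file name with spaces (names are hypothetical).
workdir=$(mktemp -d)
echo "data" > "$workdir/file with spaces.txt"
echo "data" > "$workdir/plain.txt"

# -exec hands each path to md5sum as a single argument, so spaces are safe.
find "$workdir" -type f -exec md5sum {} \; > "$workdir.exec.md5"

# The xargs form needs -print0 and -0 to get the same safety.
find "$workdir" -type f -print0 | xargs -0 md5sum > "$workdir.xargs.md5"
```

Both result files list the same two checksums. The output files are written next to the temporary directory rather than inside it, so they are not picked up by find.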
Update #3: @Blake, a naive implementation of your script would look something like this:
#!/bin/bash
# Usage: checksumchecker.sh <path>
find "$1" -type f -exec md5sum {} \; > "$1"__checksums.md5
Answered by jil
#!/bin/bash
shopt -s globstar
md5sum "$1"/** > "${1}__checksums.md5"
Explanation: shopt -s globstar (manual) enables the ** recursive glob wildcard. This means that "$1"/** will expand to the list of all files recursively under the directory given as parameter $1. The script then simply calls md5sum with this file list as its arguments, and > "${1}__checksums.md5" redirects the output to the file.
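One detail worth knowing is that ** matches directories as well as files, so md5sum reports an error for each directory in the expansion. The sketch below filters for regular files first. It assumes bash 4 or later (globstar is not available in older versions), and sum_tree and the demo file names are hypothetical. Hidden files are skipped unless dotglob is also enabled.

```shell
#!/bin/bash
shopt -s globstar nullglob

# sum_tree is a hypothetical name for this filtered variant.
sum_tree() {
    local dir=$1 path
    for path in "$dir"/**; do
        # ** also matches directories, so keep only regular files.
        if [[ -f $path ]]; then
            md5sum "$path"
        fi
    done
}

# Small demonstration on a throwaway tree.
workdir=$(mktemp -d)
mkdir -p "$workdir/a/b"
echo one > "$workdir/top.txt"
echo two > "$workdir/a/b/deep.txt"
sum_tree "$workdir" > "$workdir.md5"
```

This starts one md5sum process per file, which is slower than a single call on large trees but avoids the directory errors and any argument-length limit.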
Answered by Mark Setchell
Updated Answer
If you like the answer below, or any of the others, you can make a function that does the command for you. So, to test it, type the following into Terminal to declare a function:
function sumthem(){ find "$1" -type f -print0 | parallel -0 -X md5 > checksums.md5; }
Then you can just use:
sumthem /Users/somebody/somewhere
If that works how you like, you can add that line to the end of your "bash profile" and the function will be declared and available whenever you are logged in. Your "bash profile" is probably in $HOME/.profile
Original Answer
Why not get all your CPU cores working in parallel for you?
find . -type f -print0 | parallel -0 -X md5sum
This finds all the files (-type f) in the current directory (.) and prints them with a null byte at the end. These are then passed into GNU Parallel, which is told that the filenames end with a null byte (-0), that it should process as many files as possible at a time (-X) to avoid creating a new process for each file, and that it should md5sum the files.
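If GNU Parallel is not installed, a similar effect can be approximated with the -P option of xargs. Note that this swaps in a different tool from the one the answer uses: -P is an extension found in GNU and BSD xargs, not part of POSIX. The process count of 4, the batch size of 2 and the file names are arbitrary choices for this sketch.

```shell
#!/bin/bash
# Throwaway directory with a few small files (names are hypothetical).
workdir=$(mktemp -d)
for i in 1 2 3 4 5 6; do
    echo "file $i" > "$workdir/f$i.txt"
done

# -P 4 runs up to four md5sum processes at once; -n 2 gives each one two files.
find "$workdir" -type f -print0 | xargs -0 -n 2 -P 4 md5sum > "$workdir.md5"
```

The order of lines in the result is not deterministic, so sort the file if you need a stable listing.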
This approach will pay the largest bonus, in terms of speed, with big files such as Photoshop images.
Answered by Alex Jurado - Bitendian
md5deep -r "$your_directory" | awk '{print $1}' | sort | md5sum | awk '{print $1}'