Difference between two lists using Bash
Original question: http://stackoverflow.com/questions/11165182/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by exvance
OK, I have two related lists in text files on my Linux box:
/tmp/oldList
/tmp/newList
I need to compare these lists to see what lines got added and what lines got removed. I then need to loop over these lines and perform actions on them based on whether they were added or removed.
How do I do this in bash?
Answer by camh
Use the comm(1) command to compare the two files. They both need to be sorted, which you can do beforehand if they are large, or you can do it inline with bash process substitution.
comm prints three columns: lines unique to file 1, lines unique to file 2, and lines common to both. It accepts any combination of the flags -1, -2 and -3, each of which suppresses the corresponding column.
To get the lines only in the old file:
comm -23 <(sort /tmp/oldList) <(sort /tmp/newList)
To get the lines only in the new file:
comm -13 <(sort /tmp/oldList) <(sort /tmp/newList)
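As a quick check of the two commands above, here is a minimal sketch run against made-up data. The sample contents, the temporary directory, and the variable names removed and added are assumptions for illustration only; the real lists would be /tmp/oldList and /tmp/newList.

```shell
#!/usr/bin/env bash
# Hypothetical sample data written to a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' cherry apple banana > "$workdir/oldList"
printf '%s\n' banana date cherry > "$workdir/newList"

# -23 suppresses columns 2 and 3, leaving lines unique to the first (old) file.
removed=$(comm -23 <(sort "$workdir/oldList") <(sort "$workdir/newList"))
# -13 suppresses columns 1 and 3, leaving lines unique to the second (new) file.
added=$(comm -13 <(sort "$workdir/oldList") <(sort "$workdir/newList"))

echo "removed: $removed"
echo "added: $added"
rm -rf "$workdir"
```

With this sample data, apple is the only line unique to the old list and date the only line unique to the new one.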
You can feed that into a while read loop to process each line:
while IFS= read -r old; do
    # ...do stuff with "$old"
    :   # no-op placeholder so the loop body is valid bash
done < <(comm -23 <(sort /tmp/oldList) <(sort /tmp/newList))
The same pattern works for the lines only in the new file: use comm -13 instead of comm -23.
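Putting both loops together, one way to act on removed and added lines might look like the sketch below. The sample data, the pre-sorted file names, and the log variable are hypothetical; the string appends stand in for whatever action you actually need.

```shell
#!/usr/bin/env bash
# Hypothetical sample data; sort once up front so comm can read the files directly.
workdir=$(mktemp -d)
printf '%s\n' cherry apple banana | sort > "$workdir/old.sorted"
printf '%s\n' banana date cherry | sort > "$workdir/new.sorted"

log=""

# Lines only in the old list were removed.
while IFS= read -r line; do
    log+="removed:$line;"      # replace with the real "removed" action
done < <(comm -23 "$workdir/old.sorted" "$workdir/new.sorted")

# Lines only in the new list were added.
while IFS= read -r line; do
    log+="added:$line;"        # replace with the real "added" action
done < <(comm -13 "$workdir/old.sorted" "$workdir/new.sorted")

echo "$log"
rm -rf "$workdir"
```

Redirecting from a process substitution (rather than piping comm into the loop) keeps the loop in the current shell, so variables such as log are still set after the loop ends.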
Answer by Levon
The diff command will do the comparing for you, e.g.:
$ diff /tmp/oldList /tmp/newList
See the diff man page for more information. This should take care of the first part of your problem.
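To turn plain diff output into separate lists of removed and added lines, one possible approach is to filter on the "< " and "> " prefixes of diff's default output format. This is only a sketch: the sample data and variable names are assumptions, and the result is only meaningful as a set difference if both files are in the same order (for example, sorted), since a line that merely moved is reported as both removed and added.

```shell
#!/usr/bin/env bash
# Hypothetical sample data, already in sorted order.
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/oldList"
printf '%s\n' banana cherry date > "$workdir/newList"

# diff exits with status 1 when the files differ, so do not treat that as an error.
diff_out=$(diff "$workdir/oldList" "$workdir/newList" || true)

# In the default format, "< " marks lines from the first file, "> " lines from the second.
removed=$(printf '%s\n' "$diff_out" | sed -n 's/^< //p')
added=$(printf '%s\n' "$diff_out" | sed -n 's/^> //p')

echo "removed: $removed"
echo "added: $added"
rm -rf "$workdir"
```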
Answer by Nowaker
Consider using Ruby if your scripts need readability.
To get the lines only in the old file:
ruby -e "puts File.readlines('/tmp/oldList') - File.readlines('/tmp/newList')"
To get the lines only in the new file:
ruby -e "puts File.readlines('/tmp/newList') - File.readlines('/tmp/oldList')"
You can feed that into a while read loop to process each line:
while IFS= read -r old; do
    # ...do stuff with "$old"
    :   # no-op placeholder so the loop body is valid bash
done < <(ruby -e "puts File.readlines('/tmp/oldList') - File.readlines('/tmp/newList')")
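If Ruby is not available, a rough shell counterpart of the array subtraction above is grep with a pattern file. This is a swapped-in technique, not part of the answer: -F treats each pattern as a fixed string, -x requires a whole-line match, -f reads the patterns from a file, and -v keeps the lines that match none of them. Like the Ruby version, it needs no sorting and preserves the order of the first file. The sample data and variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical, deliberately unsorted sample data.
workdir=$(mktemp -d)
printf '%s\n' cherry apple banana > "$workdir/oldList"
printf '%s\n' banana date cherry > "$workdir/newList"

# Lines of oldList that do not appear as whole lines in newList.
# grep exits with status 1 when it selects nothing, hence "|| true".
removed=$(grep -Fxv -f "$workdir/newList" "$workdir/oldList" || true)
# Lines of newList that do not appear as whole lines in oldList.
added=$(grep -Fxv -f "$workdir/oldList" "$workdir/newList" || true)

echo "removed: $removed"
echo "added: $added"
rm -rf "$workdir"
```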
Answer by Costa Tsaousis
This is old, but for completeness we should say that if you have a really large set, the fastest solution would be to use diff to generate a script and then source it, like this:
#!/bin/bash
line_added() {
    # code to be run for all lines added
    # $* is the line
    :   # no-op so the function body is valid bash
}
line_removed() {
    # code to be run for all lines removed
    # $* is the line
    :
}
line_same() {
    # code to be run for all lines that are the same
    # $* is the line
    :
}
sort /tmp/oldList >/tmp/oldList.sorted
sort /tmp/newList >/tmp/newList.sorted
# The --*-line-format options are GNU diff extensions; %L is the line including its newline.
diff >/tmp/diff_script.sh \
    --new-line-format="line_added %L" \
    --old-line-format="line_removed %L" \
    --unchanged-line-format="line_same %L" \
    /tmp/oldList.sorted /tmp/newList.sorted
# Caution: every list line is parsed as shell code here, so only use this on trusted input.
source /tmp/diff_script.sh
Changed lines will appear as one removed line plus one added line. If you don't like this, you can use --changed-group-format. Check the diff manual page.
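A variation on the same idea, sketched here under the assumption that GNU diff is available, is to have diff tag each line with a one-character prefix and dispatch on that prefix in a read loop instead of sourcing a generated script. The list lines are then treated as data rather than executed. The "+" and "-" prefixes, the sample data, and the log variable are illustrative choices, not part of the original answer.

```shell
#!/usr/bin/env bash
# Hypothetical sample data, sorted up front.
workdir=$(mktemp -d)
printf '%s\n' cherry apple banana | sort > "$workdir/old.sorted"
printf '%s\n' banana date cherry | sort > "$workdir/new.sorted"

log=""
while IFS= read -r entry; do
    case $entry in
        +*) log+="added:${entry#+};" ;;     # strip the leading "+" tag
        -*) log+="removed:${entry#-};" ;;   # strip the leading "-" tag
    esac
done < <(diff \
    --new-line-format='+%L' \
    --old-line-format='-%L' \
    --unchanged-line-format='' \
    "$workdir/old.sorted" "$workdir/new.sorted")

echo "$log"
rm -rf "$workdir"
```

Setting --unchanged-line-format to the empty string drops common lines from the output entirely, so the loop only sees additions and removals.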
Answer by Nathan
I typically use:
diff /tmp/oldList /tmp/newList | grep -v "Common subdirectories"
The grep -v option inverts the match:
-v, --invert-match Selected lines are those not matching any of the specified patterns.
So in this case it takes the diff results and omits any line containing "Common subdirectories", a message diff prints when it is comparing directories.
Answer by ssedano
Have you tried diff?
$ diff /tmp/oldList /tmp/newList
$ man diff