Notice: this page is a Chinese-English parallel translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/11165182/

Date: 2020-09-09 22:18:51  Source: igfitidea

Difference between two lists using Bash

Tags: bash, sorting, sed, awk, grep

Asked by exvance

OK, I have two related lists in text files on my Linux box:

 /tmp/oldList
 /tmp/newList

I need to compare these lists to see what lines got added and what lines got removed. I then need to loop over these lines and perform actions on them based on whether they were added or removed.

How do I do this in bash?

Answered by camh

Use the comm(1) command to compare the two files. Both inputs need to be sorted; you can sort them beforehand if they are large, or sort them inline with bash process substitution.

comm takes any combination of the flags -1, -2 and -3, each of which suppresses one output column: lines unique to file 1, lines unique to file 2, or lines common to both.
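
As a rough illustration of the column flags, the sketch below runs comm on two small sample lists and suppresses columns 1 and 2, so only the lines common to both remain. The temporary directory and the sample data are assumptions made for this example, not part of the original question.

```shell
#!/usr/bin/env bash
# Illustrative sample data in a throwaway directory (assumption for the demo).
workdir=$(mktemp -d)
printf '%s\n' cherry apple banana > "$workdir/oldList"
printf '%s\n' date banana cherry  > "$workdir/newList"

# -1 suppresses lines unique to the first file, -2 lines unique to the second,
# so only the lines present in both files are printed.
common=$(comm -12 <(sort "$workdir/oldList") <(sort "$workdir/newList"))
printf '%s\n' "$common"

rm -rf "$workdir"
```

Swapping -12 for -23 or -13 gives the lines unique to the first or second file instead.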

To get the lines only in the old file:

comm -23 <(sort /tmp/oldList) <(sort /tmp/newList)

To get the lines only in the new file:

comm -13 <(sort /tmp/oldList) <(sort /tmp/newList)

You can feed that into a while read loop to process each line:

while IFS= read -r old; do
    # ...do stuff with "$old"
    echo "removed: $old"
done < <(comm -23 <(sort /tmp/oldList) <(sort /tmp/newList))

The same pattern works for the added lines, using comm -13.
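
Putting both halves together, a minimal sketch might look like the following. The sample lists, the temporary directory, and the handle_removed / handle_added function names are hypothetical stand-ins; the real lists would be /tmp/oldList and /tmp/newList and the handlers would perform the actual actions.

```shell
#!/usr/bin/env bash
# Illustrative sample data (assumption for the demo).
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/oldList"
printf '%s\n' banana cherry date  > "$workdir/newList"

removed=()
added=()

# Hypothetical handlers: replace the bodies with the real per-line action.
handle_removed() { removed+=("$1"); }
handle_added()   { added+=("$1"); }

# Lines only in the old list (columns 2 and 3 suppressed).
while IFS= read -r line; do
    handle_removed "$line"
done < <(comm -23 <(sort "$workdir/oldList") <(sort "$workdir/newList"))

# Lines only in the new list (columns 1 and 3 suppressed).
while IFS= read -r line; do
    handle_added "$line"
done < <(comm -13 <(sort "$workdir/oldList") <(sort "$workdir/newList"))

printf 'removed: %s\n' "${removed[@]}"
printf 'added: %s\n' "${added[@]}"

rm -rf "$workdir"
```

Reading with IFS= and -r keeps leading whitespace and backslashes in each line intact.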

Answered by Levon

The diff command will do the comparing for you.

e.g.,

$ diff /tmp/oldList /tmp/newList

See the diff man page for more information. This should take care of the first part of your problem.
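
For the second part (acting on the lines), one possible sketch is to split diff's default output by its line prefixes: lines only in the first file start with "< " and lines only in the second start with "> ". The sample data and variable names below are assumptions for illustration. Note that diff compares positionally, so on unsorted lists a line that merely moved can show up as both removed and added.

```shell
#!/usr/bin/env bash
# Illustrative sample data (assumption for the demo).
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/oldList"
printf '%s\n' banana cherry date  > "$workdir/newList"

# diff exits with status 1 when the files differ, so that is not an error here.
diff_output=$(diff "$workdir/oldList" "$workdir/newList" || true)

# Strip the "< " / "> " prefixes to recover the original lines.
removed=$(printf '%s\n' "$diff_output" | sed -n 's/^< //p')
added=$(printf '%s\n' "$diff_output" | sed -n 's/^> //p')

printf 'removed: %s\n' "$removed"
printf 'added: %s\n' "$added"

rm -rf "$workdir"
```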

Answered by Nowaker

Consider using Ruby if your scripts need readability.

To get the lines only in the old file:

ruby -e "puts File.readlines('/tmp/oldList') - File.readlines('/tmp/newList')"

To get the lines only in the new file:

ruby -e "puts File.readlines('/tmp/newList') - File.readlines('/tmp/oldList')"

You can feed that into a while read loop to process each line:

while IFS= read -r old; do
  # ...do stuff with "$old"
  echo "removed: $old"
done < <(ruby -e "puts File.readlines('/tmp/oldList') - File.readlines('/tmp/newList')")

Answered by Costa Tsaousis

This is old, but for completeness we should say that if you have a really large set, the fastest solution would be to use diff to generate a script and then source it, like this:

#!/bin/bash

line_added() {
   # code to be run for all lines added
   # $* is the line
   :  # no-op placeholder; bash rejects a function body with only comments
}

line_removed() {
   # code to be run for all lines removed
   # $* is the line
   :
}

line_same() {
   # code to be run for all lines that are the same
   # $* is the line
   :
}

sort /tmp/oldList >/tmp/oldList.sorted
sort /tmp/newList >/tmp/newList.sorted

diff >/tmp/diff_script.sh \
    --new-line-format="line_added %L" \
    --old-line-format="line_removed %L" \
    --unchanged-line-format="line_same %L" \
    /tmp/oldList.sorted /tmp/newList.sorted

source /tmp/diff_script.sh

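Changed lines will appear as one removed line and one added line. If you don't like this, you can use --changed-group-format; check the diff manual page. Because the generated script is executed by the shell, this approach is only safe when the lines contain no characters the shell would interpret.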
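
If the lines may contain quotes, spaces, or other shell metacharacters, one alternative sketch (a variation, not the answer's own method) is to have diff tag each line with a one-character prefix and dispatch on that prefix in a while read loop, rather than sourcing generated code. It assumes GNU diff, which provides the same line-format options used above; the sample data and array names are illustrative.

```shell
#!/usr/bin/env bash
# Illustrative sample data (assumption for the demo).
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry | sort > "$workdir/old.sorted"
printf '%s\n' banana cherry date  | sort > "$workdir/new.sorted"

added=()
removed=()
same=()

# Each output line is "<tag><original line>"; the tag is +, - or =.
while IFS= read -r tagged; do
    line=${tagged:1}
    case ${tagged:0:1} in
        +) added+=("$line") ;;
        -) removed+=("$line") ;;
        =) same+=("$line") ;;
    esac
done < <(diff \
    --new-line-format='+%L' \
    --old-line-format='-%L' \
    --unchanged-line-format='=%L' \
    "$workdir/old.sorted" "$workdir/new.sorted")

printf 'added: %s\n' "${added[@]}"
printf 'removed: %s\n' "${removed[@]}"
printf 'same: %s\n' "${same[@]}"

rm -rf "$workdir"
```

The line contents are only ever treated as data here, never evaluated as shell code.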

Answered by Nathan

I typically use:

diff /tmp/oldList /tmp/newList | grep -v "Common subdirectories"

The grep -v option inverts the match:

-v, --invert-match Selected lines are those not matching any of the specified patterns.

So in this case it takes the diff results and omits any lines containing "Common subdirectories", which diff prints when it compares directories.
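
When the inputs are plain files, the same grep -v idea can be used to drop diff's hunk headers and keep only the changed lines. The sketch below is an illustration under assumed sample data; the pattern is this example's own choice and targets the default diff output format, where headers start with a digit and "---" separates the two sides of a changed hunk.

```shell
#!/usr/bin/env bash
# Illustrative sample data (assumption for the demo).
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/oldList"
printf '%s\n' banana cherry date  > "$workdir/newList"

# Drop hunk headers such as "1d0" and the "---" separator lines;
# what remains are the "< " (removed) and "> " (added) lines.
changes=$(diff "$workdir/oldList" "$workdir/newList" | grep -Ev '^([0-9]|---$)')
printf '%s\n' "$changes"

rm -rf "$workdir"
```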

Answered by ssedano

Have you tried diff?

$ diff /tmp/oldList /tmp/newList

$ man diff