bash: Compare 2 Unix Files and Output Matching Lines to a New File?
Source: http://stackoverflow.com/questions/8722103/
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by rreeves
I have 2 *nix files. All of the data is on one single line in each file. Each value is separated by a null character. Some of the values in the data match.
How would I parse this data into a new file listing only the matching values?
I figure I could use sed to change the null characters into newlines? From there on I'm not really sure...
Any ideas?
Answered by holygeek
Use tr, sort and comm:
Convert nulls into new lines, and sort the result:
$ tr '\000' '\n' < file1 | sort > file1.txt
$ tr '\000' '\n' < file2 | sort > file2.txt
Then use comm to get the lines that are common to both files:
$ comm -1 -2 file1.txt file2.txt
<lines shown here are the common lines between file1.txt and file2.txt>
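The tr, sort and comm steps can be tried end to end on throwaway data. This is only a minimal sketch, not part of the original answer: the helper name common_values, the file names a.bin and b.bin, and the sample values are all made up for illustration.

```shell
# Hypothetical helper: print the values present in both NUL-separated files.
common_values() {
    local tmp1 tmp2
    tmp1=$(mktemp) && tmp2=$(mktemp) || return 1
    # Turn NUL separators into newlines and sort, since comm expects sorted input.
    tr '\000' '\n' < "$1" | sort > "$tmp1"
    tr '\000' '\n' < "$2" | sort > "$tmp2"
    # -1 and -2 suppress the lines unique to either file, leaving the common ones.
    comm -1 -2 "$tmp1" "$tmp2"
    rm -f "$tmp1" "$tmp2"
}

# Made-up sample data: values separated by NUL characters.
workdir=$(mktemp -d)
printf 'apple\0banana\0cherry\0' > "$workdir/a.bin"
printf 'cherry\0date\0apple\0' > "$workdir/b.bin"

# Write the matching values to a new file, then show them.
common_values "$workdir/a.bin" "$workdir/b.bin" > "$workdir/matches.txt"
cat "$workdir/matches.txt"
```

With this sample data, matches.txt should contain apple and cherry, one per line.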
Answered by Barton Chittenden
If there are no duplicate values within file1 or file2, you can do this:
( tr '\0' '\n' < file1; tr '\0' '\n' < file2 ) | sort | uniq -c | egrep -v '^ +1'
This will count all of the duplicate values between the two files.
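A small made-up run shows what the counting approach prints. The assumptions in this sketch are mine, not the answer's: the helper name count_shared and the sample data are hypothetical, grep -E stands in for egrep (which recent GNU grep releases flag as obsolescent), and the pattern has a trailing space so that counts such as 10 are not filtered out along with 1.

```shell
# Hypothetical helper: list values with their counts, keeping only
# values that occur more than once across the two NUL-separated files.
count_shared() {
    # Concatenate both converted files, then count identical lines.
    # Lines with a count of exactly 1 occur in only one file and are dropped.
    ( tr '\0' '\n' < "$1"; tr '\0' '\n' < "$2" ) | sort | uniq -c | grep -E -v '^ +1 '
}

# Made-up sample data with no repeated values inside either file.
sample=$(mktemp -d)
printf 'ant\0bee\0cat\0' > "$sample/file1"
printf 'cat\0dog\0ant\0' > "$sample/file2"

count_shared "$sample/file1" "$sample/file2"
```

Each output line is a count followed by the value, so ant and cat should each appear with a count of 2. The amount of leading whitespace depends on the uniq implementation.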
If the order of the fields is important, you can do this:
comm -1 -2 <(tr '\0' '\n' < file1) <(tr '\0' '\n' < file2)
This approach is not portable; it requires the 'process substitution' feature of Bash.
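One caveat worth noting: comm expects both of its inputs to be sorted. If the original order of the values in file1 really has to be preserved, one alternative is to skip sorting and filter with awk instead. This is a different technique from the comm command in the answer, sketched under assumptions: the helper name matches_in_order and the sample data are hypothetical, and values are assumed not to contain newlines. It also avoids process substitution, so it does not depend on Bash.

```shell
# Hypothetical helper: print the values of the first NUL-separated file that
# also occur in the second one, in the first file's order. Uses awk, not comm.
matches_in_order() {
    local tmp1 tmp2
    tmp1=$(mktemp) && tmp2=$(mktemp) || return 1
    tr '\0' '\n' < "$1" > "$tmp1"
    tr '\0' '\n' < "$2" > "$tmp2"
    # First pass (NR == FNR) records every value of the second file;
    # second pass prints the lines of the first file that were recorded.
    awk 'NR == FNR { seen[$0] = 1; next } $0 in seen' "$tmp2" "$tmp1"
    rm -f "$tmp1" "$tmp2"
}

# Made-up sample data, deliberately unsorted.
dir=$(mktemp -d)
printf 'pear\0fig\0kiwi\0lime\0' > "$dir/file1"
printf 'lime\0plum\0pear\0' > "$dir/file2"

matches_in_order "$dir/file1" "$dir/file2"
```

Here pear is printed before lime because that is the order in file1, whereas a sort-based approach would list lime first.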
Answered by potong
This might work for you:
parallel 'tr "\000" "\n" <{} | sort -u' ::: file{1,2} | sort | uniq -d
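This answer relies on GNU parallel being installed. Where it is not, the same idea can be sketched with a plain loop: sort -u removes duplicates within each file, so after the two lists are merged and sorted, uniq -d reports only the values that occur in both. The helper name values_in_both and the sample data are hypothetical.

```shell
# Hypothetical helper: loop-based equivalent of the parallel pipeline.
values_in_both() {
    local f
    for f in "$1" "$2"; do
        # Deduplicate within each file so that a repeat inside one file
        # is not mistaken for a match across files.
        tr '\000' '\n' < "$f" | sort -u
    done | sort | uniq -d
}

# Made-up sample data; "red" is repeated inside file1 only.
demo=$(mktemp -d)
printf 'red\0red\0blue\0green\0' > "$demo/file1"
printf 'green\0gold\0blue\0' > "$demo/file2"

values_in_both "$demo/file1" "$demo/file2"
```

With this data the output should be blue and green. The repeated red is not reported, because it occurs in only one of the files.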
