Using grep and awk together (bash)
Original question: http://stackoverflow.com/questions/22865507/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by duli
I have a file (A.txt) with 4 columns of numbers and another file (B.txt) with 3 columns of numbers. I need to solve the following problems:
Find all lines in A.txt whose 3rd column has a number that appears anywhere in the 3rd column of B.txt.
Assume that I have many files like A.txt in a directory. I need to run this for every file in that directory.
How do I do this?
Accepted answer by slitvinov
Here is an example. Create the following files and run
awk -f c.awk B.txt A*.txt
c.awk
# First file on the command line (B.txt): remember each third-column value.
FNR==NR {
    s[$3]
    next
}
# Remaining files (A*.txt): print lines whose third column was seen in B.txt.
$3 in s {
    print FILENAME, $0
}
A1.txt
1 2 3
1 2 6
1 2 5
A2.txt
1 2 3
1 2 6
1 2 5
B.txt
1 2 3
1 2 5
2 1 8
The output should be:
A1.txt 1 2 3
A1.txt 1 2 5
A2.txt 1 2 3
A2.txt 1 2 5
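The two-file awk approach in the accepted answer can be tried end to end on throwaway data. The following is only a sketch: the scratch directory, the variable names `workdir` and `result`, and the array name `seen` are assumptions made for this illustration, and the awk program is inlined rather than stored in c.awk.

```shell
#!/bin/sh
set -eu

# Scratch directory (assumed for this demo) so the sample files clobber nothing.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Sample data for illustration.
printf '1 2 3\n1 2 6\n1 2 5\n' > A1.txt
printf '1 2 3\n1 2 6\n1 2 5\n' > A2.txt
printf '1 2 3\n1 2 5\n2 1 8\n' > B.txt

# FNR is the line number within the current file, NR the line number overall.
# They are equal only while awk reads the first file (B.txt), so the first
# block collects B's third column; the second block then filters the A files.
result=$(awk 'FNR == NR { seen[$3]; next } $3 in seen { print FILENAME, $0 }' B.txt A*.txt)
printf '%s\n' "$result"
```

One caveat with this idiom: if B.txt is empty, `FNR == NR` also holds for the first A file, so that file would be treated as the lookup table.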
Answer by David W.
You should never see someone using grep and awk together, because whatever grep can do, you can also do in awk:
Grep and Awk:
grep "foo" file.txt | awk '{print }'
Using Only Awk:
awk '/foo/ {print }' file.txt
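The claim that awk alone can replace a grep-then-awk pipeline can be checked on a small file. This is a sketch under assumptions: the contents of file.txt are made up, and printing the first field (`$1`) is an arbitrary choice for the illustration.

```shell
#!/bin/sh
set -eu

# Scratch directory and sample file (both assumed for this demo).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
printf 'foo 1\nbar 2\nfoo bar 3\n' > "$workdir/file.txt"

# grep selects the matching lines, awk then prints the first field.
with_grep=$(grep "foo" "$workdir/file.txt" | awk '{ print $1 }')

# awk alone: the /foo/ pattern does the selection that grep did.
awk_only=$(awk '/foo/ { print $1 }' "$workdir/file.txt")

printf '%s\n' "$awk_only"
```

For a plain string such as foo the two forms select the same lines. With more complex patterns, note that grep defaults to basic regular expressions while awk uses extended ones.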
I had to get that off my chest. Now to your problem...
Awk is a programming language that assumes a single loop through all the lines in a set of files, and that is not what you want here. Instead, you want to treat B.txt as a special file and then loop through your other files. That normally calls for something like Python or Perl. (Older versions of Bash did not support associative arrays, so those versions will not work.) However, it looks like slitvinov found an answer.
Here's a Perl solution anyway:
use strict;
use warnings;
use feature qw(say);
use autodie;

my $b_file = shift;    # First argument is B.txt
open my $b_fh, "<", $b_file;

#
# This tracks the values in "B"
#
my %valid_lines;
while ( my $line = <$b_fh> ) {
    chomp $line;
    my @array = split /\s+/, $line;
    $valid_lines{$array[2]} = 1;    # Third column
}
close $b_fh;

#
# This handles the rest of the files
#
while ( my $line = <> ) {    # The rest of the files
    chomp $line;
    my @array = split /\s+/, $line;
    next unless exists $valid_lines{$array[2]};    # Skip unless field #3 was in B.txt too
    say $line;
}
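The remark about Bash refers to associative arrays, which Bash gained in version 4.0. As a rough sketch only, assuming Bash 4 or newer and whitespace-separated columns, the same lookup can be written in the shell itself. The function name `filter_by_third_column` is made up for this example. Like the Perl script, it prints the matching lines without the file name.

```shell
#!/usr/bin/env bash
# Requires Bash 4.0+ for associative arrays (local -A / declare -A).

# Hypothetical helper: the first argument is the reference file (B.txt),
# the remaining arguments are the files to filter.
filter_by_third_column() {
    local ref=$1
    shift
    local -A valid=()
    local line c3 file

    # Remember every third-column value from the reference file.
    while IFS= read -r line || [[ -n $line ]]; do
        read -r _ _ c3 _ <<< "$line"
        [[ -n $c3 ]] && valid[$c3]=1
    done < "$ref"

    # Print lines from the other files whose third column was remembered.
    for file in "$@"; do
        while IFS= read -r line || [[ -n $line ]]; do
            read -r _ _ c3 _ <<< "$line"
            [[ -n $c3 ]] || continue
            if [[ -n ${valid[$c3]:-} ]]; then
                printf '%s\n' "$line"
            fi
        done < "$file"
    done
    return 0
}
```

It could be called as `filter_by_third_column B.txt A*.txt`. For large inputs the awk or Perl versions are likely to be much faster, since the shell loop processes one line at a time.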