Linux: How to merge two files using AWK?

Question

Asked by Tony

File 1 has 5 fields, A B C D E, where field A is integer-valued

File 2 has 3 fields A F G

The number of rows in File 1 is much larger than in File 2 (20^6 versus 5000)

Every value of A in File 1 also appears in field A of File 2

I would like to merge the two files on field A and carry F and G over

Desired output is A B C D E F G

Example

File 1

A    B      C     D    E
4050 S00001 31228 3286 0
4050 S00012 31227 4251 0
4049 S00001 28342 3021 1
4048 S00001 46578 4210 0
4048 S00113 31221 4250 0
4047 S00122 31225 4249 0
4046 S00344 31322 4000 1

File 2

A    F    G
4050 12.1 23.6
4049 14.4 47.8
4048 23.2 43.9
4047 45.5 21.6

Desired output

A    B      C     D    E F    G
4050 S00001 31228 3286 0 12.1 23.6
4050 S00012 31227 4251 0 12.1 23.6
4049 S00001 28342 3021 1 14.4 47.8
4048 S00001 46578 4210 0 23.2 43.9
4048 S00113 31221 4250 0 23.2 43.9
4047 S00122 31225 4249 0 45.5 21.6

Answer 1

Accepted answer by kurumi

$ awk 'FNR==NR{a[$1]=$2 FS $3;next}{ print $0, a[$1]}' file2 file1
4050 S00001 31228 3286 0 12.1 23.6
4050 S00012 31227 4251 0 12.1 23.6
4049 S00001 28342 3021 1 14.4 47.8
4048 S00001 46578 4210 0 23.2 43.9
4048 S00113 31221 4250 0 23.2 43.9
4047 S00122 31225 4249 0 45.5 21.6
4046 S00344 31322 4000 1
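
The last line of that output (key 4046) has no F and G values because the key is missing from file2, whereas the desired output in the question omits that row. A minimal sketch of a variant that drops unmatched rows, assuming whitespace-separated files without header lines; the temporary directory and the shell variable `merged` are illustrative only:

```shell
# Sample data copied from the question (header rows omitted).
tmpdir=$(mktemp -d)
cat > "$tmpdir/file1" <<'EOF'
4050 S00001 31228 3286 0
4050 S00012 31227 4251 0
4049 S00001 28342 3021 1
4048 S00001 46578 4210 0
4048 S00113 31221 4250 0
4047 S00122 31225 4249 0
4046 S00344 31322 4000 1
EOF
cat > "$tmpdir/file2" <<'EOF'
4050 12.1 23.6
4049 14.4 47.8
4048 23.2 43.9
4047 45.5 21.6
EOF

merged=$(awk '
    # While reading the first file (file2), FNR equals NR: store F and G by key.
    FNR == NR { a[$1] = $2 FS $3; next }
    # For the second file (file1), print only rows whose key was stored.
    $1 in a   { print $0, a[$1] }
' "$tmpdir/file2" "$tmpdir/file1")

printf '%s\n' "$merged"
rm -r "$tmpdir"
```

The `$1 in a` test checks for the key without creating an empty array entry, which keeps memory use tied to the smaller file.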

Answer 2

Answered by Jonathan Leffler

You need to read the entries from File 2 into a pair of associative arrays in the BEGIN block. Assuming GNU Awk:

BEGIN { while (getline < "File 2") { f[$1] = $2; g[$1] = $3 } }
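
A side note that is not part of the original answer: `getline < file` returns -1 when the file cannot be read, and a bare `while (getline < file)` loop treats -1 as true and never ends. A hedged sketch that compares the return value against 0 explicitly; the names `lookup`, `line`, and `parts`, and the two-row sample files, are assumptions made for illustration:

```shell
tmpdir=$(mktemp -d)
printf '%s\n' '4050 S00001 31228 3286 0' '4049 S00001 28342 3021 1' > "$tmpdir/File 1"
printf '%s\n' '4050 12.1 23.6' '4049 14.4 47.8' > "$tmpdir/File 2"

merged=$(awk -v lookup="$tmpdir/File 2" '
BEGIN {
    # getline returns 1 for each line read, 0 at end of file, -1 on error,
    # so "> 0" ends the loop in both of the last two cases.
    while ((getline line < lookup) > 0) {
        split(line, parts)          # split on default whitespace
        f[parts[1]] = parts[2]
        g[parts[1]] = parts[3]
    }
    close(lookup)
}
# Main block: append the stored F and G values to each line of File 1.
{ print $0, f[$1], g[$1] }
' "$tmpdir/File 1")

printf '%s\n' "$merged"
rm -r "$tmpdir"
```

Passing the file name with `-v` also avoids hard-coding a quoted path inside the awk program.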

In the main processing block, you read the line from File 1 and print it with the correct data from the arrays created in the BEGIN block:

{ print $0, f[$1], g[$1] }

Supply File 1 as the filename argument to the program.

awk 'BEGIN { while (getline < "File 2") { f[$1] = $2; g[$1] = $3 } }
     { print $0, f[$1], g[$1] }' "File 1"

The quotes around the file name argument are needed because of the spaces in the file name. You need the quotes around the getline filename even if it contained no spaces, as it would otherwise be treated as a variable name.

Answer 3

Answered by Will Hartung

Thankfully, you don't need to write this at all. Unix has a join command to do this for you.

join -1 1 -2 1 File1 File2

Here it is "in action":

will-hartungs-computer:tmp will$ cat f1
4050 S00001 31228 3286 0
4050 S00012 31227 4251 0
4049 S00001 28342 3021 1
4048 S00001 46578 4210 0
4048 S00113 31221 4250 0
4047 S00122 31225 4249 0
4046 S00344 31322 4000 1
will-hartungs-computer:tmp will$ cat f2
4050 12.1 23.6
4049 14.4 47.8
4048 23.2 43.9
4047 45.5 21.6
will-hartungs-computer:tmp will$ join -1 1 -2 1 f1 f2
4050 S00001 31228 3286 0 12.1 23.6
4050 S00012 31227 4251 0 12.1 23.6
4049 S00001 28342 3021 1 14.4 47.8
4048 S00001 46578 4210 0 23.2 43.9
4048 S00113 31221 4250 0 23.2 43.9
4047 S00122 31225 4249 0 45.5 21.6
will-hartungs-computer:tmp will$

Answer 4

Answered by NAGAPPA

awk 'BEGIN{OFS=","}  FNR==NR {F[$1]=$2;G[$1]=$3;next} {print $1,$2,$3,$4,$5,F[$1],G[$1]}' file2.txt file1.txt
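
A note on the `join` approach from Answer 3: POSIX specifies that both inputs be sorted on the join field, and unsorted input can produce incomplete results or a diagnostic, depending on the implementation. The sample files are in descending order, so a cautious version sorts them first. This is a sketch under that assumption; the temporary file names and the variable `joined` are illustrative, and the output comes out in ascending order of A rather than the original order:

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/f1" <<'EOF'
4050 S00001 31228 3286 0
4050 S00012 31227 4251 0
4049 S00001 28342 3021 1
4048 S00001 46578 4210 0
4048 S00113 31221 4250 0
4047 S00122 31225 4249 0
4046 S00344 31322 4000 1
EOF
cat > "$tmpdir/f2" <<'EOF'
4050 12.1 23.6
4049 14.4 47.8
4048 23.2 43.9
4047 45.5 21.6
EOF

# Sort both files on field 1 in the C locale so sort and join agree on ordering.
LC_ALL=C sort -k1,1 "$tmpdir/f1" > "$tmpdir/f1.sorted"
LC_ALL=C sort -k1,1 "$tmpdir/f2" > "$tmpdir/f2.sorted"

# join pairs lines on field 1 by default; unmatched keys (4046) are dropped.
joined=$(LC_ALL=C join "$tmpdir/f1.sorted" "$tmpdir/f2.sorted")

printf '%s\n' "$joined"
rm -r "$tmpdir"
```

If the original row order matters, the awk solutions preserve it, since they only look keys up in an array and never reorder File 1.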
