Look up values from one CSV in another (like VLOOKUP) in bash (Linux)

Question

Asked by Yasapl

I have already tried every option I found online to solve my issue, but without good results.

Basically I have two csv files (pipe separated):

file1.csv:

123|21|0452|IE|IE|1|MAYOBAN|BRIN|OFFICE|STREET|MAIN STREET|MAYOBAN|

123|21|0453|IE|IE|1|CORKKIN|ROBERT|SURNAME|CORK|APTS|CORKKIN|

123|21|0452|IE|IE|1|CORKCOR|NAME|HARRINGTON|DUBLIN|STREET|CORKCOR|

file2.csv:

MAYOBAN|BANGOR|2400

MAYOBEL|BELLAVARY|2400

CORKKIN|KINSALE|2200

CORKCOR|CORK|2200

DUBLD11|DUBLIN 11|2100

I need a linux bash script to find the value of pos.3 from file2 based on the content of pos7 in file1.

Example: file1, line1, pos 7: MAYOBAN find MAYOBAN in file2, return pos 3 (2400)

the output should be something like this:

2400

2200

etc...

Please help Jacek

Answer 1

Answered by sgibb

A simple approach, far from perfect:

DELIMITER="|"

for i in $(cut -f 7 -d "${DELIMITER}" file1.csv ); 
do 
    grep "${i}" file2.csv | cut -f 3 -d "${DELIMITER}"; 
done
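
One reason the loop above is "far from perfect" is that grep matches the key anywhere on the line, so a short key can also hit other keys or the name column. A minimal sketch of an exact-match variant follows, assuming the sample data from the question; the function name lookup_exact and the scratch directory are illustrative, not part of the original answer.

```shell
#!/usr/bin/env bash
# Sample data copied from the question, written to a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/file1.csv" <<'EOF'
123|21|0452|IE|IE|1|MAYOBAN|BRIN|OFFICE|STREET|MAIN STREET|MAYOBAN|
123|21|0453|IE|IE|1|CORKKIN|ROBERT|SURNAME|CORK|APTS|CORKKIN|
123|21|0452|IE|IE|1|CORKCOR|NAME|HARRINGTON|DUBLIN|STREET|CORKCOR|
EOF
cat > "$workdir/file2.csv" <<'EOF'
MAYOBAN|BANGOR|2400
MAYOBEL|BELLAVARY|2400
CORKKIN|KINSALE|2200
CORKCOR|CORK|2200
DUBLD11|DUBLIN 11|2100
EOF

# lookup_exact KEY_FILE LOOKUP_FILE  (hypothetical helper name)
# Reads field 7 of KEY_FILE line by line and prints field 3 of every
# LOOKUP_FILE row whose field 1 equals the key exactly.
lookup_exact() {
  cut -d '|' -f 7 "$1" | while IFS= read -r key; do
    awk -F '|' -v k="$key" '$1 == k { print $3 }' "$2"
  done
}

result=$(lookup_exact "$workdir/file1.csv" "$workdir/file2.csv")
printf '%s\n' "$result"
rm -rf "$workdir"
```

This still scans file2.csv once per key, so it is only reasonable for small files.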

Answer 2

Answered by Paused until further notice.

This will work, but since the input files must be sorted, the output order will be affected:

join -t '|' -1 7 -2 1 -o 2.3 <(sort -t '|' -k7,7 file1.csv) <(sort -t '|' -k1,1 file2.csv)

The output would look like:

2200
2200
2400

which is useless. In order to have a useful output, include the key value:

join -t '|' -1 7 -2 1 -o 0,2.3 <(sort -t '|' -k7,7 file1.csv) <(sort -t '|' -k1,1 file2.csv)

The output then looks like this:

CORKCOR|2200
CORKKIN|2200
MAYOBAN|2400
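
If the original row order of file1.csv matters, one possible workaround is to tag each key with its line number before sorting and sort back on that number after the join. The following is only a sketch under that assumption; the function name lookup_keep_order, the intermediate file names, and the scratch directory are made up for illustration, and LC_ALL=C is set so that sort and join agree on collation.

```shell
#!/usr/bin/env bash
export LC_ALL=C   # sort and join must use the same collation order

# Sample data copied from the question, written to a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/file1.csv" <<'EOF'
123|21|0452|IE|IE|1|MAYOBAN|BRIN|OFFICE|STREET|MAIN STREET|MAYOBAN|
123|21|0453|IE|IE|1|CORKKIN|ROBERT|SURNAME|CORK|APTS|CORKKIN|
123|21|0452|IE|IE|1|CORKCOR|NAME|HARRINGTON|DUBLIN|STREET|CORKCOR|
EOF
cat > "$workdir/file2.csv" <<'EOF'
MAYOBAN|BANGOR|2400
MAYOBEL|BELLAVARY|2400
CORKKIN|KINSALE|2200
CORKCOR|CORK|2200
DUBLD11|DUBLIN 11|2100
EOF

# lookup_keep_order KEY_FILE LOOKUP_FILE SCRATCH_DIR  (hypothetical helper)
lookup_keep_order() {
  # Emit "line_number|key" for every row, then sort by key for join.
  awk -F '|' '{ print NR "|" $7 }' "$1" | sort -t '|' -k2,2 > "$3/keys.sorted"
  sort -t '|' -k1,1 "$2" > "$3/lookup.sorted"
  # Join on the key, keep "line_number|value", restore the original
  # order with a numeric sort, then drop the line number.
  join -t '|' -1 2 -2 1 -o 1.1,2.3 "$3/keys.sorted" "$3/lookup.sorted" |
    sort -t '|' -k1,1n | cut -d '|' -f 2
}

result=$(lookup_keep_order "$workdir/file1.csv" "$workdir/file2.csv" "$workdir")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Keys in file1.csv that have no match in file2.csv are silently dropped, as with the plain join above.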

Edit:

Here's an AWK version:

awk -F '|' 'FNR == NR {keys[$7]; next} {if ($1 in keys) print $3}' file1.csv file2.csv

This loops through file1.csv and creates array entries for each value of field 7. Simply referring to an array element creates it (with a null value). FNR is the record number in the current file and NR is the record number across all files. When they're equal, the first file is being processed. The next instruction reads the next record, creating a loop. When FNR == NR is no longer true, the subsequent file(s) are processed.

So file2.csv is now processed and if it has a field 1 that exists in the array, then its field 3 is printed.
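
Because file2.csv is the file being printed here, the values come out in file2.csv order and each key appears once. If the output should instead follow file1.csv row by row, as in the question, the roles of the two files can be swapped. This is a minimal sketch assuming the question's sample data; the function name lookup_in_order and the array name value are illustrative.

```shell
#!/usr/bin/env bash
# Sample data copied from the question, written to a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/file1.csv" <<'EOF'
123|21|0452|IE|IE|1|MAYOBAN|BRIN|OFFICE|STREET|MAIN STREET|MAYOBAN|
123|21|0453|IE|IE|1|CORKKIN|ROBERT|SURNAME|CORK|APTS|CORKKIN|
123|21|0452|IE|IE|1|CORKCOR|NAME|HARRINGTON|DUBLIN|STREET|CORKCOR|
EOF
cat > "$workdir/file2.csv" <<'EOF'
MAYOBAN|BANGOR|2400
MAYOBEL|BELLAVARY|2400
CORKKIN|KINSALE|2200
CORKCOR|CORK|2200
DUBLD11|DUBLIN 11|2100
EOF

# lookup_in_order LOOKUP_FILE KEY_FILE  (hypothetical helper name)
lookup_in_order() {
  awk -F '|' '
    FNR == NR { value[$1] = $3; next }   # first file: map key -> field 3
    $7 in value { print value[$7] }      # second file: print in its own order
  ' "$1" "$2"
}

result=$(lookup_in_order "$workdir/file2.csv" "$workdir/file1.csv")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Note that the FNR == NR test assumes the first file is not empty; rows of file1.csv whose key is missing from file2.csv print nothing.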

Answer 3

Answered by dexnow

# Take each key from field 7 of file1.csv and look it up in file2.csv
cut -d\| -f7 file1.csv | while read -r line
do
  grep "$line" file2.csv | cut -d\| -f3
done
