Note: this page is a Chinese-English side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/14492590/

using bash (sed/awk) to extract rows AND columns in CSV files?

Tags: bash, csv, sed, awk

Asked by user1899415

Is bash capable of extracting rows and columns from CSV files? I am hoping I don't have to resort to Python.

My 5-column csv file looks like:

Rank,Name,School,Major,Year
1,John,Harvard,Computer Science,3
2,Bill,Yale,Political Science,4
3,Mark,Stanford,Biology,1
4,Jane,Princeton,Electrical Engineering,3
5,Alex,MIT,Management Economics,2

I only want to extract the 3rd, 4th, and 5th column contents, ignoring the first row, so output looks like:

Harvard,Computer Science,3
Yale,Political Science,4
Stanford,Biology,1
Princeton,Electrical Engineering,3
MIT,Management Economics,2

So far I can only get awk to print either every row or every column of my CSV file, but not specific columns and rows as in this case. Can bash do this?

Accepted answer by koola

Bash solutions:

Using IFS

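#!/bin/bash
# IFS=',' applies only to the read command, so each comma-separated
# field lands in its own variable and the shell's own IFS is untouched.
while IFS=',' read -r rank name school major year; do
    echo -e "Rank\t: $rank\nName\t: $name\nSchool\t: $school\nMajor\t: $major\nYear\t: $year\n"
done < file.csv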
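The loop above prints every field with a label. To get the output asked for in the question, the same IFS-based read can discard the header and print only the last three fields. This is a minimal sketch, assuming exactly five comma-separated fields and no quoted or embedded commas; the function name extract_cols and the placeholder variables are illustrative and not part of the original answer:

```shell
#!/bin/bash
# extract_cols is a hypothetical helper name, not from the original answer.
# Assumes five comma-separated fields per line and no quoted commas.
extract_cols() {
    local file=$1 _header _rank _name school major year
    {
        read -r _header    # consume and discard the header line
        while IFS=',' read -r _rank _name school major year; do
            printf '%s,%s,%s\n' "$school" "$major" "$year"
        done
    } < "$file"
}

# Sample rows from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Rank,Name,School,Major,Year
1,John,Harvard,Computer Science,3
2,Bill,Yale,Political Science,4
EOF

result=$(extract_cols "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

On a real file the call would simply be extract_cols file.csv.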

Using String Manipulation and Arrays

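#!/bin/bash
declare -a arr
while read -r line; do
    # Split the line on commas only, so fields that contain spaces
    # (such as "Computer Science") stay in a single array element.
    IFS=',' read -r -a arr <<< "$line"
    printf "Rank\t: %s\nName\t: %s\nSchool\t: %s\nMajor\t: %s\nYear\t: %s\n" "${arr[@]}"
done < file.csv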

Answer by that other guy

awk -F, 'NR > 1 { print $3 "," $4 "," $5 }'

NR is the current line number, while $3, $4 and $5 are the fields, split on the separator string given to -F.
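Because NR is just a number, the same kind of pattern can select a range of rows as well as skip the header. The sketch below runs against a temporary copy of the question's sample data; the variable names sample, all_rows and first_two are illustrative only:

```shell
#!/bin/bash
# Sample rows from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Rank,Name,School,Major,Year
1,John,Harvard,Computer Science,3
2,Bill,Yale,Political Science,4
3,Mark,Stanford,Biology,1
EOF

# Skip the header (NR > 1) and print fields 3 to 5 joined by commas.
all_rows=$(awk -F, 'NR > 1 { print $3 "," $4 "," $5 }' "$sample")

# Bound NR on both sides: only input lines 2 and 3, the first two data rows.
first_two=$(awk -F, 'NR >= 2 && NR <= 3 { print $3 "," $4 "," $5 }' "$sample")

printf '%s\n' "$all_rows"
printf '%s\n' "$first_two"
rm -f "$sample"
```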

Answer by hennr

Try this:

tail -n+2 file.csv | cut --delimiter=, -f3-5

Answer by Rubens

Use cut and tail:

tail -n +2 file.txt | cut -d ',' -f 3-
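The field lists -f3-5 and -f 3- are not quite the same: the first selects fields 3 through 5, the second selects field 3 through the end of the line. On a five-column file they give identical output, which the following sketch checks against a temporary copy of the sample data (the variable names closed_range and open_range are illustrative):

```shell
#!/bin/bash
# Sample rows from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Rank,Name,School,Major,Year
1,John,Harvard,Computer Science,3
2,Bill,Yale,Political Science,4
EOF

# tail -n +2 starts output at line 2, which drops the header.
closed_range=$(tail -n +2 "$sample" | cut -d ',' -f 3-5)  # fields 3 through 5
open_range=$(tail -n +2 "$sample" | cut -d ',' -f 3-)     # field 3 to end of line

printf '%s\n' "$closed_range"
rm -f "$sample"
```

The open-ended form keeps working if columns are later appended to the file, while the closed form always stops at the fifth field.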

Answer by glenn jackman

sed 1d file.csv | while IFS=, read first second rest; do echo "$rest"; done

Answer by Vijay

perl -F, -lane 'if($.!=1){print join ",",@F[2,3,4];}' your_file

Answer by potong

This might work for you (GNU sed):

sed -r '1d;s/([^,]*,){2}//' file

Answer by Mirage

Try this:

awk -F, 'NR > 1 { OFS=","; print $3, $4, $5 }' temp.txt

Or this:

sed -re '1d;s/^[0-9],\w+,//g' temp.txt

Answer by steveha

Here you go, a simple AWK program.

#!/usr/bin/awk -f

BEGIN {
    # set field separator to comma to split CSV fields
    FS = ","
}

# NR > 1 skips the first line
NR > 1 {
    # print only the desired fields
    printf("%s,%s,%s\n", $3, $4, $5)
}
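One way to run a program like the one above is to save it to a file and pass it to awk with -f, or to make the file executable so the #!/usr/bin/awk -f line takes effect. The sketch below uses temporary files in place of real ones; the variable names script, sample and output are assumptions for illustration:

```shell
#!/bin/bash
# Write a compact copy of the AWK program to a temporary script file.
script=$(mktemp)
cat > "$script" <<'EOF'
BEGIN { FS = "," }
NR > 1 { printf("%s,%s,%s\n", $3, $4, $5) }
EOF

# Sample rows from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Rank,Name,School,Major,Year
1,John,Harvard,Computer Science,3
5,Alex,MIT,Management Economics,2
EOF

# awk -f reads the program from the script file instead of the command line.
output=$(awk -f "$script" "$sample")
printf '%s\n' "$output"
rm -f "$script" "$sample"
```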

Answer by welldan97

I have created a package for this kind of task: gumba. If you are comfortable with CoffeeScript, you can give it a try.

cat file.csv | tail -n +2 | \
gumba "words(',').take((words)-> words.last(3)).join(',')"