Notice: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/17849394/

Date: 2020-09-09 23:59:12  Source: igfitidea

Parsing CSV file in bash script

Tags: bash, csv, sed, awk

Asked by user1358062

I am trying to parse a CSV file containing a typical access control matrix in a shell script. My sample CSV file would be:

"user","admin","security"  
"user1","x",""  
"user2","","x"  
"user3","x","x"

I will use this list to create files in their respective folders. The problem is: how do I store the values of columns 2 and 3 (admin/security)? The output I'm trying to achieve is to group all users that have admin or security rights and create files in their respective folders. (My idea is to store the admin users and the security users in separate files and work from there.)

The environment does not allow me to use any Perl or Python programs. However, any awk or sed commands are greatly appreciated.

My desired output would be:

$ cat sample.csv
"user","admin","security"
"user1","x",""
"user2","","x"
"user3","x","x"
$ cat security.csv
user2
user3
$ cat admin.csv
user1
user3

Answered by Justin L.

If you can use cut(1) (which you probably can if you're on any type of Unix), you can use

cut -d , -f (n) (file)

where n is the column you want.

You can use a range of columns (2-3) or a list of columns (1,3).

This leaves the quotes in place, but you can strip them with a sed command or something similarly lightweight.

$ cat sample.csv
"user","admin","security"
"user1","x",""
"user2","","x"
"user3","x","x"

$ cut -d , -f 2 sample.csv
"admin"
"x"
""
"x"

$ cut -d , -f 3 sample.csv
"security"
""
"x"
"x"

$ cut -d , -f 2-3 sample.csv
"admin","security"
"x",""
"","x"
"x","x"

$ cut -d , -f 1,3 sample.csv
"user","security"
"user1",""
"user2","x"
"user3","x"

Note that this won't work for general CSV files (it doesn't handle commas embedded in a quoted field), but it should work for files like the one in the example, with simple usernames and x's.
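
To make that limitation concrete, here is a small sketch. The file embedded.csv and the username "doe, jane" are made up for illustration and are not part of the question's data. Because cut splits on every comma, the second field of the data row is a fragment of the name rather than the admin flag:

```shell
# Hypothetical file: one username contains a comma inside its quotes.
tmp=$(mktemp -d)
printf '%s\n' '"user","admin","security"' '"doe, jane","x",""' > "$tmp/embedded.csv"

# cut knows nothing about quoting; it splits on every comma it sees.
field2=$(cut -d , -f 2 "$tmp/embedded.csv")

# The header row still yields "admin", but the data row yields a piece of the name.
printf '%s\n' "$field2"
```

For data like this, a real CSV parser is the safer choice.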



If you just want to grab the list of usernames, then awk is pretty much the tool made for the job, and an answer below does it well enough that I don't need to repeat it.

But a grep solution might be quicker and more lightweight.

The grep solution:

grep '^\([^,]\+,\)\{N\}"x"'

where N is the zero-based index of the column to test, with the user column being column 0.

$ grep '^\([^,]\+,\)\{1\}"x"' sample.csv
"user1","x",""
"user3","x","x"

$ grep '^\([^,]\+,\)\{2\}"x"' sample.csv
"user2","","x"
"user3","x","x"
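
The \+ in the pattern above is a GNU extension to basic regular expressions. As a sketch, assuming your grep supports the standard -E option, the same pattern can be written as an extended regular expression without the backslashes. The temporary directory is only there to make the example self-contained:

```shell
# Recreate the question's sample file in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' '"user","admin","security"' '"user1","x",""' \
  '"user2","","x"' '"user3","x","x"' > "$tmp/sample.csv"

# Skip two comma-terminated fields, then require "x": rows with security rights.
matches=$(grep -E '^([^,]+,){2}"x"' "$tmp/sample.csv")
printf '%s\n' "$matches"
```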

From there you can use cut to get the first column:

$ grep '^\([^,]\+,\)\{1\}"x"' sample.csv | cut -d , -f 1
"user1"
"user3"

and sed 's/"//g' to get rid of the quotes:

$ grep '^\([^,]\+,\)\{1\}"x"' sample.csv | cut -d , -f 1 | sed 's/"//g'
user1
user3

$ grep '^\([^,]\+,\)\{2\}"x"' sample.csv | cut -d , -f 1 | sed 's/"//g'
user2
user3
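
Putting the pipeline together, the following sketch writes the admin.csv and security.csv files the question asks for. The helper name users_with is hypothetical, and the pattern is spelled with grep -E (an extended-regex form of the same expression) so that the column index can be substituted in easily:

```shell
# Recreate the question's sample file in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' '"user","admin","security"' '"user1","x",""' \
  '"user2","","x"' '"user3","x","x"' > "$tmp/sample.csv"

# users_with N FILE (hypothetical helper): print the unquoted usernames
# of all rows whose column N (user column = 0) holds "x".
users_with() {
  grep -E "^([^,]+,){$1}\"x\"" "$2" | cut -d , -f 1 | sed 's/"//g'
}

users_with 1 "$tmp/sample.csv" > "$tmp/admin.csv"
users_with 2 "$tmp/sample.csv" > "$tmp/security.csv"
```

As with the commands above, this assumes simple fields without embedded commas.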

Answered by jaypal singh

Something to get you started (please note that this will not work for CSV files with embedded commas; for those you will have to use a CSV parser):

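awk -F, '
NR>1 {                                # skip the header row
  gsub(/["]/,"",$0);                  # strip all double quotes from the record
  if($2!="" && $3!="")
    print $1 " has both privileges";
  print $1 > "file"                   # runs for every data row, not only inside the if
}' csv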
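
Building on that starting point, here is one possible sketch (not part of the original answer) that produces the per-privilege files from the question in a single pass. It assumes the header row supplies the output file names and that a privilege is marked with exactly x. The variable dir and the array name are illustrative choices:

```shell
# Recreate the question's sample file in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' '"user","admin","security"' '"user1","x",""' \
  '"user2","","x"' '"user3","x","x"' > "$tmp/sample.csv"

awk -F, -v dir="$tmp" '
{ gsub(/"/, "") }                      # strip quotes; awk re-splits the fields
NR == 1 {                              # header row: remember each column name
  for (i = 2; i <= NF; i++) name[i] = $i
  next
}
{                                      # data rows: append the user to every
  for (i = 2; i <= NF; i++)            # file whose column is marked with x
    if ($i == "x") print $1 > (dir "/" name[i] ".csv")
}' "$tmp/sample.csv"
```

With the sample data this creates admin.csv and security.csv in the scratch directory; the same caveat about embedded commas applies.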