How to parse a CSV file in Bash?
Notice: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/4286469/
Asked by User1
I'm working on a long Bash script. I want to read cells from a CSV file into Bash variables. I can parse lines and the first column, but not any other column. Here's my code so far:
cat myfile.csv|while read line
do
read -d, col1 col2 < <(echo $line)
echo "I got:$col1|$col2"
done
It's only printing the first column. As an additional test, I tried the following:
read -d, x y < <(echo a,b,)
And $y is empty. So I tried:
read x y < <(echo a b)
And $y is b. Why?
Accepted answer by Paused until further notice.
You need to use IFS instead of -d:
while IFS=, read -r col1 col2
do
echo "I got:$col1|$col2"
done < myfile.csv
Note that for general-purpose CSV parsing you should use a specialized tool that can handle quoted fields with internal commas, among other issues that Bash can't handle by itself. Examples of such tools are csvtool and csvkit.
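To make that caveat concrete, here is a small sketch showing how the IFS-based split miscounts fields when a quoted field contains a comma. The sample line and the variable names id, name, and age are made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sample line: the second field is quoted and contains a comma.
line='1,"Smith, John",42'

# Naive split on every comma, the same technique as the loop above.
IFS=, read -r id name age <<< "$line"

echo "id=$id"
echo "name=$name"
echo "age=$age"
```

With this input, name receives only `"Smith` (including the stray quote) and the rest of the line, ` John",42`, lands in age, because read has no notion of CSV quoting.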
Answer by dogbane
From the man page:
-d delim The first character of delim is used to terminate the input line, rather than newline.
You are using -d, which will terminate the input line on the comma. It will not read the rest of the line. That's why $y is empty.
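The difference between the two approaches can be seen side by side in the following minimal sketch. The variable names p and q are introduced here only to keep the two cases apart.

```bash
#!/usr/bin/env bash
# With -d, read stops at the first comma, so only "a" is ever consumed.
read -r -d, x y < <(echo a,b,)
echo "with -d,:   x=$x y=$y"

# With IFS=, the whole line is read first and then split on commas.
IFS=, read -r p q < <(echo a,b)
echo "with IFS=,: p=$p q=$q"
```

In the first case y stays empty because the input "line" ended at the first comma; in the second, the line ends at the newline and the comma only separates fields.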
Answer by Maithilish
We can parse CSV files whose fields are quoted strings delimited by, say, | with the following code:
while read -r line
do
    # Extract each field with awk, then strip the double quotes with tr.
    field1=$(echo "$line" | awk -F'|' '{printf "%s", $1}' | tr -d '"')
    field2=$(echo "$line" | awk -F'|' '{printf "%s", $2}' | tr -d '"')
    echo "$field1 $field2"
done < "$csvFile"
awk parses the string fields into variables and tr removes the quotes. This is slightly slower because awk is executed for each field.
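If the per-field awk calls become a bottleneck, one possible alternative is to let read do the splitting and strip the quotes with Bash parameter expansion, which avoids spawning any external process. This is only a sketch: parse_quoted_pipe is a hypothetical helper name, the sample input is made up, and it assumes exactly two fields per line with no | inside a quoted field.

```bash
#!/usr/bin/env bash
# parse_quoted_pipe is a hypothetical helper name, not from the answer above.
# It reads pipe-delimited lines from stdin and prints two unquoted fields.
parse_quoted_pipe() {
    local field1 field2
    while IFS='|' read -r field1 field2
    do
        # Remove every double quote with parameter expansion (no subprocess).
        field1=${field1//\"/}
        field2=${field2//\"/}
        echo "$field1 $field2"
    done
}

# Made-up sample input for illustration.
printf '%s\n' '"alice"|"30"' '"bob"|"25"' | parse_quoted_pipe
```

Unlike awk's $2, the last variable given to read receives the whole remainder of the line, so lines with more than two fields would need additional variable names.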
Answer by Eliya
If you want to read a CSV file line by line while skipping its first (header) line, here is one solution:
i=1
while IFS=, read -ra line
do
    # Skip the first line (assumed to be a header), then stop checking.
    test "$i" -eq 1 && ((i=i+1)) && continue
    for col_val in "${line[@]}"
    do
        echo -n "$col_val|"
    done
    echo
done < "$csvFile"
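A variation on the same idea, sketched below, consumes the header with a single read before the loop starts, so no counter is needed. The function name print_body and the sample data are assumptions made for this illustration.

```bash
#!/usr/bin/env bash
# print_body is a hypothetical helper name used only for this sketch.
# It reads CSV text from stdin, discards the first line, and prints the
# remaining rows with each field followed by "|".
print_body() {
    local header col_val
    local -a line
    # Consume the first line (assumed to be a header) before the loop starts.
    IFS= read -r header
    while IFS=, read -ra line
    do
        for col_val in "${line[@]}"
        do
            printf '%s|' "$col_val"
        done
        echo
    done
}

# Made-up sample input for illustration.
printf '%s\n' 'name,age' 'alice,30' 'bob,25' | print_body
```

Because the function reads from stdin, it can also be called as `print_body < "$csvFile"`. As with the other IFS-based loops here, quoted fields containing commas are not handled.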