Notice: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/10817439/

Time: 2020-09-09 22:10:21  Source: igfitidea

Counting commas in a line in bash

Tags: bash, shell

Asked by Stuart Woodward

Sometimes I receive a CSV file which has a carriage return inside a cell. This is not an acceptable format to a program that will use it as input.

In order to detect if an input line is split, I determined that a bad line would not have the expected number of commas in it. Is there a bash or other common unix command line tool that would allow me to count the commas in the line? If necessary, I can write a Python or Perl program to do it, but if possible, I'd like to add a line or two to an existing bash script to cause it to fail if the comma count is wrong. Any ideas?

Answered by lanzz

Strip everything but the commas, and then count number of characters left:

$ echo foo,bar,baz | tr -cd , | wc -c
2
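
To make an existing script fail on a bad line, the same tr/wc pipeline can be wrapped in a small function. This is only a sketch: the names count_commas and has_commas are made up for illustration, and it assumes the line is already held in a shell variable.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the number of commas in the first argument.
# The trailing tr strips the padding that some wc implementations add.
count_commas() {
    printf '%s' "$1" | tr -cd ',' | wc -c | tr -d '[:space:]'
}

# Hypothetical helper: succeed only if the line ($1) has exactly $2 commas.
has_commas() {
    [ "$(count_commas "$1")" -eq "$2" ]
}
```

A caller could then write something like: has_commas "$line" 2 || exit 1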

Answered by Jon Lin

To count the number of times a comma appears, you can use something like awk:

string='foo,bar,baz'   # one line of input from the CSV file
echo "$string" | awk -F "," '{print NF-1}'

But this really isn't sufficient to determine whether a field has carriage returns in it. Fields can have commas inside as long as they're surrounded by quotes.
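
One way to account for quoted fields is to walk the line character by character and ignore commas while inside double quotes. The sketch below is an illustration only, not a full CSV parser: the function name count_unquoted_commas is made up, and it assumes fields are quoted with double quotes and that embedded quotes are doubled ("").

```shell
#!/usr/bin/env bash

# Hypothetical helper: count commas that are not inside double quotes.
count_unquoted_commas() {
    local line=$1 in_quotes=0 count=0 i ch
    for ((i = 0; i < ${#line}; i++)); do
        ch=${line:i:1}
        if [[ $ch == '"' ]]; then
            # Toggle the quoted state; a doubled quote ("") toggles twice
            # and so leaves the state unchanged.
            in_quotes=$((1 - in_quotes))
        elif [[ $ch == ',' && $in_quotes -eq 0 ]]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

A line that ends while still inside quotes is a hint that a quoted field continues on the next line, which this sketch does not report.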

Answered by Paused until further notice.

In pure Bash:

while IFS=, read -ra array
do
    echo "$((${#array[@]} - 1))"
done < inputfile

or

while read -r line
do
    count=${line//[^,]}
    echo "${#count}"
done < inputfile
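
Building on the second loop, a whole file can be checked and a non-zero status returned when any line has the wrong count. This is a minimal sketch: the function name validate_commas and its message format are assumptions, and it counts every comma, including those inside quoted fields.

```shell
#!/usr/bin/env bash

# Hypothetical helper: validate_commas FILE EXPECTED
# Returns 1 if any line of FILE does not contain exactly EXPECTED commas.
validate_commas() {
    local file=$1 expected=$2 line commas lineno=0 bad=0
    # "|| [[ -n $line ]]" also processes a last line without a trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        lineno=$((lineno + 1))
        commas=${line//[^,]}    # delete every character except commas
        if (( ${#commas} != expected )); then
            printf '%s:%d: expected %d commas, found %d\n' \
                "$file" "$lineno" "$expected" "${#commas}" >&2
            bad=1
        fi
    done < "$file"
    return "$bad"
}
```

In an existing script this could be used as: validate_commas inputfile 2 || exit 1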

Answered by ceving

Try Perl:

$ perl -ne 'print 0+@{[/,/g]},"\n"'
a
0
a,a
1
a,a,a,a,a
4

Answered by D Bro

Depending on what you are trying to do with the CSV data, it may be helpful to use a wrapper script like csvquote to temporarily replace the problematic newlines (and commas) inside quoted fields, then restore them. For instance:

csvquote inputfile.csv | wc -l

and

csvquote inputfile.csv | cut -d, -f1 | csvquote -u

may be the sort of thing you're looking for. See https://github.com/dbro/csvquote for the code and more information.

Answered by Hunter McMillen

Just remove all of the carriage returns:

tr -d "\r" < old_file > new_file