Removing spaces for all the columns of a CSV file in bash/unix
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/38609590/
Asked by stephenjacob
I have a CSV file in which every column has unnecessary extra spaces before the actual value. I want to create a new CSV file with those extra spaces removed.
For example
One line in input CSV file
123, ste hen, 456, out put
Expected output CSV file
123,ste hen,456,out put
I tried using awk to trim each column but it didn't work.
Answered by anubhava
This sed should work:
sed -i.bak -E 's/(^|,)[[:blank:]]+/\1/g; s/[[:blank:]]+(,|$)/\1/g' file.csv
This will remove leading spaces, trailing spaces, and spaces around commas.
Update: Here is an awk command to do the same:
awk -F '[[:blank:]]*,[[:blank:]]*' -v OFS=, '{
gsub(/^[[:blank:]]+|[[:blank:]]+$/, ""); $1=$1} 1' file
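A trimming substitution of this kind can be tried on a sample line before touching any file. The following is only a minimal sketch: the function name trim_csv and the sample line are assumptions made for illustration, and -i.bak is left out so that nothing is modified on disk.

```shell
# trim_csv: hypothetical helper that reads CSV lines on stdin and
# removes blanks after a line start or comma, and before a comma or line end.
trim_csv() {
  sed -E 's/(^|,)[[:blank:]]+/\1/g; s/[[:blank:]]+(,|$)/\1/g'
}

# Sample line with leading, trailing and post-comma spaces.
printf '%s\n' ' 123, ste hen, 456, out put ' | trim_csv
# prints: 123,ste hen,456,out put
```

The \1 in each replacement puts the captured comma (or the empty match at the line start or end) back, so only the blanks are dropped and spaces inside a value such as "ste hen" are kept.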
Answered by sjsam
awk is your friend.
Input
$ cat 38609590.txt
Ted Winter, Evelyn Salt, Peabody
Ulrich, Ethan Hunt, Wallace
James Bond, Q, M
(blank line)
Script
$ awk '/^$/{next}{sub(/^[[:blank:]]*/,"");gsub(/[[:blank:]]*,[[:blank:]]*/,",")}1' 38609590.txt
Output
Ted Winter,Evelyn Salt,Peabody
Ulrich,Ethan Hunt,Wallace
James Bond,Q,M
Note
- This one removes the blank lines too: /^$/{next}.
- See the [awk] manual for more information.
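How the /^$/{next} rule interacts with the trimming can be seen on a small input that contains an empty line. This is a minimal sketch; the function name clean_csv and the sample data fed through printf are assumptions for illustration.

```shell
# clean_csv: hypothetical wrapper around the awk program shown above.
clean_csv() {
  awk '/^$/{next}{sub(/^[[:blank:]]*/,"");gsub(/[[:blank:]]*,[[:blank:]]*/,",")}1'
}

# Three input lines; the second one is completely empty and is skipped.
printf '%s\n' ' Ted Winter,  Evelyn Salt, Peabody' '' 'James Bond, Q, M' | clean_csv
# prints:
# Ted Winter,Evelyn Salt,Peabody
# James Bond,Q,M
```

Note that /^$/ matches only truly empty lines. A line that contains nothing but blanks is not skipped; it is trimmed and printed as an empty line.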
Answered by Ed Morton
To remove leading blank chars with sed:
$ sed -E 's/(^|,) +/\1/g' file
123,ste hen,456,out put
With GNU awk:
$ awk '{$0=gensub(/(^|,) +/,"\\1","g")}1' file
123,ste hen,456,out put
With other awks:
$ awk '{sub(/^ +/,""); gsub(/, +/,",")}1' file
123,ste hen,456,out put
To remove blank chars before and after the values with sed:
$ sed -E 's/ *(^|,|$) */\1/g' file
123,ste hen,456,out put
With GNU awk:
$ awk '{$0=gensub(/ *(^|,|$) */,"\\1","g")}1' file
123,ste hen,456,out put
With other awks:
$ awk '{gsub(/^ +| +$/,""); gsub(/ *, */,",")}1' file
123,ste hen,456,out put
Change " " (a single blank char) to [[:blank:]] if you can have tabs as well as blank chars.
Answered by Claes Wikner
echo " 123, ste hen, 456, out put" | awk '{sub(/^ +/,""); gsub(/, /,",")}1'
123,ste hen,456,out put
Answered by Inian
Another way to remove multiple leading white-spaces with awk is as below:
$ awk 'BEGIN{FS=OFS=","} {s = ""; for (i = 1; i <= NF; i++) gsub(/^[ \t]+/,"",$i);} 1' <<< "123, ste hen, 456, out put"
123,ste hen,456,out put
FS=OFS="," sets the input and output field separator to a comma. s = ""; for (i = 1; i <= NF; i++) loops across each column entry up to the end (i.e. from $1, $2 ... $NF), and gsub(/^[ \t]+/,"",$i) trims only the leading white-space of each column and nothing anywhere else (one or more white-space characters, note the +).
If you want to do this for an entire file, I suggest using a simple script like the one below:
#!/bin/bash
# Output written to the file 'output.csv' in the same path
while IFS= read -r line || [[ -n "$line" ]]; do # Not setting IFS here, all done in 'awk'; the || condition handles a last line without a trailing newline
    awk 'BEGIN{FS=OFS=","} {s = ""; for (i = 1; i <= NF; i++) gsub(/^[ \t]+/,"",$i);} 1' <<< "$line" >> output.csv
done <input.csv
Answered by James Brown
$ cat > test.in
123, ste hen, 456, out put
$ awk -F',' -v OFS=',' '{for (i=1;i<=NF;i++) gsub(/^ +| +$/,"",$i); print $0}' test.in
123,ste hen,456,out put
or written out loud:
BEGIN {
    FS=","                        # set the input field separator
    OFS=","                       # and the output field separator
}
{
    for (i=1;i<=NF;i++)           # loop thru every field on record
        gsub(/^ +| +$/,"",$i)     # remove leading and trailing spaces
    print $0                      # print out the trimmed record
}
Run with:
$ awk -f test.awk test.in
Answered by lolotux
You could try:
- your file: ~/path/file.csv
cat ~/path/file.csv | tr -d "\ "
sed "s/, /,/g" ~/path/file.csv
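These two commands behave differently on values that themselves contain spaces, which matters for the sample line in the question. A minimal sketch, where the function names strip_all and strip_after_comma and the variable line are assumptions for illustration, and tr -d ' ' stands in for the tr call above:

```shell
line='123, ste hen, 456, out put'

# strip_all: deletes every space, including those inside a value.
strip_all() { tr -d ' '; }

# strip_after_comma: deletes one space directly after each comma.
strip_after_comma() { sed 's/, /,/g'; }

printf '%s\n' "$line" | strip_all           # prints: 123,stehen,456,output
printf '%s\n' "$line" | strip_after_comma   # prints: 123,ste hen,456,out put
```

Only the sed variant produces the expected output from the question. It removes a single space per comma, so a pattern such as 's/, */,/g' would be needed if several spaces can follow a comma.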