Removing spaces for all the columns of a CSV file in bash/unix

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/38609590/

Time: 2020-09-18 14:56:43  Source: igfitidea

bash, shell, unix, awk, sed

Asked by stephenjacob

I have a CSV file in which every column contains unnecessary extra spaces added to it before the actual value. I want to create a new CSV file by removing all the spaces.

For example, one line in the input CSV file:

 123, ste hen, 456, out put

Expected output CSV file:

123,ste hen,456,out put

I tried using awk to trim each column, but it didn't work.

Answered by anubhava

This sed command should work:

sed -i.bak -E 's/(^|,)[[:blank:]]+/\1/g; s/[[:blank:]]+(,|$)/\1/g' file.csv

This will remove leading spaces, trailing spaces, and spaces around commas.
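
To try the substitutions without modifying a file in place, the two expressions can be piped over a sample line. This is only a sketch: the function name trim_csv_sed and the sample input are illustrative assumptions, and the replacement uses the \1 back-reference so that the comma (or line boundary) matched by the group is kept. It assumes a sed that accepts -E, such as GNU or BSD sed.

```shell
#!/bin/sh
# Hypothetical helper: trim blanks around each comma-separated value on stdin.
trim_csv_sed() {
  # 1st expression: blanks after line start or after a comma
  # 2nd expression: blanks before a comma or before line end
  sed -E 's/(^|,)[[:blank:]]+/\1/g; s/[[:blank:]]+(,|$)/\1/g'
}

printf ' 123, ste hen , 456, out put \n' | trim_csv_sed
```

Spaces inside a value, such as the one in "ste hen", are left alone because they are not adjacent to a comma or a line boundary.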

Update: Here is an awk command to do the same:

awk -F '[[:blank:]]*,[[:blank:]]*' -v OFS=, '{
  gsub(/^[[:blank:]]+|[[:blank:]]+$/, ""); $1=$1} 1' file

Answered by sjsam

awk is your friend.

Input

$ cat 38609590.txt
Ted Winter, Evelyn Salt, Peabody
  Ulrich, Ethan Hunt, Wallace
James Bond, Q,  M
(blank line)

Script

$ awk '/^$/{next}{sub(/^[[:blank:]]*/,"");gsub(/[[:blank:]]*,[[:blank:]]*/,",")}1' 38609590.txt

Output

Ted Winter,Evelyn Salt,Peabody
Ulrich,Ethan Hunt,Wallace
James Bond,Q,M

Note

  • This one removes the blank lines too: /^$/{next}.
  • See the awk manual for more information.
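
One detail of the note above: /^$/ matches only completely empty lines, so a line containing nothing but spaces or tabs would still be printed (as an empty line). The sketch below is a variant, not part of the original answer, that also skips blank-only lines. The function name trim_skip_blank is hypothetical, and [ \t] is used instead of [[:blank:]] on the assumption that some older awk builds lack POSIX character classes.

```shell
#!/bin/sh
# Hypothetical variant of the script above: also skip whitespace-only lines.
trim_skip_blank() {
  awk '/^[ \t]*$/ { next }          # drop empty and blank-only lines
       { sub(/^[ \t]*/, "")         # strip leading blanks
         gsub(/[ \t]*,[ \t]*/, ",") # strip blanks around commas
       } 1'
}

printf 'Ted Winter, Evelyn Salt, Peabody\n   \n\n  Ulrich, Ethan Hunt, Wallace\n' | trim_skip_blank
```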

Answered by Ed Morton

To remove leading blank chars with sed:

$ sed -E 's/(^|,) +/\1/g' file
123,ste hen,456,out put

With GNU awk:

$ awk '{$0=gensub(/(^|,) +/,"\\1","g")}1' file
123,ste hen,456,out put

With other awks:

$ awk '{sub(/^ +/,""); gsub(/, +/,",")}1' file
123,ste hen,456,out put

To remove blank chars before and after the values with sed:

$ sed -E 's/ *(^|,|$) */\1/g' file
123,ste hen,456,out put

With GNU awk:

$ awk '{$0=gensub(/ *(^|,|$) */,"\\1","g")}1' file
123,ste hen,456,out put

With other awks:

$ awk '{gsub(/^ +| +$/,""); gsub(/ *, */,",")}1' file
123,ste hen,456,out put

Change each space (a single blank char) to [[:blank:]] if you can have tabs as well as blank chars.

Answered by Claes Wikner

echo " 123, ste hen, 456, out put" | awk '{sub(/^ +/,""); gsub(/, /,",")}1'
123,ste hen,456,out put

Answered by Inian

Another way to remove multiple leading white-spaces with awk is as below:

$ awk 'BEGIN{FS=OFS=","} {s = ""; for (i = 1; i <= NF; i++) gsub(/^[ \t]+/,"",$i);} 1' <<< "123, ste hen, 456, out put"
123,ste hen,456,out put

  • FS=OFS="," sets the input and output field separator to ,
  • s = ""; for (i = 1; i <= NF; i++) loops across each column entry up to the end (i.e. from $1, $2 ... $NF), and gsub(/^[ \t]+/,"",$i) trims only the leading white-space of each column and nothing anywhere else (one or more white-space characters, note the +).

If you want to do this for an entire file, I suggest using a simple script like the one below:

#!/bin/bash
# Output written to the file 'output.csv' in the same path

# IFS= keeps leading whitespace intact (field splitting is done in awk);
# the || test handles a last line that has no trailing newline
while IFS= read -r line || [[ -n "$line" ]]; do
   awk 'BEGIN{FS=OFS=","} {s = ""; for (i = 1; i <= NF; i++) gsub(/^[ \t]+/,"",$i);} 1' <<< "$line" >> output.csv
done <input.csv

Answered by James Brown

$ cat > test.in
 123, ste hen, 456, out put
$ awk -F',' -v OFS=',' '{for (i=1;i<=NF;i++) gsub(/^ +| +$/,"",$i); print $0}' test.in
123,ste hen,456,out put

or written out in full:

BEGIN {
  FS=","                  # set the input field separator
  OFS=","                 # and the output field separator
}
{
  for (i=1;i<=NF;i++)     # loop thru every field on record
    gsub(/^ +| +$/,"",$i) # remove leading and trailing spaces
  print $0                # print out the trimmed record
}

Run with:

$ awk -f test.awk test.in
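
Since awk already reads its input line by line, the field-by-field trimming shown in the answers above can also be applied to a whole file in a single awk call, without a surrounding shell loop. The following is a minimal sketch under stated assumptions: the function name trim_fields is hypothetical, the data is plain CSV with no quoted fields that themselves contain commas, and [ \t] stands in for [[:blank:]] in case the local awk lacks POSIX character classes.

```shell
#!/bin/sh
# Hypothetical helper: trim leading/trailing spaces and tabs from every field.
# Assumes plain CSV: no quoted fields that themselves contain commas.
trim_fields() {
  awk 'BEGIN { FS = OFS = "," }
       { for (i = 1; i <= NF; i++)
           gsub(/^[ \t]+|[ \t]+$/, "", $i)   # trim both ends of field i
         print }'
}

# Demo on one sample line containing spaces and a tab
printf ' 123,\tste hen , 456, out put\n' | trim_fields
```

For a file, the same function could be used as trim_fields < input.csv > output.csv, where both file names are placeholders.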

Answered by lolotux

You could try:

  • your file: ~/path/file.csv

cat ~/path/file.csv | tr -d "\ "

sed "s/, /,/g" ~/path/file.csv
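
The tr and sed commands above do not behave the same way, which a quick run on the sample line from the question makes visible. This is only an illustrative sketch; the variable name line is an assumption.

```shell
#!/bin/sh
line=' 123, ste hen, 456, out put'

# tr -d deletes every space, including the ones inside values
printf '%s\n' "$line" | tr -d ' '

# this sed removes only a single space directly after each comma;
# the leading space at the start of the line is kept
printf '%s\n' "$line" | sed 's/, /,/g'
```

With tr, "ste hen" becomes "stehen", so it only fits data whose values never contain spaces. The sed form keeps inner spaces but leaves the space at the start of the line.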