Find and Replace with Spaces using Sed Mac Terminal

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/18840175/

Date: 2020-09-18 06:35:33  Source: igfitidea

macos bash csv sed terminal

Asked by Leonna Sylvester

I have a .CSV file with over 500,000 lines that I need to:

  1. find all 'space double quote space' sequences and replace with nothing
  2. find all 'space double quote' sequences and replace with nothing
  3. find all double quotes and replace with nothing

Example of .CSV line:

"DISH Hartford & New Haven  (Hartford)", "206", "FBNHD", " 06028", " East Windsor Hill", "CT", "Hartford County"

Required output:

DISH Hartford & New Haven  (Hartford),206,FBNHD,06028,East Windsor Hill,CT,Hartford County

I need to remove all double quotes (") and spaces in front of and behind the commas (,).

I've tried:

$ cd /Users/Leonna/Downloads/
$ cat bs-B2Bformat.csv | sed s/ " //g

This gives me the 'command incomplete' greater-than prompt (>), so I then tried:

$ cat bs-B2Bformat.csv | sed s/ " //g
sed: 1: "s/": unterminated substitute pattern
$ cat bs-B2Bformat.csv |sed s/ \" //g
sed: 1: "s/": unterminated substitute pattern
$

There are too many lines for me to edit in Excel (Excel won't load all the lines) or even a text editor. How can I fix this?

Answered by brunocodutra

Quoted from here:

For POSIX compliance, use the character class [[:space:]] instead of \s, since the latter is a GNU sed extension.

Based on that, I would suggest the following, which, as Jonathan Leffler pointed out, is portable across GNU and BSD implementations.

sed -E 's/[[:space:]]?"[[:space:]]?//g' path/to/file

The -E flag enables extended regular expressions on BSD implementations. On GNU sed it is undocumented, but as discussed here, it enables compatibility with the BSD standard.

Quoted from the manual for BSD sed:

-E Interpret regular expressions as extended (modern) regular expressions rather than basic regular expressions (BRE's).

Applying the above command to a file containing the following single line:

"DISH Hartford & New Haven (Hartford)", "206", "FBNHD", " 06028", " East Windsor Hill", "CT", "Hartford County"

it yields:

DISH Hartford & New Haven (Hartford),206,FBNHD,06028,East Windsor Hill,CT,Hartford County
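For a file with hundreds of thousands of lines it is usually more practical to redirect the result into a new file than to print it to the terminal. A minimal sketch, assuming the single sample line stands in for the real data; the names sample.csv and cleaned.csv and the scratch directory are hypothetical and not part of the original answer:

```shell
# Use a scratch directory so no real file is overwritten.
workdir=$(mktemp -d)

# sample.csv is a hypothetical stand-in for the real 500,000-line file.
printf '%s\n' '"DISH Hartford & New Haven (Hartford)", "206", "FBNHD", " 06028", " East Windsor Hill", "CT", "Hartford County"' > "$workdir/sample.csv"

# Remove every double quote together with at most one whitespace character
# on either side, and write the result to a new file.
sed -E 's/[[:space:]]?"[[:space:]]?//g' "$workdir/sample.csv" > "$workdir/cleaned.csv"

# Show the cleaned line.
cat "$workdir/cleaned.csv"
```

The input file is left untouched, so the result can be compared against the original before anything is replaced.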

Answered by Shylo Hana

This should do it:

sed -i 's/\(\s\|\)"\(\|\s\)//g' bs-B2Bformat.csv
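Note that the command above relies on GNU sed: \s and \| are GNU extensions to basic regular expressions, and the BSD sed shipped with macOS expects a backup suffix as the argument to -i. A sketch of an in-place variant that should run on both, assuming a backup file is acceptable; the file name sample.csv and the .bak suffix are hypothetical choices:

```shell
# Use a scratch directory; sample.csv stands in for bs-B2Bformat.csv.
workdir=$(mktemp -d)
file="$workdir/sample.csv"
printf '%s\n' '"CT", " 06028", "Hartford County"' > "$file"

# -i.bak edits the file in place and keeps the original as sample.csv.bak.
# Attaching the suffix directly to -i is accepted by both GNU and BSD sed.
sed -i.bak -E 's/[[:space:]]?"[[:space:]]?//g' "$file"

cat "$file"       # cleaned contents
cat "$file.bak"   # untouched original
```

Keeping the backup costs disk space for a large file, but it makes an in-place edit easy to undo.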

Answered by iamauser

This works for me. Is this what you want?

 sed -e 's|", "|,|g' -e 's|^"||g' -e 's|"$||g' file.csv

 echo '"DISH Hartford & New Haven (Hartford)", "206", "FBNHD", " 06028", " East Windsor Hill", "CT", "Hartford County"' | sed -e 's|", "|,|g' -e 's|^"||g' -e 's|"$||g'

 DISH Hartford & New Haven (Hartford),206,FBNHD, 06028, East Windsor Hill,CT,Hartford County
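The output above still carries the leading spaces in " 06028" and " East Windsor Hill", because the three patterns match only the quote characters and the single space after each comma. A possible variant, not part of the original answer, that also consumes any whitespace next to each quote:

```shell
# The sample line from the question.
line='"DISH Hartford & New Haven (Hartford)", "206", "FBNHD", " 06028", " East Windsor Hill", "CT", "Hartford County"'

# 1st expression: collapse each quote-comma-quote separator, including any
#                 whitespace around the quotes, into a bare comma.
# 2nd expression: drop the opening quote (and adjacent whitespace) at line start.
# 3rd expression: drop the closing quote (and adjacent whitespace) at line end.
cleaned=$(printf '%s\n' "$line" | sed -E \
  -e 's/[[:space:]]*"[[:space:]]*,[[:space:]]*"[[:space:]]*/,/g' \
  -e 's/^[[:space:]]*"[[:space:]]*//' \
  -e 's/[[:space:]]*"[[:space:]]*$//')

printf '%s\n' "$cleaned"
```

Unlike a pattern that removes every double quote, this one only touches quotes at field boundaries, which may matter if a field contains a quote of its own.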

Answered by Birei

One way is to use python and its csv module:

import csv
import sys

## Open the file provided as the first argument.
with open(sys.argv[1], 'r', newline='') as f:

    ## Create the csv reader and writer. Do not quote fields in the output.
    reader = csv.reader(f, skipinitialspace=True)
    writer = csv.writer(sys.stdout, quoting=csv.QUOTE_NONE, escapechar='\\')

    ## Read the file row by row, strip leading and trailing whitespace from
    ## each field and print the row.
    for row in reader:
        row = [field.strip() for field in row]
        writer.writerow(row)

Run it like:

python3 script.py csvfile

That yields:

DISH Hartford & New Haven  (Hartford),206,FBNHD,06028,East Windsor Hill,CT,Hartford County

Answered by Nashenas

What all of the current answers seemed to miss:

$ cat bs-B2Bformat.csv | sed s/ " //g
sed: 1: "s/": unterminated substitute pattern
$ cat bs-B2Bformat.csv |sed s/ \" //g
sed: 1: "s/": unterminated substitute pattern
$

The problem above is the missing single quotes. It should have been:

$ cat bs-B2Bformat.csv | sed 's/ " //g'
                             ^        ^

Without the single quotes, bash splits the command at the spaces and passes three separate arguments to sed (at least in the \" case). sed therefore saw its first argument as just s/.

Edit: FYI, single quotes are not required; they just make this case easier. If you want to use double quotes, just escape the one you want to keep for matching:

$ cat bs-B2Bformat.csv | sed "s/ \" //g"
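The word splitting described above can be observed without running sed at all. A small sketch, where count_args is a hypothetical helper (not a standard command) that prints how many arguments the shell passed to it and what they were:

```shell
# count_args: hypothetical helper that prints the number of arguments it
# received, followed by each argument in square brackets.
count_args() {
  printf '%d:' "$#"
  printf ' [%s]' "$@"
  printf '\n'
}

count_args s/ \" //g      # unquoted: the shell splits at the spaces
count_args 's/ " //g'     # single quotes: passed as one argument
count_args "s/ \" //g"    # double quotes, inner quote escaped: one argument
```

The first call prints 3: [s/] ["] [//g], which is why sed complained about the script s/. The other two calls each print 1: [s/ " //g], so sed receives the whole substitution as a single script.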