Original question: http://stackoverflow.com/questions/18840175/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Find and Replace with Spaces using Sed Mac Terminal
Asked by Leonna Sylvester
I have a .CSV file with over 500,000 lines that I need to:
- find all 'space double quote space' sequences and replace with nothing
- find all 'space double quote' sequences and replace with nothing
- find all double quotes and replace with nothing
Example of .CSV line:
"DISH Hartford & New Haven (Hartford)", "206", "FBNHD", " 06028", " East Windsor Hill", "CT", "Hartford County"
**Required output**
DISH Hartford & New Haven (Hartford),206,FBNHD,06028,East Windsor Hill,CT,Hartford County
I need to remove all double quotes (") and the spaces in front of and behind the commas (,).
I've tried
$ cd /Users/Leonna/Downloads/
$ cat bs-B2Bformat.csv | sed s/ " //g
This gives me the 'command incomplete' greater than prompt, so I then tried:
$ cat bs-B2Bformat.csv | sed s/ " //g
sed: 1: "s/": unterminated substitute pattern
$ cat bs-B2Bformat.csv |sed s/ \" //g
sed: 1: "s/": unterminated substitute pattern
$
There are too many lines for me to edit in Excel (Excel won't load all the lines) or even a text editor. How can I fix this?
Answered by brunocodutra
Quoted from here:
For POSIX compliance, use the character class [[:space:]] instead of \s, since the latter is a GNU sed extension.
Based on that, I would suggest the following, which, as Jonathan Leffler pointed out, is portable across GNU and BSD implementations.
sed -E 's/[[:space:]]?"[[:space:]]?//g' <path/to/file>
The -E flag enables extended regular expressions on BSD implementations. On GNU sed it was undocumented at the time of writing, but as discussed here, it enables compatibility with the BSD standard.
Quoted from the manual for BSD sed:
-E Interpret regular expressions as extended (modern) regular expressions rather than basic regular expressions (BRE's).
Applying the above command to a file containing the following single line
"DISH Hartford & New Haven (Hartford)", "206", "FBNHD", " 06028", " East Windsor Hill", "CT", "Hartford County"
it yields
DISH Hartford & New Haven (Hartford),206,FBNHD,06028,East Windsor Hill,CT,Hartford County
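The three replacements listed in the question can also be spelled out one by one, which may be easier to audit before running on a large file. The following is only a minimal sketch and not part of the original answer; the variable names line and cleaned are illustrative assumptions.

```shell
# Sample line from the question (illustrative variable name).
line='"DISH Hartford & New Haven (Hartford)", "206", "FBNHD", " 06028", " East Windsor Hill", "CT", "Hartford County"'

# Apply the three substitutions in order:
#   1. space + double quote + space -> nothing
#   2. space + double quote         -> nothing
#   3. any remaining double quote   -> nothing
cleaned=$(printf '%s\n' "$line" | sed -e 's/ " //g' -e 's/ "//g' -e 's/"//g')

printf '%s\n' "$cleaned"
```

The order matters here: the longest sequence is handled first, so the shorter patterns only see what is left over. Unlike the [[:space:]] version, this sketch matches literal space characters only, not tabs.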
Answered by Shylo Hana
This should do it:
sed -i 's/\(\s\|\)"\(\|\s\)//g' bs-B2Bformat.csv
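This command leans on GNU sed behaviour (\s, \| and -i without a backup suffix), so it may not run as written with the BSD sed shipped on macOS. A hedged sketch of an in-place variant that should be accepted by both implementations, assuming a backup copy with a .bak suffix is acceptable; the temporary file and its name are made up for the demonstration:

```shell
# Demonstration file in a temporary location (hypothetical name, not from the question).
tmp=$(mktemp "${TMPDIR:-/tmp}/sed-demo.XXXXXX")
printf '%s\n' '"CT", " 06028", "Hartford County"' > "$tmp"

# -i.bak edits the file in place and keeps the original as "$tmp.bak";
# attaching the suffix directly to -i is accepted by both GNU and BSD sed.
sed -i.bak -E 's/[[:space:]]?"[[:space:]]?//g' "$tmp"

result=$(cat "$tmp")
printf '%s\n' "$result"

# Clean up the demonstration files.
rm -f "$tmp" "$tmp.bak"
```

On the real file, the same call would be pointed at bs-B2Bformat.csv, and the .bak copy can be deleted once the result has been checked.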
Answered by iamauser
This works for me. Is this what you want?
sed -e 's|", "|,|g' -e 's|^"||g' -e 's|"$||g' file.csv
echo '"DISH Hartford & New Haven (Hartford)", "206", "FBNHD", " 06028", " East Windsor Hill", "CT", "Hartford County"' | sed -e 's|", "|,|g' -e 's|^"||g' -e 's|"$||g'
DISH Hartford & New Haven (Hartford),206,FBNHD, 06028, East Windsor Hill,CT,Hartford County
Answered by Birei
One way is to use python and its csv module:
import csv
import sys

## Open file provided as argument.
with open(sys.argv[1], 'r') as f:
    ## Create the csv reader and writer. Avoid quoting fields in output.
    reader = csv.reader(f, skipinitialspace=True)
    writer = csv.writer(sys.stdout, quoting=csv.QUOTE_NONE, escapechar='\\')
    ## Read file line by line, remove leading and trailing white spaces and
    ## print.
    for row in reader:
        row = [field.strip() for field in row]
        writer.writerow(row)
Run it like:
python3 script.py csvfile
That yields:
DISH Hartford & New Haven (Hartford),206,FBNHD,06028,East Windsor Hill,CT,Hartford County
Answered by Nashenas
What all of the current answers seemed to miss:
$ cat bs-B2Bformat.csv | sed s/ " //g
sed: 1: "s/": unterminated substitute pattern
$ cat bs-B2Bformat.csv |sed s/ \" //g
sed: 1: "s/": unterminated substitute pattern
$
The problem in the above is missing single quotes. It should have been:
$ cat bs-B2Bformat.csv | sed 's/ " //g'
                             ^        ^
Without the single quotes, bash splits at the spaces and sends three separate arguments (well, at least for the case of \"). sed was seeing its first argument as just s/.
Edit: FYI, single quotes are not required; they just make this case easier. If you want to use double quotes, just escape the one you want to keep for matching:
$ cat bs-B2Bformat.csv | sed "s/ \" //g"
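To see the splitting described above, the arguments can be printed one per line instead of being handed to sed. This is only an illustrative sketch; show_args is a hypothetical helper name, not something from the question or from sed.

```shell
# Hypothetical helper: print each argument it receives on its own line, in brackets.
show_args() {
  printf '[%s]\n' "$@"
}

# Unquoted with an escaped double quote: the shell splits at the spaces,
# so three separate arguments arrive.
unquoted=$(show_args s/ \" //g)

# Single-quoted: the whole sed script arrives as one argument.
quoted=$(show_args 's/ " //g')

printf 'unquoted:\n%s\n' "$unquoted"
printf 'quoted:\n%s\n' "$quoted"
```

The unquoted call prints [s/], ["] and [//g] on separate lines, which is why sed complained about the script s/ on its own; the quoted call prints the single argument [s/ " //g].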