cut command in bash terminating on quotation marks
Original source: http://stackoverflow.com/questions/14669760/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by MZimmerman6
I am trying to read a file in which each line holds an email address followed by a nickname. I want to extract the nickname, which is surrounded by parentheses, as shown below:
[email protected] (Tom)
My thought was to use cut to get at the word Tom, but this is foiled when I end up with something like the following:
[email protected] ("Bob")
Because Bob has quotes around it, the cut command fails as follows:
cut: <file>: Illegal byte sequence
cut: <file>: Illegal byte sequence
Does anyone know of a better way of doing this, or a way to solve this problem?
Accepted answer by Floris
I think that
grep -o '(.*)' emailFile
should do it. "Go through all lines in the file. Look for a sequence that starts with open parens, then any characters until close parens. Echo the bit that matches the string to stdout."
This preserves the quotes around the nickname... as well as the brackets. If you don't want those, you can strip them:
grep -o '(.*)' emailFile | sed 's/[(")]//g'
("replace any of the characters between square brackets with nothing, everywhere")
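For anyone who wants to try this end to end, here is a minimal sketch of the same grep and sed pipeline run against a throwaway file. The two addresses and the temporary file are made-up placeholders, and it assumes the quotes are plain ASCII double quotes and that your grep supports -o (GNU and BSD grep both do).

```shell
# Hypothetical sample data: the addresses and the temp file are placeholders.
emailFile=$(mktemp)
printf '%s\n' 'tom@example.com (Tom)' 'bob@example.com ("Bob")' > "$emailFile"

# In a basic regular expression, unescaped parentheses are literal, so
# '(.*)' matches from the first "(" to the last ")" on each line.
# sed then deletes every "(", '"' and ")" from the matched text.
nicknames=$(grep -o '(.*)' "$emailFile" | sed 's/[(")]//g')
printf '%s\n' "$nicknames"

rm -f "$emailFile"
```

Note that .* is greedy, so a line containing more than one parenthesized group would come back as a single match.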
Answer by kallos
Reset your locale to C (raw uninterpreted byte sequence) to avoid Illegal byte sequence errors.
locale charmap
LC_ALL=C cut ... | LC_ALL=C sort ...
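As a rough illustration of the locale approach, the sketch below builds a file whose quotes are stored as single bytes that are not valid UTF-8, then cuts out the nickname with LC_ALL=C. The sample line, the byte values, and the cut delimiters are all assumptions made for the example; the question does not show the actual file contents or the cut invocation.

```shell
# Hypothetical input: \223 and \224 are octal escapes for bytes 0x93 and 0x94,
# which are not valid UTF-8 on their own. The address is a placeholder.
sample=$(mktemp)
printf 'bob@example.com (\223Bob\224)\n' > "$sample"

# Show which character encoding the current locale expects.
locale charmap

# With LC_ALL=C, cut treats the input as raw bytes, so it does not try to
# decode multibyte characters. The delimiters here are illustrative only.
nickname=$(LC_ALL=C cut -d '(' -f 2 "$sample" | LC_ALL=C cut -d ')' -f 1)

# Optionally drop everything that is not an ASCII letter (also byte-wise).
letters=$(printf '%s' "$nickname" | LC_ALL=C tr -cd 'A-Za-z')
printf '%s\n' "$letters"

rm -f "$sample"
```

Setting LC_ALL=C per command, rather than exporting it, keeps the rest of the shell session in its normal locale.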

