Original question: http://stackoverflow.com/questions/20007288/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Removing non-alphanumeric characters with sed
Asked by gorideyourbike
I am trying to validate some inputs by removing a set of characters. Only alphanumeric characters plus period, underscore and hyphen are allowed. I've tested the regular expression [^\w.-] at http://gskinner.com/RegExr/ and it matches what I want removed, so I'm not sure why sed is returning the opposite. What am I missing?
My end goal is to input "?10.41.89.50 " and get "10.41.89.50".
I've tried:
echo "?10.41.89.50 " | sed s/[^\w.-]//g
returns ?...
echo "?10.41.89.50 " | sed s/[\w.-]//g
and echo "?10.41.89.50 " | sed s/[\w^.-]//g
returns ?10418950
I attempted the answer found in "Skip/remove non-ascii character with sed", but nothing was removed.
Answered by iruvar
Answered by gniourf_gniourf
You might want to use the [:alpha:] class instead:
echo "?10.41.89.50 " | sed "s/[[:alpha:].-]//g"
should work. If not, you might need to change your locale settings.
On the other hand, if you only want to keep the digits, the hyphens and the periods:
echo "?10.41.89.50 " | sed "s/[^[:digit:].-]//g"
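For the allowed set described in the question (alphanumerics plus period, underscore and hyphen), the same idea can be sketched as below. Inside a POSIX bracket expression a backslash is normally literal, so sed reads [^\w.-] as "anything except a backslash, a w, a period or a hyphen" rather than as a word-character class, and leaving the script unquoted lets the shell strip the backslash before sed sees it. The function name sanitize and the second sample input are assumptions made for illustration; LC_ALL=C is set so that [:alnum:] covers ASCII letters and digits only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only ASCII alphanumerics, period, underscore, hyphen.
sanitize() {
    # Single quotes pass the script to sed unchanged.
    # The hyphen comes last in the bracket expression so it is taken literally.
    printf '%s\n' "$1" | LC_ALL=C sed 's/[^[:alnum:]_.-]//g'
}

sanitize '?10.41.89.50 '     # prints 10.41.89.50
sanitize 'user name@host_1'  # prints usernamehost_1
```

Dropping LC_ALL=C lets [:alnum:] follow the current locale, which may be what you want if accented letters should be kept.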
If your string is in a variable, you can use pure bash and parameter expansions for that:
$ dirty="?10.41.89.50 "
$ clean=${dirty//[^[:digit:].-]/}
$ echo "$clean"
10.41.89.50
or
$ dirty="?10.41.89.50 "
$ clean=${dirty//[[:alpha:]]/}
$ echo "$clean"
10.41.89.50
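The parameter expansion above can also be wrapped in a small function that covers the full allowed set from the question. This is only a sketch: the name sanitize_var is hypothetical, the second sample input is borrowed for illustration, and [:alnum:] follows whatever locale bash is running in.

```shell
#!/usr/bin/env bash
# Hypothetical helper: strip every character outside the allowed set
# using only bash parameter expansion (no external process is started).
sanitize_var() {
    local dirty=$1
    # ${var//pattern/} deletes every match of the glob pattern;
    # a leading ^ inside the brackets negates the set.
    printf '%s\n' "${dirty//[^[:alnum:]._-]/}"
}

sanitize_var '?10.41.89.50 '   # prints 10.41.89.50
sanitize_var 'a b-1_2'         # prints ab-1_2
```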
You can also have a look at 1_CR's answer.
Answered by anubhava
Well, sed won't support Unicode characters. Use perl instead:
> s="?10.41.89.50 "
> perl -pe 's/[^\w.-]+//g' <<< "$s"
10.41.89.50
Answered by panticz
To remove all characters except alphanumerics and "-", use this code:
echo "a b-1_2" | sed "s/[^[:alnum:]-]//g"
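As a quick check of what this does: the space and the underscore both fall outside the set, so both are deleted. The second command is a variant that is not part of the answer; it is an assumption about how the set might be extended to also keep "_" and ".".

```shell
# The command from the answer, with the script in single quotes.
echo "a b-1_2" | sed 's/[^[:alnum:]-]//g'    # prints ab-12

# Variant (an assumption, not from the answer): also keep "_" and ".".
echo "a b-1_2" | sed 's/[^[:alnum:]_.-]//g'  # prints ab-1_2
```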
Answered by technerdius
[[:alnum:]_.@]
This worked just fine for me. It preserved all of the characters I specified for my purposes.
Answered by Iwan Plays
Based on anubhava's answer, this one worked for me:
s/^[[:alnum:]]//g
This replaced anything other than alphanumeric characters with a single space.
Note: "." characters are preserved.
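As written, the caret in s/^[[:alnum:]]//g sits outside the bracket expression, so it anchors the match to the start of the line instead of negating the class, and the replacement is empty. A sketch that matches the description (replace everything except alphanumerics and "." with a single space) might look like the following. The exact character set is an assumption inferred from the note about ".", not something the answer states.

```shell
# Assumed reading of the answer: the caret inside the brackets negates the set,
# "." is listed so that it is preserved, and the replacement is one space.
echo "?10.41.89.50 " | sed 's/[^[:alnum:].]/ /g'   # prints " 10.41.89.50 "

# For comparison, the expression as posted only drops a leading alphanumeric.
echo "abc.def" | sed 's/^[[:alnum:]]//g'           # prints bc.def
```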