Disclaimer: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must likewise follow the CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/10073332/

Date: 2020-09-06 04:59:28  Source: igfitidea

Stripping non-alphanumeric chars but leaving spaces in Ruby

Tags: ruby, string

Asked by RailsTweeter

Trying to change this:

"The basketball-player is great! (Kobe Bryant)"

into this:

"the basketball player is great kobe bryant"

Want to downcase and remove all punctuation but leave spaces...

Tried string.downcase.gsub(/[^a-z ]/, '') but it removes the spaces

Answered by gmalette

You can simply add \s (whitespace) to the character class:

string.downcase.gsub(/[^a-z0-9\s]/i, '')
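
As a quick check, here is a minimal sketch that applies the pattern above to the string from the question. The variable names string and cleaned are only for illustration. Because the string is downcased first, the /i flag is probably not needed here.

```ruby
string = "The basketball-player is great! (Kobe Bryant)"

# Keep letters, digits and whitespace; delete every other character.
cleaned = string.downcase.gsub(/[^a-z0-9\s]/i, '')

puts cleaned
# prints: the basketballplayer is great kobe bryant
```

Note that the hyphen is deleted rather than turned into a space, so "basketball-player" comes out as one word.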

Answered by fl00r

If you want to catch non-latin characters, too:

str = "The basketball-player is great! (Kobe Bryant) (ひらがな)"
str.downcase.gsub(/[^[:word:]\s]/, '')
#=> "the basketballplayer is great kobe bryant ひらがな"
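
To see why the [[:word:]] class matters here, the following small sketch compares it with a plain a-z class on a shortened version of the string. The variable names are made up for the comparison.

```ruby
str = "Kobe Bryant (ひらがな)"

# An ASCII-only class deletes the hiragana along with the punctuation.
ascii_only = str.downcase.gsub(/[^a-z0-9\s]/, '')

# [[:word:]] is Unicode-aware in Ruby, so the hiragana survives.
unicode_aware = str.downcase.gsub(/[^[:word:]\s]/, '')

p ascii_only     # => "kobe bryant "
p unicode_aware  # => "kobe bryant ひらがな"
```

The ASCII-only result keeps the space that preceded the parenthesis, which is why it ends with a trailing space.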

Answered by pguardiario

Some fine solutions, but simplest is usually best:

string.downcase.gsub(/\W+/, ' ')
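
One thing worth checking with this approach: punctuation at the very end of the string is also replaced by a space. The sketch below runs the pattern on the question's string and then calls strip, which is an addition of mine and not part of the answer. The variable names are illustrative.

```ruby
string = "The basketball-player is great! (Kobe Bryant)"

# Each run of non-word characters (spaces and punctuation) becomes one space.
spaced = string.downcase.gsub(/\W+/, ' ')

# The closing parenthesis turned into a trailing space; strip removes it.
cleaned = spaced.strip

p spaced   # => "the basketball player is great kobe bryant "
p cleaned  # => "the basketball player is great kobe bryant"
```

Also keep in mind that \w and \W only consider ASCII word characters in Ruby, so this version would drop non-Latin letters.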

Answered by DrewB

All the other answers strip out numbers as well. That works for the example given, but it doesn't really answer the question, which is how to strip out non-alphanumeric characters.

string.downcase.gsub(/[^\w\s]/, '')

Note that this will not strip out underscores, because \w matches them. If you need to remove those as well:

string.downcase.gsub(/[^a-zA-Z\s\d]/, '')
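
A short sketch of the difference between the two patterns, using a made-up sample string that contains digits and an underscore:

```ruby
sample = "Room_42: Kobe Bryant #24!"

# \w keeps letters, digits and the underscore.
with_underscores = sample.downcase.gsub(/[^\w\s]/, '')

# An explicit class keeps letters and digits only, so the underscore goes too.
without_underscores = sample.downcase.gsub(/[^a-zA-Z\s\d]/, '')

p with_underscores     # => "room_42 kobe bryant 24"
p without_underscores  # => "room42 kobe bryant 24"
```

Both versions keep the digits; they differ only in how they treat the underscore.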

Answered by Ivaylo Strandjev

a.downcase.gsub(/[^a-z ]/, "")

Note the space I have added after a-z. If you want to keep all whitespace characters (not only the space), use \s as proposed by gmalette.

Answered by Renaud Humbert-Labeaumaz

All the previous answers make basketball-player into basketballplayer or remove numbers entirely, which is not exactly what is required.

The following code does exactly what you asked:

text.downcase
    .gsub(/[^[:word:]\s]/, ' ') # Replace each non-word char with a space
    .squeeze(' ')               # Collapse runs of spaces into a single space
    .strip                      # Drop leading and trailing spaces

Hope this helps someone!
