Ruby on Rails: How can I delete special characters?

Note: this page is based on a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow, original URL: http://stackoverflow.com/questions/737475/

Date: 2020-09-02 21:07:04  Source: igfitidea

How can I delete special characters?

ruby-on-rails, ruby, regex

Asked by Yud

I'm practicing with Ruby and regex to delete certain unwanted characters. For example:

input = input.gsub(/<\/?[^>]*>/, '')

And for special characters, for example ☻ or ™:

input = input.gsub('&#', '')

This leaves only the numbers, which is fine. But it only works if the user enters the special character as a character code, like this:

&#153;

My question: How can I delete special characters if the user enters a special character without a code, like this:

☻ ™

Answered by Can Berk Güder

First of all, I think it might be easier to define what constitutes "correct input" and remove everything else. For example:

input = input.gsub(/[^0-9A-Za-z]/, '')

If that's not what you want (for example, you want to support non-Latin alphabets), then I think you should make a list of the glyphs you want to remove (like ☻ or ™) and remove them one by one, since it's hard to distinguish programmatically between a Chinese, Arabic, etc. character and a pictograph.
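
The list-based approach described above can be sketched as follows. This is only a minimal illustration, assuming UTF-8 input; the constant UNWANTED_GLYPHS and both helper names are made up for this example. The second helper is an alternative that is not part of the original answer: in Ruby 1.9 and later, Unicode property classes such as \p{L} (letters) and \p{N} (numbers) can keep text from any script while dropping symbols.

```ruby
# Hypothetical list of glyphs to remove; extend it as needed.
UNWANTED_GLYPHS = '☻™'

# Remove every character that appears in the list above.
def remove_listed_glyphs(text)
  text.delete(UNWANTED_GLYPHS)
end

# Alternative (an assumption, not from the answer): keep letters, digits
# and whitespace from any script, and drop everything else.
def keep_letters_and_digits(text)
  text.gsub(/[^\p{L}\p{N}\s]/, '')
end

puts remove_listed_glyphs('Ruby™ ☻ rocks')
puts keep_letters_and_digits('日本語™ ☻ abc 123')
```

Note that String#delete treats "-", "^" and "\" specially inside its argument, so a list containing those characters would need escaping.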

Finally, you might want to normalize your input first by converting HTML escape sequences (such as &#153;) to or from literal characters, so that both forms are handled the same way.
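
One possible way to do that normalization is sketched below. It is an assumption-laden example rather than the answerer's method: the helper names are invented, only decimal references of the form &#NNNN; are handled, and code points are not validated.

```ruby
# Hypothetical helper: turn decimal numeric character references such as
# "&#8482;" into the characters they stand for. No validation is done.
def decode_numeric_entities(text)
  text.gsub(/&#(\d+);/) { [Regexp.last_match(1).to_i].pack('U') }
end

# Normalize first, then strip, so "&#8482;" and a literal "™" are
# treated the same way. Here "strip" means: keep ASCII only.
def strip_non_ascii(text)
  decode_numeric_entities(text).gsub(/[^[:ascii:]]/, '')
end

puts strip_non_ascii('Ruby&#8482; and Ruby™')
```

Decoding before filtering means the filter only has to deal with one representation of each character.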

Answered by Matthew Schinckel

If you just want ASCII characters, you can use:

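original = "aøbauhrhræoeuacå"
cleaned = ""
# Keep only bytes in the ASCII range (0-127); every byte of a multibyte UTF-8 character is above 127.
original.each_byte { |x| cleaned << x unless x > 127 }
cleaned   # => "abauhrhroeuac"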
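
The same ASCII-only filter can also be written with a character class or per character. The sketch below assumes Ruby 1.9 or later and UTF-8 input; the method names are illustrative and do not come from the answer.

```ruby
# Remove every character outside the ASCII range with one substitution.
def ascii_only_gsub(text)
  text.gsub(/[^[:ascii:]]/, '')
end

# Equivalent per-character version using String#ascii_only?.
def ascii_only_select(text)
  text.each_char.select(&:ascii_only?).join
end

puts ascii_only_gsub('aøbauhrhræoeuacå')
puts ascii_only_select('aøbauhrhræoeuacå')
```

Both versions work on characters rather than bytes, which avoids building the result one byte at a time.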

Answered by sts

You can use parameterize (provided by ActiveSupport in Rails):

'@!#$%^&*()111'.parameterize
 => "111" 
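
Keep in mind that parameterize is meant for building URL slugs, so it does more than delete special characters: it also lowercases the text and joins the remaining parts with hyphens. The plain-Ruby sketch below is only a rough approximation written for illustration. It is not the ActiveSupport implementation, it skips transliteration of accented letters, and the name rough_parameterize is made up.

```ruby
# Rough approximation of slug-style cleanup (NOT ActiveSupport's code).
def rough_parameterize(text)
  text.downcase
      .gsub(/[^a-z0-9_-]+/, '-')   # runs of other characters become "-"
      .gsub(/-{2,}/, '-')          # collapse repeated separators
      .gsub(/\A-+|-+\z/, '')       # trim separators at both ends
end

puts rough_parameterize('@!#$%^&*()111')
puts rough_parameterize('Hello, World!')
```

If the original casing or spacing matters, one of the gsub-based answers is probably a better fit.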

Answered by Magnar

You can match all the characters you want, and then join them together, like this:

original = "aøbæcå"
stripped = original.scan(/[a-zA-Z]/).join
puts stripped

This outputs "abc".

Answered by Marco

An easier way to do this, inspired by Can Berk Güder's answer, is:

To delete special characters:

input = input.gsub(/\W/, '')

To keep word characters:

input = input.scan(/\w/).join

Either way, the resulting input is the same. Try it on http://rubular.com/
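
A short check of that claim, as a sketch with made-up helper names and a made-up sample string. In Ruby 1.9 and later, \w matches only [a-zA-Z0-9_], so underscores survive and non-ASCII letters are removed along with the symbols.

```ruby
# Illustrative helpers; the names are not from the original answer.
def delete_non_word(text)
  text.gsub(/\W/, '')     # remove everything that is not [a-zA-Z0-9_]
end

def keep_word(text)
  text.scan(/\w/).join    # collect the word characters, then join them
end

sample = 'Ruby™ is_fun ☻ 2009!'
puts delete_non_word(sample)   # prints "Rubyis_fun2009"
puts keep_word(sample)         # prints "Rubyis_fun2009"
```

The scan result has to be joined, because scan on its own returns an array of one-character strings rather than a string.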