Ruby method to remove accents from UTF-8 international characters
Disclaimer: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/15686752/
Asked by Gus Shortz
I am trying to create a 'normalized' copy of a string, to help reduce duplicate names in a database. The names contain many international characters (i.e., accented letters), and I want to create a copy with the accents removed.
I did come across the method below, but cannot get it to work. I can't seem to find what the Unicode Hacks plugin is.
# Utility method that returns an ASCIIfied, downcased, and sanitized string.
# It relies on the Unicode Hacks plugin by means of String#chars. We assume
# $KCODE is 'u' in environment.rb. By now we support a wide range of Latin
# accented letters, based on the Unicode Character Palette bundled in Macs.
def self.normalize(str)
n = str.chars.downcase.strip.to_s
n.gsub!(/[àáâãäåāă]/u, 'a')
n.gsub!(/æ/u, 'ae')
n.gsub!(/[ďđ]/u, 'd')
n.gsub!(/[çćčĉċ]/u, 'c')
n.gsub!(/[èéêëēęěĕė]/u, 'e')
n.gsub!(/ƒ/u, 'f')
n.gsub!(/[ĝğġģ]/u, 'g')
n.gsub!(/[ĥħ]/, 'h')
n.gsub!(/[ììíîïīĩĭ]/u, 'i')
n.gsub!(/[įıĳĵ]/u, 'j')
n.gsub!(/[ķĸ]/u, 'k')
n.gsub!(/[łľĺļŀ]/u, 'l')
n.gsub!(/[ñńňņŉŋ]/u, 'n')
n.gsub!(/[òóôõöøōőŏŏ]/u, 'o')
n.gsub!(/œ/u, 'oe')
n.gsub!(/ą/u, 'q')
n.gsub!(/[ŕřŗ]/u, 'r')
n.gsub!(/[śšşŝș]/u, 's')
n.gsub!(/[ťţŧț]/u, 't')
n.gsub!(/[ùúûüūůűŭũų]/u,'u')
n.gsub!(/ŵ/u, 'w')
n.gsub!(/[ýÿŷ]/u, 'y')
n.gsub!(/[žżź]/u, 'z')
n.gsub!(/\s+/, ' ')
n.gsub!(/[^\sa-z0-9_-]/, '')
n
end
Do I need to 'require' a particular library/gem? Or maybe someone could recommend another way to go about this.
I am not using Rails, nor do I plan on doing so.
Answered by user2398029
I generally use I18n to handle this:
1.9.3p392 :001 > require "i18n"
=> true
1.9.3p392 :002 > I18n.transliterate("Hé les mecs!")
=> "He les mecs!"
Answered by Gus Shortz
So far the following is the only way I've been able to accomplish what I need:
str.tr(
"ÀÁÂÃÄÅàáâãäåĀāĂăĄąÇçĆćĈĉĊċČčÐðĎďĐđÈÉÊËèéêëĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥĦħÌÍÎÏìíîïĨĩĪīĬĭĮįİıĴĵĶķĸĹĺĻļĽľĿŀŁłÑñŃńŅņŇňŉŊŋÒÓÔÕÖØòóôõöøŌōŎŏŐőŔŕŖŗŘřŚśŜŝŞşŠšſŢţŤťŦŧÙÚÛÜùúûüŨũŪūŬŭŮůŰűŲųŴŵÝýÿŶŷŸŹźŻżŽž",
"AAAAAAaaaaaaAaAaAaCcCcCcCcCcDdDdDdEEEEeeeeEeEeEeEeEeGgGgGgGgHhHhIIIIiiiiIiIiIiIiIiJjKkkLlLlLlLlLlNnNnNnNnnNnOOOOOOooooooOoOoOoRrRrRrSsSsSsSssTtTtTtUUUUuuuuUuUuUuUuUuUuWwYyyYyYZzZzZz")
But using this feels very 'hackish', and I would love to find a better way.
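One possible alternative to a hand-maintained character list, not proposed in the thread itself, is to lean on Unicode normalization. The sketch below assumes Ruby 2.2 or later (where String#unicode_normalize is available) and UTF-8 input. The names strip_accents, FALLBACKS, and FALLBACK_PATTERN are hypothetical, and the fallback table is illustrative rather than complete:

```ruby
# Hypothetical helper, not from the original thread.
# Letters that have no canonical decomposition need an explicit mapping.
# This table is an illustrative assumption, not an exhaustive list.
FALLBACKS = {
  "æ" => "ae", "Æ" => "AE", "œ" => "oe", "Œ" => "OE",
  "ø" => "o",  "Ø" => "O",  "ł" => "l",  "Ł" => "L",
  "đ" => "d",  "Đ" => "D",  "ß" => "ss"
}.freeze

FALLBACK_PATTERN = Regexp.union(FALLBACKS.keys)

def strip_accents(str)
  str.unicode_normalize(:nfd)            # split "é" into "e" + combining accent
     .gsub(/\p{Mn}/, "")                 # drop the nonspacing combining marks
     .gsub(FALLBACK_PATTERN, FALLBACKS)  # map letters that NFD cannot split
end

puts strip_accents("Hé les mecs!")       # He les mecs!
puts strip_accents("Françoise Isaïe")    # Francoise Isaie
```

Unlike the tr approach, this can map one character to several (for example "æ" to "ae"), but any letter missing from both the decomposition rules and the fallback table passes through unchanged.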
Answered by AlexGuti
The parameterize method could be a nice and simple way to remove special characters so that the string can be used as a human-readable identifier:
> "Françoise Isaïe".parameterize
=> "francoise-isaie"
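Since parameterize comes from ActiveSupport, it is not available in a plain Ruby script. As a rough sketch for that situation, the hypothetical simple_parameterize below approximates the same idea with the standard library only, assuming Ruby 2.2 or later. It is not ActiveSupport's implementation and differs in details: for instance, it drops any character that does not decompose to an ASCII letter or digit.

```ruby
# Hypothetical stand-in for a parameterize-style helper, plain Ruby only.
def simple_parameterize(str, separator: "-")
  # Decompose accented letters, drop the combining marks, then lowercase.
  ascii = str.unicode_normalize(:nfd).gsub(/\p{Mn}/, "").downcase
  # Keep only runs of ASCII letters and digits, joined by the separator.
  ascii.scan(/[a-z0-9]+/).join(separator)
end

puts simple_parameterize("Françoise Isaïe")   # francoise-isaie
```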
Answered by Naved Khan
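If you are using Rails:
my_string = "L'Oréal"
# On Rails 5 and later the separator is passed as a keyword argument;
# older versions took it as a plain positional argument instead.
my_string.parameterize(separator: ' ')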

