Ruby Regex Error: incompatible encoding regexp match (ASCII-8BIT regexp with UTF-8 string)

Source: Stack Overflow, http://stackoverflow.com/questions/9857443/ (licensed under CC BY-SA 4.0; reuse requires attribution to the original authors)

Tags: ruby-on-rails, ruby, unicode, encoding, character-encoding

Asked by Shpigford

I'm getting two related errors, both involving encoding.

The first error (technically, a warning) I get when starting up WEBrick:

/Users/USERNAME/example/config/initializers/bb-ruby.rb:54: warning: invalid Unicode Property \P: /\:\-?\P/

The line it's referring to is: /\:\-?\P/,

It's just a bit of regex, ultimately part of this block:

@@tags['Razzing'] = [
  /\:\-?\P/,
  '<img src="/assets/emoticons/razzing.png">',
  'Razzing',
  ':P',
  :razzing]

Then, I also get the following error when parsing some strings (presumably due to this same line)...

Encoding::CompatibilityError
incompatible encoding regexp match (ASCII-8BIT regexp with UTF-8 string)

I'm running Ruby 1.9.2 and Rails 3.2.1.

Answered by Fábio Batista

Your Regex is being "compiled" as ASCII-8BIT.

Just add the encoding declaration at the top of the file where the Regex is declared:

# encoding: utf-8

And you're done. Now, when Ruby parses your code, it will assume every literal you use (Regex, String, etc.) is written in UTF-8.

UPDATE: UTF-8 is now the default source encoding in Ruby 2.0 and later.
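
As a rough illustration of the answer above, the sketch below puts the magic comment at the top of a file and inspects how Ruby tags a regexp literal. The constant RAZZING and the helper describe_regexp are hypothetical names, not part of bb-ruby, and the pattern is written without backslashes on the assumption that a literal "P" was intended.

```ruby
# encoding: utf-8
# The magic comment must be the first line of the file (or the second, after
# a shebang). On Ruby 2.0+ it is redundant, since UTF-8 is already the default.

# Hypothetical constant: the emoticon pattern without backslashes, assuming a
# literal "P" was intended (\P normally introduces a Unicode property class).
RAZZING = /:-?P/

# Hypothetical helper: reports how Ruby tagged a regexp.
def describe_regexp(regexp)
  { encoding: regexp.encoding, fixed: regexp.fixed_encoding? }
end

# __ENCODING__ is the encoding Ruby used to parse this source file.
puts "source encoding: #{__ENCODING__}"

# An ASCII-only regexp literal is tagged US-ASCII and is not pinned to one
# encoding, so it can be matched against a UTF-8 string.
puts describe_regexp(RAZZING).inspect

# "caf\u00e9 :P" is "café :P"; =~ returns the character index of the match.
puts "caf\u00e9 :P" =~ RAZZING
```

Checking Regexp#encoding and Regexp#fixed_encoding? like this is a quick way to see whether a given literal could conflict with the strings it is matched against.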

Answered by Nil Liu

From the Ruby 2.0 documentation:

/pattern/u - the u modifier marks the regexp as UTF-8
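
A minimal sketch of the u modifier in practice, assuming the same emoticon pattern with a literal "P". RAZZING_UTF8 and try_match are hypothetical names used only for illustration. The second call shows the same kind of CompatibilityError as in the question, with the roles swapped: a UTF-8 regexp against a binary string.

```ruby
# The u modifier pins the regexp to UTF-8, even though its source is ASCII-only.
RAZZING_UTF8 = /:-?P/u

# Hypothetical helper: returns the match position, or the error class name if
# Ruby refuses to match because the encodings are incompatible.
def try_match(regexp, string)
  regexp =~ string
rescue Encoding::CompatibilityError => e
  e.class.name
end

utf8_text   = "caf\u00e9 :P"  # "café :P", tagged UTF-8
binary_text = "\xFF :P".b     # raw bytes, tagged ASCII-8BIT (binary)

puts RAZZING_UTF8.encoding         # UTF-8
puts RAZZING_UTF8.fixed_encoding?  # true

# Same encoding on both sides: the match succeeds.
puts try_match(RAZZING_UTF8, utf8_text).inspect

# A pinned regexp rejects a string in another encoding that contains
# non-ASCII bytes, so the helper returns the error class name instead.
puts try_match(RAZZING_UTF8, binary_text).inspect
```

Pinning the encoding with /u is a per-regexp alternative to the file-wide magic comment; it makes the regexp's encoding explicit, at the cost of rejecting non-UTF-8 input that contains non-ASCII bytes.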