Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/8060025/

Date: 2020-08-28 17:56:38  Source: igfitidea

Is this C++11 regex error me or the compiler?

Tags: c++, regex, gcc, c++11

Asked by Shay Guy

OK, this isn't the original program I had this problem in, but I duplicated it in a much smaller one. Very simple problem.

main.cpp:

#include <cstdio>   // printf
#include <regex>
using namespace std;

int main()
{
    regex r1("S");
    printf("S works.\n");
    regex r2(".");
    printf(". works.\n");
    regex r3(".+");
    printf(".+ works.\n");
    regex r4("[0-9]");
    printf("[0-9] works.\n");
    return 0;
}

Compiled successfully with this command, no error messages:

$ g++ -std=c++0x main.cpp

The last line of g++ -v, by the way, is:

gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3)

And the result when I try to run it:

$ ./a.out 
S works.
. works.
.+ works.
terminate called after throwing an instance of 'std::regex_error'
  what():  regex_error
Aborted

The same thing happens if I change r4 to \\s, \\w, or [a-z]. Is this a problem with the compiler? I might be able to believe that C++11's regex engine has different ways of saying "whitespace" or "word character," but square brackets not working is a stretch. Is it something that's been fixed in 4.6.2?

EDIT:

Joachim Pileborg has supplied a partial solution, using an extra regex_constants parameter to enable a syntax that supports square brackets, but neither basic, extended, awk, nor ECMAScript seems to support backslash-escaped terms like \\s, \\w, or \\t.

EDIT 2:

Using raw strings (R"(\w)" instead of "\\w") doesn't seem to work either.
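
For reference, the two spellings in EDIT 2 denote the same two-character pattern, so switching between them would not be expected to change what the regex engine sees. A minimal sketch of that equivalence follows; the function name check_raw_string_equivalence is made up for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical check: the escaped literal and the raw literal hold the
// same characters, a backslash followed by 'w'.
void check_raw_string_equivalence() {
    const std::string escaped = "\\w";   // backslash escaped for the compiler
    const std::string raw = R"(\w)";     // raw string, no escaping needed
    assert(raw.size() == 2);
    assert(escaped == raw);
}
```

Calling check_raw_string_equivalence() from main() is enough to run it.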

Accepted answer by jfs

Update: <regex> is now implemented and released in GCC 4.9.0
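
As a rough way to check a given toolchain, the sketch below tries to compile the patterns from the question and reports whether construction throws. It assumes a standard library with a complete <regex> (for libstdc++, per the update above, 4.9.0 or later); the helper names compiles and check_question_patterns are hypothetical, not part of any library.

```cpp
#include <cassert>
#include <initializer_list>
#include <regex>
#include <string>

// Hypothetical helper: true if `pattern` compiles under the default
// (ECMAScript) grammar, false if the constructor throws std::regex_error.
bool compiles(const std::string& pattern) {
    try {
        std::regex re(pattern);
        return true;
    } catch (const std::regex_error&) {
        return false;  // the exception the question's program died on
    }
}

// Every pattern mentioned in the question should compile on a complete
// implementation; a malformed pattern should still be rejected.
void check_question_patterns() {
    for (const char* p : {"S", ".", ".+", "[0-9]", "\\s", "\\w", "[a-z]"}) {
        assert(compiles(p));
    }
    assert(!compiles("[0-9"));  // unterminated bracket expression
}
```

Catching std::regex_error instead of letting it escape main() also avoids the bare "terminate called" abort shown in the question.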

Old answer:

ECMAScript syntax accepts [0-9], \s, \w, etc.; see ECMA-262 (15.10). Here's an example with boost::regex, which also uses the ECMAScript syntax by default:

#include <boost/regex.hpp>

int main(int argc, char* argv[]) {
  using namespace boost;
  regex e("[0-9]");
  return argc > 1 ? !regex_match(argv[1], e) : 2;
}

It works:

$ g++ -std=c++0x *.cc -lboost_regex && ./a.out 1

According to the C++11 standard (28.8.2), basic_regex() uses the regex_constants::ECMAScript flag by default, so it must understand this syntax.
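
The default-flag behaviour can be sketched with std::regex directly, assuming a standard library whose <regex> is complete. The helper names full_match_default and check_default_grammar are invented for this example.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: compile `pattern` with the default flags and test
// whether it matches the whole of `input`.
bool full_match_default(const std::string& pattern, const std::string& input) {
    const std::regex re(pattern);  // no flag argument: defaults to ECMAScript
    return std::regex_match(input, re);
}

void check_default_grammar() {
    const std::regex by_default("[0-9]");
    const std::regex explicit_ecma("[0-9]", std::regex_constants::ECMAScript);
    // Omitting the flag and passing ECMAScript explicitly give the same flags.
    assert(by_default.flags() == explicit_ecma.flags());
    // Bracket expressions and backslash classes are part of that grammar.
    assert(full_match_default("[0-9]", "7"));
    assert(full_match_default("\\s", " "));
    assert(full_match_default("\\w", "_"));
}
```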

Is this C++11 regex error me or the compiler?

gcc-4.6.1 doesn't support c++11 regular expressions (28.13).

Answer by Some programmer dude

The error is because creating a regex by default uses ECMAScript syntax for the expression, which doesn't support brackets. You should declare the expression with the basic or extended flag:

std::regex r4("[0-9]", std::regex_constants::basic);

Edit: Seems like libstdc++ (part of GCC, and the library that handles all C++ stuff) doesn't fully implement regular expressions yet. In their status document they say that the Modified ECMAScript regular expression grammar is not implemented yet.
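
One way to see which grammars a particular library build accepts is to probe each flag and catch the exception. This is only a sketch: accepts and check_grammars are hypothetical names, and the assertions describe what a complete <regex> implementation is expected to do, not what GCC 4.6.1 did.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical probe: true if `pattern` compiles under the given grammar flag.
bool accepts(const std::string& pattern,
             std::regex_constants::syntax_option_type grammar) {
    try {
        std::regex re(pattern, grammar);
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}

void check_grammars() {
    namespace rc = std::regex_constants;
    // A bracket expression is valid in each of these grammars.
    assert(accepts("[0-9]", rc::basic));
    assert(accepts("[0-9]", rc::extended));
    assert(accepts("[0-9]", rc::awk));
    assert(accepts("[0-9]", rc::ECMAScript));
    // With the basic flag, the pattern from the answer matches a single digit.
    assert(std::regex_match("7", std::regex("[0-9]", rc::basic)));
}
```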

Answer by Drew Noakes

Regex support improved between gcc 4.8.2 and 4.9.2. For example, the regex =[A-Z]{3} was failing for me with:

Regex error

After upgrading to gcc 4.9.2, it works as expected.
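
A small sketch of that pattern in use, assuming gcc 4.9.2 or another library with working interval quantifiers. The function names and the sample strings are made up for illustration and are not from the original answer.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if `text` contains '=' followed by three
// uppercase ASCII letters.
bool has_three_letter_value(const std::string& text) {
    static const std::regex re("=[A-Z]{3}");
    return std::regex_search(text, re);
}

void check_interval_quantifier() {
    assert(has_three_letter_value("CODE=ABC"));
    assert(!has_three_letter_value("CODE=AB1"));  // digit breaks the run of letters
    assert(!has_three_letter_value("ABC"));       // no '=' present
}
```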
