Is this C++11 regex error me or the compiler?
Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Original question: http://stackoverflow.com/questions/8060025/
Asked by Shay Guy
OK, this isn't the original program I had this problem in, but I duplicated it in a much smaller one. Very simple problem.
main.cpp:
#include <cstdio>   // printf
#include <iostream>
#include <regex>
using namespace std;

int main()
{
    regex r1("S");
    printf("S works.\n");
    regex r2(".");
    printf(". works.\n");
    regex r3(".+");
    printf(".+ works.\n");
    regex r4("[0-9]");  // this constructor throws std::regex_error on GCC 4.6.1
    printf("[0-9] works.\n");
    return 0;
}
Compiled successfully with this command, no error messages:
$ g++ -std=c++0x main.cpp
The last line of g++ -v, by the way, is:
gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3)
And the result when I try to run it:
$ ./a.out
S works.
. works.
.+ works.
terminate called after throwing an instance of 'std::regex_error'
what(): regex_error
Aborted
It happens the same way if I change r4 to \\s, \\w, or [a-z]. Is this a problem with the compiler? I might be able to believe that C++11's regex engine has different ways of saying "whitespace" or "word character," but square brackets not working is a stretch. Is it something that's been fixed in 4.6.2?
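One way to narrow down a failure like this is to catch the exception and print its error code instead of letting the program abort. The following is only a sketch: the helper name compiles is made up for illustration and is not part of the original program.

```cpp
#include <iostream>
#include <regex>
#include <string>

// Hypothetical helper (not from the question): tries to compile a pattern
// and reports the numeric error code instead of letting the exception
// terminate the program.
bool compiles(const std::string& pattern) {
    try {
        std::regex r(pattern);  // throws std::regex_error if the pattern is rejected
        return true;
    } catch (const std::regex_error& e) {
        std::cerr << "pattern \"" << pattern << "\" rejected: " << e.what()
                  << " (error code " << static_cast<int>(e.code()) << ")\n";
        return false;
    }
}
```

Calling compiles("[0-9]") for each pattern in turn shows which ones the library rejects. The numeric code corresponds to a std::regex_constants::error_type value; how informative it is depends on the standard library in use.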
EDIT:
Joachim Pileborg has supplied a partial solution, using an extra regex_constants parameter to enable a syntax that supports square brackets, but neither basic, extended, awk, nor ECMAScript seems to support backslash-escaped terms like \\s, \\w, or \\t.
EDIT 2:
Using raw strings (R"(\w)" instead of "\\w") doesn't seem to work either.
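That raw strings make no difference is expected: both spellings produce the same two characters, a backslash followed by w, so the regex engine receives identical input either way. A small check, with a function name chosen only for illustration:

```cpp
#include <string>

// Hypothetical check: the escaped literal and the raw literal contain
// exactly the same characters (a backslash followed by 'w').
bool same_pattern_text() {
    const std::string escaped = "\\w";
    const std::string raw = R"(\w)";
    return escaped == raw && raw.size() == 2;
}
```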
Accepted answer by jfs
Update: <regex> is now implemented and released in GCC 4.9.0.
Old answer:
ECMAScript syntax accepts [0-9], \s, \w, etc.; see ECMA-262 (15.10). Here's an example with boost::regex, which also uses the ECMAScript syntax by default:
#include <boost/regex.hpp>

int main(int argc, char* argv[]) {
    using namespace boost;
    regex e("[0-9]");
    // Exit status 0 if the first argument matches, 1 if not, 2 if no argument.
    return argc > 1 ? !regex_match(argv[1], e) : 2;
}
It works:
$ g++ -std=c++0x *.cc -lboost_regex && ./a.out 1
According to the C++11 standard (28.8.2), basic_regex() uses the regex_constants::ECMAScript flag by default, so it must understand this syntax.
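For comparison, here is roughly what the same patterns look like with std::regex on a standard library that implements <regex> (for example GCC 4.9.0 or later). This is a sketch; the helper names are invented for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helpers using the default (ECMAScript) grammar.

// True if s is exactly one decimal digit.
bool is_single_digit(const std::string& s) {
    static const std::regex digit("[0-9]");
    return std::regex_match(s, digit);
}

// True if s contains at least one whitespace character.
bool has_whitespace(const std::string& s) {
    static const std::regex ws("\\s");
    return std::regex_search(s, ws);
}

// True if s consists only of word characters (letters, digits, underscore).
bool is_word(const std::string& s) {
    static const std::regex word("\\w+");
    return std::regex_match(s, word);
}
```

Note the difference between regex_match, which must match the whole string, and regex_search, which looks for a match anywhere in it.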
Is this C++11 regex error me or the compiler?
gcc-4.6.1 doesn't support C++11 regular expressions (28.13).
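If code has to build with both old and new GCC releases, one possible guard is a version check on the compiler's predefined macros. The sketch below assumes that the compiler version is a good enough proxy for the libstdc++ version, which is not always true (for instance when another compiler is paired with an old libstdc++). The function name is hypothetical.

```cpp
// Hypothetical helper: true when the translation unit is compiled by a GCC
// release older than 4.9, i.e. one whose <regex> is known to be incomplete.
// This checks the compiler only, not the standard library actually linked.
constexpr bool gcc_older_than_4_9() {
#if defined(__GNUC__) && !defined(__clang__)
    return __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 9);
#else
    return false;
#endif
}
```

A program could use this to fall back to boost::regex or to print a clear diagnostic rather than aborting with an uncaught std::regex_error.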
Answered by Some programmer dude
The error is because creating a regex by default uses ECMAScript syntax for the expression, which doesn't support brackets. You should declare the expression with the basic or extended flag:
std::regex r4("[0-9]", std::regex_constants::basic);
Edit: Seems like libstdc++ (part of GCC, and the library that handles all C++ stuff) doesn't fully implement regular expressions yet. In their status document they say that the Modified ECMAScript regular expression grammar is not implemented yet.
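As a sketch of how an explicit grammar flag is passed, the helpers below compile the same bracket expression with the POSIX basic and extended grammars. The function names are made up for illustration, and the behavior described assumes a standard library with a working <regex>.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: matches a single digit using the POSIX basic grammar.
bool digit_basic(const std::string& s) {
    const std::regex r("[0-9]", std::regex_constants::basic);
    return std::regex_match(s, r);
}

// Hypothetical helper: same pattern, POSIX extended grammar.
bool digit_extended(const std::string& s) {
    const std::regex r("[0-9]", std::regex_constants::extended);
    return std::regex_match(s, r);
}
```

Keep in mind that the POSIX grammars do not define shorthand classes such as \s or \w; character classes like [[:space:]] are the portable spelling there.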
Answered by Drew Noakes
Regex support improved between gcc 4.8.2 and 4.9.2. For example, the regex =[A-Z]{3} was failing for me with:
Regex error
After upgrading to gcc 4.9.2, it works as expected.
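A minimal sketch of that pattern in use, assuming a library where it compiles (gcc 4.9.2 in this answer); the function name is invented for illustration:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if s contains '=' followed by three
// uppercase ASCII letters, using the pattern from the answer.
bool has_three_caps_after_equals(const std::string& s) {
    static const std::regex r("=[A-Z]{3}");
    return std::regex_search(s, r);
}
```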