C++: Conditionally replace regex matches in a string

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/11508798/

Time: 2020-08-27 15:12:34  Source: igfitidea

Conditionally replace regex matches in string

Tags: c++, regex, visual-studio-2010, visual-c++, c++11

Asked by pstrjds

I am trying to replace certain patterns in a string with different replacement patterns.

Example:

string test = "test replacing \"these characters\"";

What I want to do is replace every ' ' with '_' and every other character that is not a letter or digit with an empty string. I have created the following regex and it seems to tokenize correctly, but I am not sure how (or whether it is possible) to perform a conditional replace using regex_replace.

string test = "test replacing \"these characters\"";
regex reg("(\\s+)|(\\W+)");

The expected result after the replacement would be:

string result = "test_replacing_these_characters";
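
To make "tokenizes correctly" concrete, here is a minimal sketch that records which alternative of the pattern matched at each position. The helper name matched_groups is hypothetical and used only for illustration; it relies on std::sregex_iterator and the matched flag of each capture group.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (illustration only): for every match of
// "(\s+)|(\W+)" in the input, record which alternative matched.
// 1 = whitespace run (capture group 1), 2 = other non-word run (group 2).
std::vector<int> matched_groups(const std::string& input) {
    static const std::regex pattern("(\\s+)|(\\W+)");
    std::vector<int> groups;
    for (std::sregex_iterator it(input.begin(), input.end(), pattern), end;
         it != end; ++it) {
        // Exactly one of the two groups participates in each match.
        groups.push_back((*it)[1].matched ? 1 : 2);
    }
    return groups;
}
```

For the test string above this yields 1, 1, 2, 1, 2: the whitespace runs land in group 1 and the quote characters in group 2, which is the information a conditional replacement would need.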

EDIT: I cannot use boost, which is why I left it out of the tags. So please no answer that includes boost. I have to do this with the standard library. It may be that a different regex would accomplish the goal or that I am just stuck doing two passes.

EDIT2: When I wrote my original regex I did not remember which characters \w includes. After looking it up, I have simplified the expression further. Again, the goal is that anything matching \s+ is replaced with '_' and anything matching \W+ is replaced with an empty string.

Answered by rubber boots

The C++ (0x, 11, TR1) regular expressions do not really work in every case (see stackoverflow; look up the phrase "regex" on this page for gcc), so it is better to use Boost for a while.

You can check whether your compiler supports the regular expressions needed:

#include <string>
#include <iostream>
#include <regex>

using namespace std;

int main(int argc, char * argv[]) {
    string test = "test replacing \"these characters\"";
    regex reg("[^\\w]+");
    test = regex_replace(test, reg, "_");
    cout << test << endl;
}

The above works in Visual Studio 2012 RC.

Edit 1: Replacing with two different strings in one pass (depending on the match) won't work here, I think. In Perl, this could easily be done with an evaluated replacement expression (the /e switch).

Therefore, you'll need two passes, as you already suspected:

 ...
 string test = "test replacing \"these characters\"";
 test = regex_replace(test, regex("\\s+"), "_");
 test = regex_replace(test, regex("\\W+"), "");
 ...
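
The two-pass fragment above can be wrapped into a self-contained function. This is a minimal sketch using only the standard library; the function name sanitize_two_pass is hypothetical and chosen for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (illustration only): two passes over the input.
// Pass 1 turns every whitespace run into a single '_'.
// Pass 2 drops every remaining run of non-word characters. '_' counts as
// a word character, so the underscores from pass 1 survive pass 2.
std::string sanitize_two_pass(const std::string& input) {
    static const std::regex whitespace("\\s+");
    static const std::regex non_word("\\W+");
    const std::string underscored = std::regex_replace(input, whitespace, "_");
    return std::regex_replace(underscored, non_word, "");
}
```

Calling sanitize_two_pass on the test string from the question returns "test_replacing_these_characters". The order of the passes matters: running the \W+ pass first would delete the spaces before they could be turned into underscores.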

Edit 2:

If it were possible to pass a callback function tr() to regex_replace, you could choose the substitution there, like:

 string output = regex_replace(test, regex("\\s+|\\W+"), tr);

where tr() does the replacement work:

 string tr(const smatch &m) { return m[0].str()[0] == ' ' ? "_" : ""; }

With such a callback, the problem would have been solved. Unfortunately, the C++11 standard library's regex_replace has no such overload (it accepts only a format string), but Boost has one. The following works with Boost and uses one pass:

...
#include <boost/regex.hpp>
using namespace boost;
...
string tr(const smatch &m) { return m[0].str()[0] == ' ' ? "_" : ""; }
...

string test = "test replacing \"these characters\"";
test = regex_replace(test, regex("\\s+|\\W+"), tr);   // <= works in Boost
...

Maybe some day this will work with C++11 or whatever number comes next.
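
Until then, the callback can be emulated with the standard library alone by walking the matches with std::sregex_iterator and building the result by hand. The following is a sketch under stated assumptions, not a drop-in equivalent of the Boost call: the function name sanitize_one_pass is hypothetical, and the second alternative is written as [^\w\s]+ instead of \W+ (an assumption of this sketch) so that whitespace directly after punctuation still becomes '_', as it does in the two-pass version.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (illustration only): single pass, standard library only.
// Group 1 = whitespace run       -> replaced by '_'
// Group 2 = other non-word run   -> replaced by nothing
// Assumption: group 2 uses [^\w\s]+ rather than \W+, so it never swallows
// whitespace that follows punctuation.
std::string sanitize_one_pass(const std::string& input) {
    static const std::regex pattern("(\\s+)|([^\\w\\s]+)");
    std::string out;
    std::string::const_iterator copied_up_to = input.begin();
    for (std::sregex_iterator it(input.begin(), input.end(), pattern), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        out.append(copied_up_to, m[0].first);  // unmatched text, copied as-is
        if (m[1].matched) {
            out += '_';                        // whitespace run
        }                                      // group 2: append nothing
        copied_up_to = m[0].second;
    }
    out.append(copied_up_to, input.end());     // tail after the last match
    return out;
}
```

For the test string from the question this returns "test_replacing_these_characters". The per-match decision sits in the loop body, which is where a callback such as tr() would otherwise be invoked.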

Regards

rbo
