C++: Understanding C++ regex through a simple example

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/30921932/

Time: 2020-08-28 13:50:27. Source: igfitidea.

Understanding C++ regex by a simple example

Tags: c++, regex

Question

I wrote the following simple example:

#include <iostream>
#include <string>
#include <regex>

int main ()
{
    std::string str("1231");
    std::regex r("^(\\d)"); // backslash doubled so the regex engine receives \d
    std::smatch m;
    std::regex_search(str, m, r);
    for(auto v: m) std::cout << v << std::endl;
}

and got confused by its behavior. If I understood the purpose of match_results from the documentation correctly, only a single 1 should have been printed. It says:

If successful, it is not empty and contains a series of sub_match objects: the first sub_match element corresponds to the entire match, and, if the regex expression contained sub-expressions to be matched ([...])

The string passed to the function doesn't match the regex, therefore we should not have had the entire match.

What did I miss?

Accepted answer by Galik

You still get the entire match, but the entire match does not fit the entire string; it fits the entire regex.

For example consider this:

#include <iostream>
#include <string>
#include <regex>

int main()
{
    std::string str("1231");
    std::regex r("^(\\d)\\d"); // entire match will be 2 digits

    std::smatch m;
    std::regex_search(str, m, r);

    for(auto v: m)
        std::cout << v << std::endl;
}

Output:

12
1

The entire match (the first sub_match) is the part of the string that the entire regex matches.

The second sub_match is the first (and only) capture group.

Looking at your original regex:

std::regex r("^(\\d)");
              |----| <- entire expression (sub_match #0)

std::regex r("^(\\d)");
               |---| <- first capture group (sub_match #1)

That is where the two sub_matches come from.

Answer by shockawave123

From the documentation for regex_search:

    Returns whether **some** sub-sequence in the target sequence (the subject) 
    matches the regular expression rgx (the pattern). The target sequence is 
    either s or the character sequence between first and last, depending on 
    the version used.

So regex_search will search for anything in the input string that matches the regex. The whole string doesn't have to match, just part of it.

However, if you were to use regex_match, then the entire string must match.
