C++: How to match multiple results using std::regex

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/21667295/

Date: 2020-08-27 23:45:59  Source: igfitidea

How to match multiple results using std::regex

Tags: c++, regex

Asked by AntiMoron

For example, I have a string like "first second third forth" and I want to match each single word in one operation, then output them one by one.

I just thought that "(\b\S*\b){0,}" would work. But actually it did not.

What should I do?

Here's my code:

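#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main()
{
    // backslashes are doubled so that \b and \S reach the regex engine
    regex exp("(\\b\\S*\\b)");
    smatch res;
    string str = "first second third forth";
    regex_search(str, res, exp);  // finds only the first match
    cout << res[0] << " " << res[1] << " " << res[2] << " " << res[3] << endl;
}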

I'm looking forward to your kind help. :)

Accepted answer by herohuyongtao

This can be done with the regex library of C++11.

Two methods:

  1. You can use () in the regex to define your captures.

    Like this:

    string var = "first second third forth";
    
    const regex r("(.*) (.*) (.*) (.*)");  
    smatch sm;
    
    if (regex_search(var, sm, r)) {
        for (size_t i = 1; i < sm.size(); i++) {  // sm[0] is the whole match
            cout << sm[i] << endl;
        }
    }
    

    See it live: http://coliru.stacked-crooked.com/a/e1447c4cff9ea3e7

  2. You can use sregex_token_iterator():

    string var = "first second third forth";
    
    regex wsaq_re("\\s+");  // the backslash must be escaped inside a string literal
    copy( sregex_token_iterator(var.begin(), var.end(), wsaq_re, -1),
        sregex_token_iterator(),
        ostream_iterator<string>(cout, "\n"));
    

    See it live: http://coliru.stacked-crooked.com/a/677aa6f0bb0612f0
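
The second method can be packaged as a small function. The following is only a sketch, not code from the answer: the helper name `split_on_whitespace` is hypothetical, and a raw string literal is used so that the backslash in `\s+` does not need to be doubled.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the answer): splits `text` on runs of whitespace.
std::vector<std::string> split_on_whitespace(const std::string& text)
{
    // Raw string literal, so the backslash reaches the regex engine unchanged.
    static const std::regex separator(R"(\s+)");

    // Submatch index -1 selects the text between matches instead of the matches.
    std::sregex_token_iterator first(text.begin(), text.end(), separator, -1);
    std::sregex_token_iterator last;  // default-constructed = end of sequence

    // Each token is a sub_match, which converts to std::string.
    return std::vector<std::string>(first, last);
}
```

Calling `split_on_whitespace("first second third forth")` should give the four words as separate strings, which can then be printed or processed one by one.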

Answer by St0fF

Simply iterate over your string while regex_searching, like this:

{
    regex exp("(\\b\\S*\\b)");  // backslashes doubled for the string literal
    smatch res;
    string str = "first second third forth";

    string::const_iterator searchStart( str.cbegin() );
    while ( regex_search( searchStart, str.cend(), res, exp ) )
    {
        cout << ( searchStart == str.cbegin() ? "" : " " ) << res[0];  
        searchStart = res.suffix().first;
    }
    cout << endl;
}
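
The same loop can be wrapped in a function that returns the words instead of printing them. This is a sketch under two assumptions that are not in the answer: the helper name `find_all_words` is hypothetical, and the pattern is changed to `\S+` so that a match is never empty and the search position always advances.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every match by calling regex_search repeatedly.
std::vector<std::string> find_all_words(const std::string& text)
{
    // Assumption: \S+ rather than \S*, so a match always consumes at least one character.
    static const std::regex word(R"(\S+)");

    std::vector<std::string> words;
    std::smatch match;
    std::string::const_iterator search_start = text.cbegin();

    while (std::regex_search(search_start, text.cend(), match, word)) {
        words.push_back(match[0].str());
        search_start = match.suffix().first;  // resume right after this match
    }
    return words;
}
```

Because the iterators always point into the original string, nothing is copied between iterations except the matched words themselves.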

Answer by Mattia Fantoni

You could use the suffix() function, and search again until you don't find a match:

int main()
{
    regex exp("(\\b\\S*\\b)");  // backslashes doubled for the string literal
    smatch res;
    string str = "first second third forth";

    while (regex_search(str, res, exp)) {
        cout << res[0] << endl;
        str = res.suffix();
    }
}   

Answer by Behrouz.M

Feel free to use my code. It will capture all groups in all matches:

vector<vector<string>> U::String::findEx(const string& s, const string& reg_ex, bool case_sensitive)
{
    // icase is set only when the search is NOT case sensitive
    regex rx(reg_ex, case_sensitive ? regex_constants::ECMAScript
                                    : regex_constants::ECMAScript | regex_constants::icase);
    vector<vector<string>> captured_groups;
    const std::sregex_token_iterator end_i;
    for (std::sregex_token_iterator i(s.cbegin(), s.cend(), rx);
        i != end_i;
        ++i)
    {
        string group = *i;  // text of the whole match
        smatch res;
        // search the matched text again to recover its capture groups
        if(regex_search(group, res, rx))
        {
            vector<string> captured_subgroups;
            for(size_t j=0; j<res.size() ; j++)
                captured_subgroups.push_back(res[j]);

            captured_groups.push_back(captured_subgroups);
        }

    }
    return captured_groups;
}
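
For comparison, here is a free-function sketch of the same idea built on `std::sregex_iterator`, which exposes the capture groups of each match directly, so the matched text does not have to be searched a second time. The name `find_all_groups` and the default value of `case_sensitive` are assumptions made for illustration; they are not part of the answer.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical free function: one inner vector per match, holding the whole
// match at index 0 followed by its capture groups.
std::vector<std::vector<std::string>> find_all_groups(const std::string& text,
                                                      const std::string& pattern,
                                                      bool case_sensitive = true)
{
    auto flags = std::regex_constants::ECMAScript;
    if (!case_sensitive)
        flags = flags | std::regex_constants::icase;
    const std::regex rx(pattern, flags);

    std::vector<std::vector<std::string>> result;
    // A default-constructed sregex_iterator marks the end of the sequence.
    for (std::sregex_iterator it(text.begin(), text.end(), rx), end; it != end; ++it) {
        std::vector<std::string> groups;
        for (std::size_t i = 0; i < it->size(); ++i)
            groups.push_back((*it)[i].str());
        result.push_back(groups);
    }
    return result;
}
```

With the pattern `\b(sub)([^ ]*)` and the text "this subject has a submarine", the result should contain two entries, the first holding "subject", "sub" and "ject".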

Answer by Peter Alfvin

My reading of the documentation is that regex_search searches for the first match and that none of the functions in std::regex do a "scan" as you are looking for. However, the Boost library seems to support this, as described in "C++ tokenize a string using a regular expression".
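
The first-match behaviour of regex_search can be checked with a small sketch. The struct `SearchSummary` and the function `search_once` are hypothetical names introduced only for this illustration.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical summary of what a single regex_search call reports.
struct SearchSummary {
    std::string whole_match;  // text of the first match, empty if none
    std::size_t entries;      // number of entries in the match results
};

SearchSummary search_once(const std::string& text)
{
    static const std::regex word(R"((\S+))");  // one capture group
    std::smatch match;
    SearchSummary summary{"", 0};

    if (std::regex_search(text, match, word)) {
        summary.whole_match = match[0].str();
        // The entries are the whole match plus one per capture group,
        // not one per word in the text.
        summary.entries = match.size();
    }
    return summary;
}
```

For "first second third forth" this reports "first" and two entries (the whole match and the single group), which is why indexing the results with 1, 2 and 3 does not walk through the remaining words.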

Answer by Steven

sregex_token_iterator appears to be the ideal, efficient solution, but the example given in the selected answer leaves much to be desired. Instead, I found some great examples here: http://www.cplusplus.com/reference/regex/regex_token_iterator/regex_token_iterator/

For your convenience, I've copied and pasted the sample code shown on that page. I claim no credit for the code.

// regex_token_iterator example
#include <iostream>
#include <string>
#include <regex>

int main ()
{
  std::string s ("this subject has a submarine as a subsequence");
  std::regex e ("\\b(sub)([^ ]*)");   // matches words beginning with "sub"

  // default constructor = end-of-sequence:
  std::regex_token_iterator<std::string::iterator> rend;

  std::cout << "entire matches:"; 
  std::regex_token_iterator<std::string::iterator> a ( s.begin(), s.end(), e );
  while (a!=rend) std::cout << " [" << *a++ << "]";
  std::cout << std::endl;

  std::cout << "2nd submatches:";
  std::regex_token_iterator<std::string::iterator> b ( s.begin(), s.end(), e, 2 );
  while (b!=rend) std::cout << " [" << *b++ << "]";
  std::cout << std::endl;

  std::cout << "1st and 2nd submatches:";
  int submatches[] = { 1, 2 };
  std::regex_token_iterator<std::string::iterator> c ( s.begin(), s.end(), e, submatches );
  while (c!=rend) std::cout << " [" << *c++ << "]";
  std::cout << std::endl;

  std::cout << "matches as splitters:";
  std::regex_token_iterator<std::string::iterator> d ( s.begin(), s.end(), e, -1 );
  while (d!=rend) std::cout << " [" << *d++ << "]";
  std::cout << std::endl;

  return 0;
}

Output:
entire matches: [subject] [submarine] [subsequence]
2nd submatches: [ject] [marine] [sequence]
1st and 2nd submatches: [sub] [ject] [sub] [marine] [sub] [sequence]
matches as splitters: [this ] [ has a ] [ as a ]