C++ - Split a string by regex

Question

Asked by nothing-special-here

I want to split a std::string by regex.

I have found some solutions on Stack Overflow, but most of them either split the string on a single space or use external libraries like Boost.

I can't use Boost.

I want to split the string by the regex "\\s+".

I am using g++ (Debian 4.4.5-8) 4.4.5 and I can't upgrade.

Answer 1

Accepted answer by shf301

You don't need regular expressions if you just want to split a string on runs of spaces. Writing your own regex library is overkill for something that simple.

The answer you linked to in your comments, "Split a string in C++?", can easily be changed so that it doesn't produce any empty elements when there are multiple consecutive spaces.

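#include <sstream>
#include <string>
#include <vector>

// Appends the non-empty pieces of s, split on delim, to elems.
std::vector<std::string> &split(const std::string &s, char delim, std::vector<std::string> &elems) {
    std::stringstream ss(s);
    std::string item;
    while (std::getline(ss, item, delim)) {
        // Consecutive delimiters yield empty items; skip them.
        if (item.length() > 0) {
            elems.push_back(item);
        }
    }
    return elems;
}

// Convenience overload that returns a new vector.
std::vector<std::string> split(const std::string &s, char delim) {
    std::vector<std::string> elems;
    split(s, delim, elems);
    return elems;
}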

By checking that item.length() > 0 before pushing item onto the elems vector, you no longer get extra elements if your input contains multiple consecutive delimiters (spaces in your case).
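
Since the delimiter asked about is any run of whitespace, another option that needs neither a regex library nor C++11 is formatted extraction with operator>>. This matters for the compiler in the question, because libstdc++ only gained a working <regex> in GCC 4.9. The following is a minimal sketch, not part of the original answer; split_whitespace is a hypothetical name. It assumes the default locale, in which tabs and newlines also count as whitespace, roughly matching \s.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original answer): splits s on runs of
// whitespace, similar in effect to splitting on the regex \s+.
std::vector<std::string> split_whitespace(const std::string &s) {
    std::vector<std::string> tokens;
    std::istringstream in(s);
    std::string token;
    // operator>> skips leading whitespace and then reads up to the next
    // whitespace character, so it never produces an empty token.
    while (in >> token) {
        tokens.push_back(token);
    }
    return tokens;
}
```

Unlike the getline-based split above, this version cannot split on an arbitrary delimiter character; it only handles whitespace.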

Answer 2

Answered by Pete Becker

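// Assumes <regex>, <string> and <iostream> are included and that
// string_to_split is a std::string.
std::regex rgx("\\s+");  // the backslash must be escaped in a string literal
std::sregex_token_iterator iter(string_to_split.begin(),
    string_to_split.end(),
    rgx,
    -1);
std::sregex_token_iterator end;
for ( ; iter != end; ++iter)
    std::cout << *iter << '\n';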

The -1 is the key here: when the iterator is constructed it points at the text that precedes the first match, and after each increment it points at the text that follows the previous match.

If you don't have C++11, the same thing should work with TR1 or, possibly with slight modification, with Boost.
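
To make the role of the last constructor argument concrete, the sketch below collects the tokens for a given submatch selector. With -1 the iterator yields the text between matches; with 0 it yields the matches themselves, that is, the delimiters. tokens_for is a hypothetical helper introduced only for this illustration, and the code assumes a standard library with a working <regex>.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper for illustration: returns whatever
// std::sregex_token_iterator yields for the given submatch selector.
std::vector<std::string> tokens_for(const std::string &text,
                                    const std::string &pattern,
                                    int submatch) {
    std::vector<std::string> out;
    std::regex rgx(pattern);
    std::sregex_token_iterator iter(text.begin(), text.end(), rgx, submatch);
    std::sregex_token_iterator end;
    for (; iter != end; ++iter) {
        out.push_back(iter->str());
    }
    return out;
}

// tokens_for("foo bar  baz", "\\s+", -1) gives "foo", "bar", "baz"
// tokens_for("foo bar  baz", "\\s+", 0)  gives " " and "  " (the delimiters)
```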

Answer 3

Answered by Marcin

To expand on the answer by @Pete Becker, here is an example of a resplit function that can be used to split text using a regex:

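  #include <regex>
  #include <string>
  #include <vector>

  // Splits s on every match of rgx_str and returns the pieces in between.
  // The usage example calls this as my::resplit, so it is presumably
  // declared inside a namespace named my.
  std::vector<std::string>
  resplit(const std::string & s, std::string rgx_str = "\\s+") {

      std::vector<std::string> elems;

      std::regex rgx (rgx_str);

      // -1 selects the text between matches rather than the matches.
      std::sregex_token_iterator iter(s.begin(), s.end(), rgx, -1);
      std::sregex_token_iterator end;

      while (iter != end)  {
          elems.push_back(*iter);
          ++iter;
      }

      return elems;

  }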

This works as follows:

   string s1 = "first   second third    ";
   vector<string> v22 = my::resplit(s1);

   for (const auto & e: v22) {
       cout <<"Token:" << e << endl;
   }


   //Token:first
   //Token:second
   //Token:third


   string s222 = "first|second:third,forth";
   vector<string> v222 = my::resplit(s222, "[|:,]");

   for (const auto & e: v222) {
       cout <<"Token:" << e << endl;
   }


   //Token:first
   //Token:second
   //Token:third
   //Token:forth
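
One caveat with the -1 selector: a delimiter at the very start of the input, or two adjacent single-character delimiters such as "||", produce a zero-length token, while an empty piece after a trailing delimiter is dropped (which is why the first example above prints only three tokens). If empty tokens are unwanted, they can be filtered with the same length check used in the accepted answer. The sketch below is an assumption-laden variant, not the original author's code; resplit_nonempty is a hypothetical name.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical variant of resplit (name and filtering are assumptions, not
// part of the original answer): same splitting, but empty tokens are dropped.
std::vector<std::string> resplit_nonempty(const std::string &s,
                                          const std::string &rgx_str = "\\s+") {
    std::vector<std::string> elems;
    std::regex rgx(rgx_str);
    std::sregex_token_iterator iter(s.begin(), s.end(), rgx, -1);
    std::sregex_token_iterator end;
    for (; iter != end; ++iter) {
        // A leading delimiter or two adjacent delimiters give a
        // zero-length token; skip those.
        if (iter->length() > 0) {
            elems.push_back(iter->str());
        }
    }
    return elems;
}
```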

Answer 4

Answered by solstice333

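// Assumes <regex>, <string>, <iostream> and using namespace std.
string s = "foo bar  baz";
regex e("\\s+");  // the backslash must be escaped in a string literal
regex_token_iterator<string::iterator> i(s.begin(), s.end(), e, -1);
regex_token_iterator<string::iterator> end;
while (i != end)
   cout << " [" << *i++ << "]";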

This prints [foo] [bar] [baz], with a leading space before the first bracket.
