C++ - Split string by regex
Original question: http://stackoverflow.com/questions/16749069/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by nothing-special-here
I want to split a std::string by regex.
I have found some solutions on Stack Overflow, but most of them split the string on a single space or use external libraries such as Boost.
I can't use Boost.
I want to split the string by the regex "\\s+".
I am using g++ (Debian 4.4.5-8) 4.4.5 and I can't upgrade.
Accepted answer by shf301
You don't need to use regular expressions if you just want to split a string by multiple spaces. Writing your own regex library is overkill for something that simple.
The answer you linked to in your comments, Split a string in C++?, can easily be changed so that it doesn't include any empty elements if there are multiple spaces.
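#include <sstream>
#include <string>
#include <vector>

// Appends the non-empty pieces of s, split on delim, to elems.
std::vector<std::string> &split(const std::string &s, char delim,
                                std::vector<std::string> &elems) {
    std::stringstream ss(s);
    std::string item;
    while (std::getline(ss, item, delim)) {
        // Consecutive delimiters produce empty items; skip them.
        if (item.length() > 0) {
            elems.push_back(item);
        }
    }
    return elems;
}

// Convenience overload that returns a new vector.
std::vector<std::string> split(const std::string &s, char delim) {
    std::vector<std::string> elems;
    split(s, delim, elems);
    return elems;
}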
By checking that item.length() > 0 before pushing item onto the elems vector, you will no longer get extra empty elements if your input contains multiple consecutive delimiters (spaces in your case).
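If the delimiter is any run of whitespace (the effect of "\\s+"), stream extraction is another option that avoids regular expressions. The sketch below is not from the answer: split_whitespace is a hypothetical name, and it relies on operator>> skipping leading whitespace as classified by the stream's locale. It uses only C++98 features, so it should not need a newer compiler.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original answer):
// splits s on runs of whitespace using formatted stream extraction.
std::vector<std::string> split_whitespace(const std::string &s) {
    std::istringstream in(s);
    std::vector<std::string> elems;
    std::string word;
    // operator>> skips leading whitespace, then reads up to the next
    // whitespace character, so empty tokens are never produced.
    while (in >> word) {
        elems.push_back(word);
    }
    return elems;
}
```

Unlike the getline-based split above, this treats tabs and newlines as delimiters too, and the delimiter cannot be changed.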
Answer by Pete Becker
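// string_to_split is the std::string to tokenize.
// The backslash must be doubled inside a C++ string literal.
std::regex rgx("\\s+");
std::sregex_token_iterator iter(string_to_split.begin(),
                                string_to_split.end(),
                                rgx,
                                -1);
std::sregex_token_iterator end;
for ( ; iter != end; ++iter)
    std::cout << *iter << '\n';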
The -1 is the key here: when the iterator is constructed, it points at the text that precedes the first match, and after each increment it points at the text that follows the previous match.
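To make the role of that last argument concrete, here is a small sketch that collects whatever the iterator yields for a given submatch index. The helper name tokens is hypothetical and not part of the answer; it assumes a C++11 standard library with a working <regex>.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns everything std::sregex_token_iterator
// yields for `text` with the given pattern and submatch index.
//   submatch == -1 : the pieces of text between matches
//   submatch ==  0 : the matches themselves
std::vector<std::string> tokens(const std::string &text,
                                const std::string &pattern,
                                int submatch) {
    std::regex rgx(pattern);
    std::sregex_token_iterator iter(text.begin(), text.end(), rgx, submatch);
    std::sregex_token_iterator end;
    // Each sub_match converts to std::string.
    return std::vector<std::string>(iter, end);
}
```

With the pattern "\\s+" and the input "a  b c", an index of -1 yields "a", "b", "c", while 0 yields the two whitespace runs. If the input starts with a delimiter, as in " a", the first token for -1 is an empty string, because the text preceding the first match is empty.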
If you don't have C++11, the same thing should work with TR1 or (possibly with slight modification) with Boost.
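Where none of C++11, TR1, or Boost is usable, as the question describes, the POSIX <regex.h> interface may be one more route on POSIX systems. This is a different technique from the one in the answer, and the sketch below rests on several assumptions: posix_resplit is a hypothetical name, the pattern is written as a POSIX extended regex (hence "[[:space:]]+" rather than "\\s+"), the pattern never matches the empty string, and the input contains no embedded NUL characters.

```cpp
#include <regex.h>   // POSIX regcomp/regexec/regfree, not standard C++
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: splits s on matches of a POSIX extended regex.
std::vector<std::string> posix_resplit(const std::string &s,
                                       const std::string &pattern = "[[:space:]]+") {
    regex_t re;
    if (regcomp(&re, pattern.c_str(), REG_EXTENDED) != 0) {
        throw std::invalid_argument("invalid regular expression");
    }

    std::vector<std::string> elems;
    const char *cursor = s.c_str();
    regmatch_t m;
    int flags = 0;
    while (regexec(&re, cursor, 1, &m, flags) == 0) {
        if (m.rm_so == m.rm_eo) {
            break;  // empty match: stop rather than loop forever
        }
        // Text between the cursor and the start of the match is one token.
        elems.push_back(std::string(cursor, static_cast<std::size_t>(m.rm_so)));
        cursor += m.rm_eo;
        flags = REG_NOTBOL;  // the cursor is no longer at the start of the line
    }
    // Whatever follows the last match is the final token, if non-empty.
    if (*cursor != '\0') {
        elems.push_back(std::string(cursor));
    }
    regfree(&re);
    return elems;
}
```

Like the token iterator with -1, this yields an empty first token when the input begins with a delimiter and no trailing empty token when it ends with one. The sketch does not release the compiled regex if push_back throws; production code would wrap regex_t in a small RAII guard.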
Answer by Marcin
To expand on the answer by @Pete Becker, here is an example of a resplit function that can be used to split text using a regex:
#include <regex>
#include <string>
#include <vector>

// Splits s on matches of rgx_str (runs of whitespace by default).
std::vector<std::string>
resplit(const std::string & s, std::string rgx_str = "\\s+") {
    std::vector<std::string> elems;
    std::regex rgx (rgx_str);
    // -1 selects the text between matches instead of the matches.
    std::sregex_token_iterator iter(s.begin(), s.end(), rgx, -1);
    std::sregex_token_iterator end;
    while (iter != end) {
        elems.push_back(*iter);
        ++iter;
    }
    return elems;
}
This works as follows:
// Assumes resplit is declared in a namespace named my, and that
// string, vector, cout and endl are brought in from namespace std.
string s1 = "first second third ";
vector<string> v22 = my::resplit(s1);
for (const auto & e: v22) {
    cout <<"Token:" << e << endl;
}
//Token:first
//Token:second
//Token:third

string s222 = "first|second:third,forth";
vector<string> v222 = my::resplit(s222, "[|:,]");
for (const auto & e: v222) {
    cout <<"Token:" << e << endl;
}
//Token:first
//Token:second
//Token:third
//Token:forth
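One case the examples above do not cover is input that starts with a delimiter or contains two delimiters in a row: with -1, the token iterator then yields an empty string. If empty tokens are unwanted, a length check in the spirit of the accepted answer can filter them. The variant below is a sketch; the name resplit_nonempty is hypothetical and not part of the original answer.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical variant of resplit that drops empty tokens.
std::vector<std::string>
resplit_nonempty(const std::string &s, const std::string &rgx_str = "\\s+") {
    std::vector<std::string> elems;
    std::regex rgx(rgx_str);
    std::sregex_token_iterator iter(s.begin(), s.end(), rgx, -1);
    std::sregex_token_iterator end;
    for (; iter != end; ++iter) {
        // An empty token appears when a match starts at the beginning
        // of the input or directly follows the previous match.
        if (iter->length() > 0) {
            elems.push_back(iter->str());
        }
    }
    return elems;
}
```

For example, splitting "a,,b" on "," gives "a" and "b" with this variant, where the unfiltered version would also return an empty string between them.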
Answer by solstice333
// Assumes #include <regex>, <string>, <iostream> and using namespace std.
string s = "foo bar baz";
regex e("\\s+");
regex_token_iterator<string::iterator> i(s.begin(), s.end(), e, -1);
regex_token_iterator<string::iterator> end;
while (i != end)
    cout << " [" << *i++ << "]";
This prints [foo] [bar] [baz]