使用 C++11 拆分字符串
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/9435385/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Split a string using C++11
提问by Mark
What would be easiest method to split a string using c++11?
使用 c++11 拆分字符串的最简单方法是什么?
I've seen the method used by this post, but I feel that there ought to be a less verbose way of doing it using the new standard.
我看过这篇文章使用的方法,但我觉得使用新标准应该有一种不那么冗长的方法。
Edit: I would like to have a vector<string>
as a result and be able to delimitate on a single character.
编辑:我希望有一个vector<string>
结果并且能够分隔单个字符。
回答by JohannesD
std::regex_token_iterator
performs generic tokenization based on a regex. It may or may not be overkill for doing simple splitting on a single character, but it works and is not too verbose:
std::regex_token_iterator
基于正则表达式执行通用标记化。对单个字符进行简单拆分可能会也可能不会过度,但它可以工作并且不会太冗长:
std::vector<std::string> split(const string& input, const string& regex) {
// passing -1 as the submatch index parameter performs splitting
std::regex re(regex);
std::sregex_token_iterator
first{input.begin(), input.end(), re, -1},
last;
return {first, last};
}
回答by Yaguang
Here is a (maybe less verbose) way to split string (based on the postyou mentioned).
这是一种(可能不那么冗长)拆分字符串的方法(基于您提到的帖子)。
#include <string>
#include <sstream>
#include <vector>
std::vector<std::string> split(const std::string &s, char delim) {
std::stringstream ss(s);
std::string item;
std::vector<std::string> elems;
while (std::getline(ss, item, delim)) {
elems.push_back(item);
// elems.push_back(std::move(item)); // if C++11 (based on comment from @mchiasson)
}
return elems;
}
回答by fduff
Here's an example of splitting a string and populating a vector with the extracted elements using boost
.
下面是一个使用boost
.
#include <boost/algorithm/string.hpp>
std::string my_input("A,B,EE");
std::vector<std::string> results;
boost::algorithm::split(results, my_input, boost::is_any_of(","));
assert(results[0] == "A");
assert(results[1] == "B");
assert(results[2] == "EE");
回答by wally
Another regex solution inspired by other answersbut hopefully shorter and easier to read:
另一个受其他答案启发的正则表达式解决方案,但希望更短更容易阅读:
std::string s{"String to split here, and here, and here,..."};
std::regex regex{R"([\s,]+)"}; // split on space and comma
std::sregex_token_iterator it{s.begin(), s.end(), regex, -1};
std::vector<std::string> words{it, {}};
回答by Faisal Vali
I don't know if this is less verbose, but it might be easier to grok for those more seasoned in dynamic languages such as javascript. The only C++11 feature it uses is lambdas.
我不知道这是否不那么冗长,但对于那些在动态语言(例如 javascript)方面经验丰富的人来说,可能更容易理解。它使用的唯一 C++11 特性是 lambdas。
#include <algorithm>
#include <string>
#include <cctype>
#include <iostream>
#include <vector>
int main()
{
using namespace std;
string s = "hello how are you won't you tell me your name";
vector<string> tokens;
string token;
for_each(s.begin(), s.end(), [&](char c) {
if (!isspace(c))
token += c;
else
{
if (token.length()) tokens.push_back(token);
token.clear();
}
});
if (token.length()) tokens.push_back(token);
return 0;
}
回答by Torsten
My choice is boost::tokenizer
but I didn't have any heavy tasks and test with huge data.
Example from boost doc with lambda modification:
我的选择是boost::tokenizer
但我没有任何繁重的任务并使用大量数据进行测试。来自带有 lambda 修改的 boost 文档的示例:
#include <iostream>
#include <boost/tokenizer.hpp>
#include <string>
#include <vector>
int main()
{
using namespace std;
using namespace boost;
string s = "This is, a test";
vector<string> v;
tokenizer<> tok(s);
for_each (tok.begin(), tok.end(), [&v](const string & s) { v.push_back(s); } );
// result 4 items: 1)This 2)is 3)a 4)test
return 0;
}
回答by chekkal
#include <iostream>
#include <algorithm>
#include <vector>
#include <string>
using namespace std;
vector<string> split(const string& str, int delimiter(int) = ::isspace){
vector<string> result;
auto e=str.end();
auto i=str.begin();
while(i!=e){
i=find_if_not(i,e, delimiter);
if(i==e) break;
auto j=find_if(i,e, delimiter);
result.push_back(string(i,j));
i=j;
}
return result;
}
int main(){
string line;
getline(cin,line);
vector<string> result = split(line);
for(auto s: result){
cout<<s<<endl;
}
}
回答by ymmt2005
This is my answer. Verbose, readable and efficient.
这是我的回答。冗长,可读和高效。
std::vector<std::string> tokenize(const std::string& s, char c) {
auto end = s.cend();
auto start = end;
std::vector<std::string> v;
for( auto it = s.cbegin(); it != end; ++it ) {
if( *it != c ) {
if( start == end )
start = it;
continue;
}
if( start != end ) {
v.emplace_back(start, it);
start = end;
}
}
if( start != end )
v.emplace_back(start, end);
return v;
}
回答by villains
Here is a C++11 solution that uses only std::string::find(). The delimiter can be any number of characters long. Parsed tokens are output via an output iterator, which is typically a std::back_inserter in my code.
这是一个仅使用 std::string::find() 的 C++11 解决方案。分隔符可以是任意数量的字符。解析的标记通过输出迭代器输出,在我的代码中它通常是 std::back_inserter。
I have not tested this with UTF-8, but I expect it should work as long as the input and delimiter are both valid UTF-8 strings.
我没有用 UTF-8 测试过这个,但我希望它应该可以工作,只要输入和分隔符都是有效的 UTF-8 字符串。
#include <string>
template<class Iter>
Iter splitStrings(const std::string &s, const std::string &delim, Iter out)
{
if (delim.empty()) {
*out++ = s;
return out;
}
size_t a = 0, b = s.find(delim);
for ( ; b != std::string::npos;
a = b + delim.length(), b = s.find(delim, a))
{
*out++ = std::move(s.substr(a, b - a));
}
*out++ = std::move(s.substr(a, s.length() - a));
return out;
}
Some test cases:
一些测试用例:
void test()
{
std::vector<std::string> out;
size_t counter;
std::cout << "Empty input:" << std::endl;
out.clear();
splitStrings("", ",", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, empty delimiter:" << std::endl;
out.clear();
splitStrings("Hello, world!", "", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", no delimiter in string:" << std::endl;
out.clear();
splitStrings("abxycdxyxydefxya", "xyz", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", delimiter exists string:" << std::endl;
out.clear();
splitStrings("abxycdxy!!xydefxya", "xy", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", delimiter exists string"
", input contains blank token:" << std::endl;
out.clear();
splitStrings("abxycdxyxydefxya", "xy", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", delimiter exists string"
", nothing after last delimiter:" << std::endl;
out.clear();
splitStrings("abxycdxyxydefxy", "xy", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", only delimiter exists string:" << std::endl;
out.clear();
splitStrings("xy", "xy", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
}
Expected output:
预期输出:
Empty input: 0: Non-empty input, empty delimiter: 0: Hello, world! Non-empty input, non-empty delimiter, no delimiter in string: 0: abxycdxyxydefxya Non-empty input, non-empty delimiter, delimiter exists string: 0: ab 1: cd 2: !! 3: def 4: a Non-empty input, non-empty delimiter, delimiter exists string, input contains blank token: 0: ab 1: cd 2: 3: def 4: a Non-empty input, non-empty delimiter, delimiter exists string, nothing after last delimiter: 0: ab 1: cd 2: 3: def 4: Non-empty input, non-empty delimiter, only delimiter exists string: 0: 1:
回答by Bill Moore
#include <string>
#include <vector>
#include <sstream>
inline vector<string> split(const string& s) {
vector<string> result;
istringstream iss(s);
for (string w; iss >> w; )
result.push_back(w);
return result;
}