Parsing a comma-delimited std::string in C++
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): Stack Overflow.
Original: http://stackoverflow.com/questions/1894886/
Asked by Piku
If I have a std::string containing a comma-separated list of numbers, what's the simplest way to parse out the numbers and put them in an integer array?
I don't want to generalise this out into parsing anything else. Just a simple string of comma separated integer numbers such as "1,1,1,1,2,1,1,1,0".
Answer by user229321
Read one number at a time, then check whether the next character is a comma (,). If so, discard it.
#include <vector>
#include <string>
#include <sstream>
#include <iostream>
int main()
{
    std::string str = "1,2,3,4,5,6";
    std::vector<int> vect;
    std::stringstream ss(str);
    for (int i; ss >> i;) {
        vect.push_back(i);
        if (ss.peek() == ',')
            ss.ignore();
    }
    for (std::size_t i = 0; i < vect.size(); i++)
        std::cout << vect[i] << std::endl;
}
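The same peek-and-ignore loop can be wrapped in a reusable function. This is a minimal sketch; the name parse_int_list is a hypothetical helper, not part of the original answer. Parsing stops at the first thing that is neither a number nor a comma directly after a number.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original answer):
// turns "1,2,3" into {1, 2, 3} using the peek/ignore technique.
std::vector<int> parse_int_list(const std::string& text) {
    std::vector<int> values;
    std::istringstream in(text);
    int value;
    while (in >> value) {          // operator>> skips leading whitespace
        values.push_back(value);
        if (in.peek() == ',')      // drop exactly one separator, if present
            in.ignore();
    }
    return values;
}
```

A call such as parse_int_list("1,2,3,4,5,6") returns the six numbers in order; an empty string yields an empty vector.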
Answer by Zoomulator
Something less verbose that uses only the standard library and accepts anything separated by commas.
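#include <sstream>
#include <string>
#include <vector>
using namespace std;
int main()
{
    stringstream ss( "1,1,1,1, or something else ,1,1,1,0" );
    vector<string> result;
    while( ss.good() )
    {
        string substr;
        getline( ss, substr, ',' );  // read up to the next comma
        result.push_back( substr );
    }
}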
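Since the question asks for integers, the string fields still need converting. One possible follow-up, sketched here with a hypothetical helper split_to_ints, splits with std::getline and converts each field with std::stoi, skipping fields that do not start with a number. Note that std::stoi accepts trailing junk, so a field like "12abc" would be read as 12.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not from the original answer): splits on ','
// and keeps only the fields that std::stoi can convert.
std::vector<int> split_to_ints(const std::string& text) {
    std::vector<int> values;
    std::istringstream in(text);
    std::string field;
    while (std::getline(in, field, ',')) {
        try {
            values.push_back(std::stoi(field));
        } catch (const std::invalid_argument&) {
            // Field has no leading number (empty, or plain text): skip it.
        } catch (const std::out_of_range&) {
            // Number does not fit in an int: skip it.
        }
    }
    return values;
}
```

Whether bad fields should be skipped, reported, or treated as a hard error is a design choice; skipping is only an assumption made for this sketch.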
Answer by Jerry Coffin
Yet another, rather different, approach: use a special locale that treats commas as white space:
#include <locale>
#include <vector>
struct csv_reader: std::ctype<char> {
    csv_reader(): std::ctype<char>(get_table()) {}
    static std::ctype_base::mask const* get_table() {
        static std::vector<std::ctype_base::mask> rc(table_size, std::ctype_base::mask());
        rc[','] = std::ctype_base::space;
        rc['\n'] = std::ctype_base::space;
        rc[' '] = std::ctype_base::space;
        return &rc[0];
    }
};
To use this, you imbue() a stream with a locale that includes this facet. Once you've done that, you can read numbers as if the commas weren't there at all. For example, we'll read comma-delimited numbers from standard input and write them out one per line on standard output:
#include <algorithm>
#include <iterator>
#include <iostream>
int main() {
    std::cin.imbue(std::locale(std::locale(), new csv_reader()));
    std::copy(std::istream_iterator<int>(std::cin),
        std::istream_iterator<int>(),
        std::ostream_iterator<int>(std::cout, "\n"));
    return 0;
}
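The question starts from a std::string rather than standard input, so the same facet idea can be applied to a std::istringstream. The sketch below is an adaptation, not the answer's own code: the names comma_is_space and parse_with_locale are hypothetical, and the table is copied from std::ctype<char>::classic_table() so that all other characters keep their usual classification.

```cpp
#include <iterator>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical facet, same idea as csv_reader: ',' counts as whitespace.
struct comma_is_space : std::ctype<char> {
    comma_is_space() : std::ctype<char>(get_table()) {}
    static const mask* get_table() {
        // Copy the classic "C" table, then mark ',' as whitespace too.
        static std::vector<mask> table(classic_table(), classic_table() + table_size);
        table[','] |= space;
        return table.data();
    }
};

// Hypothetical helper: reads every int from a comma-delimited string.
std::vector<int> parse_with_locale(const std::string& text) {
    std::istringstream in(text);
    // The locale takes ownership of the facet and deletes it when done.
    in.imbue(std::locale(in.getloc(), new comma_is_space));
    return std::vector<int>(std::istream_iterator<int>(in),
                            std::istream_iterator<int>());
}
```

Because commas are treated like spaces, consecutive separators such as "1,,2" are simply skipped rather than reported as empty fields.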
Answer by Jerry Coffin
The C++ String Toolkit Library (Strtk) has the following solution to your problem:
#include <string>
#include <deque>
#include <vector>
#include "strtk.hpp"
int main()
{
    std::string int_string = "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15";
    std::vector<int> int_list;
    strtk::parse(int_string,",",int_list);
    std::string double_string = "123.456|789.012|345.678|901.234|567.890";
    std::deque<double> double_list;
    strtk::parse(double_string,"|",double_list);
    return 0;
}
More examples can be found here.
Answer by TC.
Alternative solution using generic algorithms and Boost.Tokenizer:
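#include <boost/tokenizer.hpp>
#include <algorithm>
#include <cstdlib>
#include <iterator>
#include <string>
#include <vector>
using namespace std;
using boost::tokenizer;
struct ToInt
{
    int operator()(string const &str) const { return atoi(str.c_str()); }
};
string values = "1,2,3,4,5,9,8,7,6";
vector<int> ints;
tokenizer<> tok(values);
transform(tok.begin(), tok.end(), back_inserter(ints), ToInt());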
Answer by kiamlaluno
You could also use the following function.
void tokenize(const string& str, vector<string>& tokens, const string& delimiters = ",")
{
    // Skip delimiters at the beginning.
    string::size_type lastPos = str.find_first_not_of(delimiters, 0);
    // Find the first delimiter after the start of the token.
    string::size_type pos = str.find_first_of(delimiters, lastPos);
    while (string::npos != pos || string::npos != lastPos) {
        // Found a token, add it to the vector.
        tokens.push_back(str.substr(lastPos, pos - lastPos));
        // Skip delimiters.
        lastPos = str.find_first_not_of(delimiters, pos);
        // Find the next delimiter.
        pos = str.find_first_of(delimiters, lastPos);
    }
}
Answer by Timmmm
Lots of pretty terrible answers here so I'll add mine (including test program):
#include <string>
#include <iostream>
#include <cstddef>
template<typename StringFunction>
void splitString(const std::string &str, char delimiter, StringFunction f) {
    std::size_t from = 0;
    for (std::size_t i = 0; i < str.size(); ++i) {
        if (str[i] == delimiter) {
            f(str, from, i);
            from = i + 1;
        }
    }
    if (from <= str.size())
        f(str, from, str.size());
}
int main(int argc, char* argv[]) {
    if (argc != 2)
        return 1;
    splitString(argv[1], ',', [](const std::string &s, std::size_t from, std::size_t to) {
        std::cout << "`" << s.substr(from, to - from) << "`\n";
    });
    return 0;
}
Nice properties:
- No dependencies (e.g. Boost)
- Not an insane one-liner
- Easy to understand (I hope)
- Handles spaces perfectly fine
- Doesn't allocate splits if you don't want to, e.g. you can process them with a lambda as shown.
- Doesn't add characters one at a time, so it should be fast.
- If using C++17 you could change it to use a std::string_view, and then it won't do any allocations and should be extremely fast.
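The C++17 variant mentioned in the last point might look roughly like this. It is a sketch rather than the answer's own code: split_view and split_to_vector are hypothetical names, and the callback here receives each piece as a std::string_view instead of a (string, from, to) triple. The wrapper that collects pieces into a vector does allocate and is included only to make the behaviour easy to check.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical C++17 variant of splitString: no std::string is created,
// each piece is handed to f as a view into the original text.
template <typename PieceFunction>
void split_view(std::string_view text, char delimiter, PieceFunction f) {
    std::size_t from = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == delimiter) {
            f(text.substr(from, i - from));
            from = i + 1;
        }
    }
    f(text.substr(from));  // final piece, possibly empty
}

// Hypothetical convenience wrapper; this one copies each piece.
std::vector<std::string> split_to_vector(std::string_view text, char delimiter) {
    std::vector<std::string> pieces;
    split_view(text, delimiter, [&pieces](std::string_view piece) {
        pieces.emplace_back(piece);
    });
    return pieces;
}
```

As in the original function, empty entries are kept and an empty input produces a single empty piece. The views passed to the callback are only valid while the underlying text is alive.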
Some design choices you may wish to change:
- Empty entries are not ignored.
- An empty string will call f() once.
Example inputs and outputs:
"" -> {""}
"," -> {"", ""}
"1," -> {"1", ""}
"1" -> {"1"}
" " -> {" "}
"1, 2," -> {"1", " 2", ""}
" ,, " -> {" ", "", " "}
Answer by Michael Krelin - hacker
#include <cstdlib>
#include <string>
#include <vector>
std::string input="1,1,1,1,2,1,1,1,0";
std::vector<long> output;
for(std::string::size_type p0=0,p1=input.find(',');
    p1!=std::string::npos || p0!=std::string::npos;
    (p0=(p1==std::string::npos)?p1:++p1),p1=input.find(',',p0) )
    output.push_back( strtol(input.c_str()+p0,NULL,0) );
It would be a good idea to check for conversion errors in strtol(), of course. The code may benefit from some other error checks as well.
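One way such checks could look is sketched below. The function name parse_longs_checked is hypothetical, and the sketch makes two assumptions that differ from the snippet above: it uses base 10 instead of base 0, and it treats empty fields (including an empty input or a trailing comma) as errors.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: appends each number to output and returns false
// at the first field that is empty, non-numeric, or out of range for long.
bool parse_longs_checked(const std::string& input, std::vector<long>& output) {
    const char* p = input.c_str();
    while (true) {
        char* end = nullptr;
        errno = 0;
        const long value = std::strtol(p, &end, 10);
        if (end == p) return false;         // no digits were consumed
        if (errno == ERANGE) return false;  // value overflowed long
        output.push_back(value);
        if (*end == '\0') return true;      // reached the end of the input
        if (*end != ',') return false;      // unexpected character after a number
        p = end + 1;                        // step over the comma
    }
}
```

Numbers parsed before the failing field remain in output, so callers that need all-or-nothing behaviour would parse into a temporary vector first.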
Answer by Jonathan H
I'm surprised no one has proposed a solution using std::regex yet:
#include <string>
#include <algorithm>
#include <iterator>
#include <vector>
#include <regex>
void parse_csint( const std::string& str, std::vector<int>& result ) {
    typedef std::regex_iterator<std::string::const_iterator> re_iterator;
    typedef re_iterator::value_type re_iterated;
    std::regex re("(\\d+)");  // the backslash must be escaped inside a string literal
    re_iterator rit( str.begin(), str.end(), re );
    re_iterator rend;
    std::transform( rit, rend, std::back_inserter(result),
        []( const re_iterated& it ){ return std::stoi(it[1]); } );
}
This function appends every integer found in the string to the result vector. You can tweak the regular expression to include negative integers, floating point numbers, etc.
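As an illustration of that tweak, the sketch below accepts an optional leading minus sign. The function name parse_signed_ints is hypothetical, and a raw string literal is used so the backslash does not need escaping. Note that with this pattern a string like "5-3" yields 5 and -3, and std::stoi will throw std::out_of_range for numbers too large for an int.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical variant: collects every (optionally negative) integer in str.
std::vector<int> parse_signed_ints(const std::string& str) {
    static const std::regex re(R"(-?\d+)");
    std::vector<int> result;
    for (std::sregex_iterator it(str.begin(), str.end(), re), end; it != end; ++it)
        result.push_back(std::stoi(it->str()));
    return result;
}
```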
Answer by Steve Jessop
#include <sstream>
#include <vector>
const char *input = "1,1,1,1,2,1,1,1,0";
int main() {
    std::stringstream ss(input);
    std::vector<int> output;
    int i;
    while (ss >> i) {
        output.push_back(i);
        ss.ignore(1);
    }
}
Bad input (for instance consecutive separators) will mess this up, but you did say simple.