Original question: http://stackoverflow.com/questions/27318631/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Parsing through a csv file in Qt
Asked by user3878223
Is anyone familiar with how to parse through a csv file and put it inside a string list? Right now I am taking the entire csv file and putting it into the string list. I am trying to figure out if there is a way to get only the first column.
#include "searchwindow.h"
#include <QtGui/QApplication>
#include <QApplication>
#include <QStringList>
#include <QLineEdit>
#include <QCompleter>
#include <QHBoxLayout>
#include <QWidget>
#include <QLabel>
#include <qfile.h>
#include <QTextStream>
int main(int argc, char *argv[])
{
    QApplication a(argc, argv);
    QWidget *widget = new QWidget();
    QHBoxLayout *layout = new QHBoxLayout();
    QStringList wordList;
    QFile f("FlightParam.csv");
    if (f.open(QIODevice::ReadOnly))
    {
        //file opened successfully
        QString data;
        data = f.readAll();
        wordList = data.split(',');
        f.close();
    }
    QLabel *label = new QLabel("Select");
    QLineEdit *lineEdit = new QLineEdit;
    label->setBuddy(lineEdit);
    QCompleter *completer = new QCompleter(wordList);
    completer->setCaseSensitivity(Qt::CaseInsensitive); //Make caseInsensitive selection
    lineEdit->setCompleter(completer);
    layout->addWidget(label);
    layout->addWidget(lineEdit);
    widget->setLayout(layout);
    widget->showMaximized();
    return a.exec();
}
Answer by lpapp
There you go:
FlightParam.csv
1,2,3,
4,5,6,
7,8,9,
main.cpp
#include <QFile>
#include <QStringList>
#include <QDebug>
int main()
{
    QFile file("FlightParam.csv");
    if (!file.open(QIODevice::ReadOnly)) {
        qDebug() << file.errorString();
        return 1;
    }
    QStringList wordList;
    while (!file.atEnd()) {
        QByteArray line = file.readLine();
        wordList.append(line.split(',').first());
    }
    qDebug() << wordList;
    return 0;
}
main.pro
TEMPLATE = app
TARGET = main
QT = core
SOURCES += main.cpp
Build and Run
qmake && make && ./main
Output
("1", "4", "7")
Answer by Shf
What you are looking for is the QTextStream class. It provides all kinds of interfaces for reading and writing files.
A simple example:
QStringList firstColumn;
QFile f1("h:/1.txt");
f1.open(QIODevice::ReadOnly);
QTextStream s1(&f1);
while (!s1.atEnd()) {
    QString s = s1.readLine(); // reads line from file
    firstColumn.append(s.split(",").first()); // appends first column to list, ',' is separator
}
f1.close();
Alternatively, you can do something like this, which gives the same result:
// f is an open QFile and wordList a QStringList; readAll() returns a QByteArray,
// so convert it to QString before splitting the file into lines
wordList = QString(f.readAll()).split(QRegExp("[\r\n]"), QString::SkipEmptyParts);
for (int i = 0; i < wordList.count(); i++)
    wordList[i] = wordList[i].split(",").first(); // replacing whole row with only first value
f.close();
Answer by Jason C
Here is the code I usually use. I'm the author; consider this as-is, public domain. It has a similar feature set and concept to CodeLurker's code, except the state machine is represented differently and the code is a bit shorter.
#include <QTextStream>
#include <QStringList>
#include <stdexcept>
bool readCSVRow (QTextStream &in, QStringList *row) {
    static const int delta[][5] = {
        //  ,    "   \n    ?  eof
        {   1,   2,  -1,   0,  -1  }, // 0: parsing (store char)
        {   1,   2,  -1,   0,  -1  }, // 1: parsing (store column)
        {   3,   4,   3,   3,  -2  }, // 2: quote entered (no-op)
        {   3,   4,   3,   3,  -2  }, // 3: parsing inside quotes (store char)
        {   1,   3,  -1,   0,  -1  }, // 4: quote exited (no-op)
        // -1: end of row, store column, success
        // -2: eof inside quotes
    };
    row->clear();
    if (in.atEnd())
        return false;
    int state = 0, t;
    char ch;
    QString cell;
    while (state >= 0) {
        if (in.atEnd())
            t = 4;
        else {
            in >> ch;
            if (ch == ',') t = 0;
            else if (ch == '\"') t = 1;
            else if (ch == '\n') t = 2;
            else if (ch == '\r') continue;
            else t = 3;
        }
        state = delta[state][t];
        switch (state) {
        case 0:
        case 3:
            cell += ch;
            break;
        case -1:
        case 1:
            row->append(cell);
            cell = "";
            break;
        }
    }
    if (state == -2)
        throw std::runtime_error("End-of-file found while inside quotes.");
    return true;
}
- Parameter: in, a QTextStream.
- Parameter: row, a QStringList that will receive the row.
- Returns: true if a row was read, false if EOF.
- Throws: std::runtime_error if an error occurs.
It parses Excel style CSV's, handling quotes and double-quotes appropriately, and allows newlines in fields. Handles Windows and Unix line endings properly as long as your file is opened with QFile::Text. I don't think Qt supports old-school Mac line endings, and this doesn't support binary-mode untranslated line-endings, but for the most part this shouldn't be a problem these days.
Other notes:
- Unlike CodeLurker's implementation, this intentionally fails if EOF is hit inside quotes. If you change the -2's to -1's in the state table then it will be forgiving.
- Parses x"y"z as xyz; I wasn't sure what the rule for mid-string quotes was. I have no idea if this is correct.
- Performance and memory characteristics are the same as CodeLurker's (i.e. very good).
- Does not support Unicode (converts to ISO-8859-1), but changing to QChar should be trivial.
Example:
QFile csv(filename);
csv.open(QFile::ReadOnly | QFile::Text);
QTextStream in(&csv);
QStringList row;
while (readCSVRow(in, &row))
    qDebug() << row;
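To see how the transition table in this answer behaves on concrete input, here is a rough standalone sketch. It is not the answer's code: std::string and std::vector stand in for QTextStream and QStringList (an assumption made only so it builds without Qt), and the name readCsvRowStd and the pos cursor are hypothetical.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Standalone port of the transition table above. `pos` is a hypothetical
// cursor into `in`; it is advanced past the row that was consumed.
bool readCsvRowStd(const std::string &in, std::size_t &pos,
                   std::vector<std::string> &row) {
    static const int delta[][5] = {
        //  ,   "  \n   ?  eof
        {  1,  2, -1,  0, -1 }, // 0: parsing (store char)
        {  1,  2, -1,  0, -1 }, // 1: parsing (store column)
        {  3,  4,  3,  3, -2 }, // 2: quote entered (no-op)
        {  3,  4,  3,  3, -2 }, // 3: parsing inside quotes (store char)
        {  1,  3, -1,  0, -1 }, // 4: quote exited (no-op)
    };
    row.clear();
    if (pos >= in.size())
        return false;                 // nothing left to read
    int state = 0;
    std::string cell;
    while (state >= 0) {
        int t = 4;                    // eof column unless a char is available
        char ch = '\0';
        if (pos < in.size()) {
            ch = in[pos++];
            if (ch == ',') t = 0;
            else if (ch == '"') t = 1;
            else if (ch == '\n') t = 2;
            else if (ch == '\r') continue;   // carriage returns are skipped
            else t = 3;
        }
        state = delta[state][t];
        if (state == 0 || state == 3) {
            cell += ch;               // store char
        } else if (state == -1 || state == 1) {
            row.push_back(cell);      // store column
            cell.clear();
        }
    }
    if (state == -2)
        throw std::runtime_error("End-of-file found while inside quotes.");
    return true;
}
```

With this port, a row such as x"y"z,"a,b" yields the two cells xyz and a,b, and a doubled quote inside a quoted cell comes out as a single quote character. Input that ends inside an open quote throws, matching the note about the -2 entries.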
Answer by CodeLurker
One might prefer to do it this way:
QStringList MainWindow::parseCSV(const QString &string)
{
    enum State {Normal, Quote} state = Normal;
    QStringList fields;
    QString value;
    for (int i = 0; i < string.size(); i++)
    {
        const QChar current = string.at(i);
        // Normal state
        if (state == Normal)
        {
            // Comma
            if (current == ',')
            {
                // Save field
                fields.append(value.trimmed());
                value.clear();
            }
            // Double-quote
            else if (current == '"')
            {
                state = Quote;
                value += current;
            }
            // Other character
            else
                value += current;
        }
        // In-quote state
        else if (state == Quote)
        {
            // Another double-quote
            if (current == '"')
            {
                if (i < string.size())
                {
                    // A double double-quote?
                    if (i+1 < string.size() && string.at(i+1) == '"')
                    {
                        value += '"';
                        // Skip a second quote character in a row
                        i++;
                    }
                    else
                    {
                        state = Normal;
                        value += '"';
                    }
                }
            }
            // Other character
            else
                value += current;
        }
    }
    if (!value.isEmpty())
        fields.append(value.trimmed());
    // Quotes are left in until here; so when fields are trimmed, only whitespace outside of
    // quotes is removed. The quotes are removed here.
    for (int i=0; i<fields.size(); ++i)
        if (fields[i].length()>=1 && fields[i].left(1)=='"')
        {
            fields[i]=fields[i].mid(1);
            if (fields[i].length()>=1 && fields[i].right(1)=='"')
                fields[i]=fields[i].left(fields[i].length()-1);
        }
    return fields;
}
- Powerful: handles quoted material with commas, double double quotes (which signify a double-quote character) and whitespace right
- Flexible: doesn't fail if the last quote on the last string is forgotten, and handles more complicated CSV files; lets you process one line at a time without having to read the whole file in memory first
- Simple: Just drop this state machine in yer code, right-click on the function name in QtCreator and choose Refactor | Add private declaration, and yer good 2 go.
- Performant: accurately processes CSV lines faster than doing RegEx look-aheads on each character
- Convenient: requires no external library
- Easy to read: The code is intuitive, in case U need 2 modify it.
Edit: I've finally got around to getting this to trim spaces before and after the fields. No whitespace nor commas are trimmed inside quotes. Otherwise, all whitespace is trimmed from the start and end of a field. After puzzling about this for a while, I hit on the idea that the quotes could be left around the field; and so all fields could be trimmed. That way, only whitespace before and after quotes or text is removed. A final step was then added, to strip out quotes for fields that start and end with quotes.
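The trim-then-strip idea from this edit can be shown on a single field. The following is only a sketch under assumptions: it uses std::string instead of QString so it builds without Qt, trimThenUnquote is a hypothetical name, and the input is taken to be a field as the parsing loop leaves it (quotes still attached, doubled quotes already collapsed).

```cpp
#include <string>

// Hypothetical helper mirroring the last two steps described above:
// trim whitespace around a field that still carries its quotes, then
// strip one leading quote and, if the field also ends with one, one
// trailing quote.
std::string trimThenUnquote(const std::string &raw) {
    const char *ws = " \t\r\n\v\f";
    const std::size_t first = raw.find_first_not_of(ws);
    if (first == std::string::npos)
        return std::string();         // field was empty or all whitespace
    const std::size_t last = raw.find_last_not_of(ws);
    std::string field = raw.substr(first, last - first + 1);
    if (!field.empty() && field.front() == '"') {
        field.erase(0, 1);            // drop the opening quote
        if (!field.empty() && field.back() == '"')
            field.pop_back();         // drop the closing quote, if any
    }
    return field;
}
```

Because the quotes are still in place when trimming happens, whitespace inside them is protected, while whitespace outside them is removed.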
Here is a more or less challenging test case:
QStringList sl=
{
    "\"one\"",
    " \" two \"\"\" , \" and a half ",
    "three ",
    "\t four"
};
for (int i=0; i < sl.size(); ++i)
    qDebug() << parseCSV(sl[i]);
This corresponds to the file
"one"
" two """ , " and a half
three
<TAB> four
where <TAB> represents the tab character; and each line is fed into parseCSV() in turn. DON'T write .csv files like this!
Its output is (where qDebug() is representing quotes in the string with \" and putting things in quotes and parens):
("one")
(" two \"", " and a half")
("three")
("four")
You can observe that the quote and the extra spaces were preserved inside the quote for item "two". In the malformed case for "and a half", the space before the quote, and those after the last word, were removed; but the others were not. Missing terminal spaces in this routine could be an indication of a missing terminal quote. Quotes in a field that don't start or end it are just treated as part of a string. A quote isn't removed from the end of a field if one doesn't start it. To detect an error here, just check for a field that starts with a quote, but doesn't end with one; and/or one that contains quotes but doesn't start and end with one, in the final loop.
More than was needed for yer test case, I know; but a solid general answer to the question, nonetheless, perhaps for others who have found it.
Adapted from: https://github.com/hnaohiro/qt-csv/blob/master/csv.cpp
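The error check suggested in this answer (a field that contains quotes but does not both start and end with one) could look roughly like this. It is a sketch, not part of the answer's code: looksMalformed is a hypothetical name, std::string is used so it builds without Qt, and it assumes the check runs on each trimmed field before the final quote-stripping loop.

```cpp
#include <string>

// Hypothetical check, applied to a trimmed field before its quotes are
// stripped: true when the field contains a quote character but is not
// wrapped in a leading and a trailing quote.
bool looksMalformed(const std::string &field) {
    if (field.find('"') == std::string::npos)
        return false;                 // no quotes at all: nothing to flag
    const bool wrapped = field.size() >= 2 &&
                         field.front() == '"' && field.back() == '"';
    return !wrapped;
}
```

In the test case above, the trimmed field that begins the "and a half" item starts with a quote but does not end with one, so this check would flag it, while "one" and three would pass.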
Answer by iamantony
Try the qtcsv library for reading and writing csv-files. Example:
#include <QList>
#include <QStringList>
#include <QDir>
#include <QDebug>
#include "qtcsv/stringdata.h"
#include "qtcsv/reader.h"
#include "qtcsv/writer.h"
int main()
{
    // prepare data that you want to save to csv-file
    QStringList strList;
    strList << "one" << "two" << "three";
    QtCSV::StringData strData;
    strData.addRow(strList);
    strData.addEmptyRow();
    strData << strList << "this is the last row";
    // write to file
    QString filePath = QDir::currentPath() + "/test.csv";
    QtCSV::Writer::write(filePath, strData);
    // read data from file
    QList<QStringList> readData = QtCSV::Reader::readToList(filePath);
    for ( int i = 0; i < readData.size(); ++i )
    {
        qDebug() << readData.at(i).join(",");
    }
    return 0;
}
I tried to make it small and easy-to-use. See the Readme file for library documentation and other code examples.
Answer by Dieu Linh
QStringList lines = data.split('\n');
then
for (const QString &line : lines)
    column1.append(line.split(',').first());
Here column1 is the QStringList that receives the first column. QStringList has no add function, so append is used instead.