Using Weka Java Code - How to Convert CSV (without header row) to ARFF Format?
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/3517186/
Asked by Greg
I'm using the Weka Java library to read in a CSV file and convert it to an ARFF file.
The problem is that the CSV file doesn't have a header row, only data. How do I assign attribute names after I bring in the CSV file? (all the columns would be string data types)
Here is the code I have so far:
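import java.io.File;
import weka.core.Instances;
import weka.core.converters.ArffSaver;
import weka.core.converters.CSVLoader;
// Load the CSV file; CSVFilePath holds the path to the input file
CSVLoader loader = new CSVLoader();
loader.setSource(new File(CSVFilePath));
Instances data = loader.getDataSet();
// Save the loaded instances in ARFF format; outputFilePath is the target path
ArffSaver saver = new ArffSaver();
saver.setInstances(data);
saver.setFile(new File(outputFilePath));
saver.writeBatch();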
I tried looking through the Weka source code to figure this out but I couldn't make heads or tails of it :-(
Accepted answer by michaeltwofish
The short answer is, you can't assign attribute names after you read in the file.
CSVLoader assumes the first line of the CSV is the header. If the first line is actually a data row, CSVLoader will use its values as the attribute names instead of loading it as an instance, which is definitely not what you want.
Before the code above, you need to read the file in, write a header row, and save the file again.
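The read, add header, save step described above could be sketched roughly as follows. This is only an illustration using the standard Java library: the class name `CsvHeaderPrepender`, the placeholder names `col1`, `col2`, ... and the choice of UTF-8 are all assumptions, not part of the original answer. It also assumes the attribute names contain no commas or quotes that would need escaping.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CsvHeaderPrepender {

    /** Builds placeholder names col1..colN (a hypothetical naming scheme). */
    public static List<String> defaultNames(int columnCount) {
        List<String> names = new ArrayList<>();
        for (int i = 1; i <= columnCount; i++) {
            names.add("col" + i);
        }
        return names;
    }

    /** Writes a header line to output, followed by the unchanged rows of input. */
    public static void prependHeader(Path input, Path output, List<String> names)
            throws IOException {
        // Read every data row first, so the input is fully loaded before writing.
        List<String> rows = Files.readAllLines(input, StandardCharsets.UTF_8);
        try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            out.write(String.join(",", names));
            out.newLine();
            for (String row : rows) {
                out.write(row);
                out.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Small demo on temporary files with made-up sample data.
        Path input = Files.createTempFile("no-header", ".csv");
        Path output = Files.createTempFile("with-header", ".csv");
        Files.write(input, Arrays.asList("red,small,yes", "blue,large,no"),
                StandardCharsets.UTF_8);

        prependHeader(input, output, defaultNames(3));

        for (String line : Files.readAllLines(output, StandardCharsets.UTF_8)) {
            System.out.println(line);
        }
    }
}
```

The file written by `prependHeader` can then be passed to `loader.setSource(new File(...))` in the question's code, so that CSVLoader picks up the generated names as the header.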
Answer by maledr53
You can use the option -H if you have no header row present in the data.
CSVLoader loader = new CSVLoader();
loader.setSource(new File(CSVFilePath));
// -H tells CSVLoader that the data has no header row
String[] options = {"-H"};
loader.setOptions(options);
Instances data = loader.getDataSet();
See: http://weka.sourceforge.net/doc.dev/weka/core/converters/CSVLoader.html
Answer by maledr53
My solution:
SELECT 'nameColumn1','nameColumn2'
UNION
SELECT idColumn1,idColumn2
FROM path
INTO OUTFILE '/tmp/w.csv'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';
nameColumn1 and nameColumn2 are the column headers that will appear as the first line of the CSV file.

