Java: Merge CSV files into a single file with no repeated headers
Original question: http://stackoverflow.com/questions/18020364/
This content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Merge CSV files into a single file with no repeated headers
Asked by MxLDevs
I have some CSV files with the same column headers. For example:
File A
header1,header2,header3
one,two,three
four,five,six
File B
header1,header2,header3
seven,eight,nine
ten,eleven,twelve
I want to merge them so that all the data ends up in one file, with the headers at the top and nowhere else:
header1,header2,header3
one,two,three
four,five,six
seven,eight,nine
ten,eleven,twelve
What is a good way to achieve this?
Accepted answer by Ravi Thapliyal
This should work. It checks that each file being merged has the same headers as the first file and throws an exception otherwise. The data rows are appended to the first file. Exception handling (closing the streams and so on) has been left as an exercise.
import java.io.*;
import java.util.*;

// Assumes listOfFilesToBeMerged is a List<File> holding every file except firstFile,
// and that FileMergeException is a custom exception defined elsewhere.
String[] headers = null;
String firstFile = "/path/to/firstFile.dat";
Scanner scanner = new Scanner(new File(firstFile));

// Read the header row of the first file
if (scanner.hasNextLine())
    headers = scanner.nextLine().split(",");
scanner.close();

Iterator<File> iterFiles = listOfFilesToBeMerged.iterator();
// Open the first file in append mode; rows from the other files are added to it
BufferedWriter writer = new BufferedWriter(new FileWriter(firstFile, true));

while (iterFiles.hasNext()) {
    File nextFile = iterFiles.next();
    BufferedReader reader = new BufferedReader(new FileReader(nextFile));

    // The first line of each file must match the headers of the first file
    String line = null;
    String[] firstLine = null;
    if ((line = reader.readLine()) != null)
        firstLine = line.split(",");

    if (!Arrays.equals(headers, firstLine))
        throw new FileMergeException("Header mis-match between CSV files: '" +
                firstFile + "' and '" + nextFile.getAbsolutePath() + "'");

    // Copy the remaining (data) lines
    while ((line = reader.readLine()) != null) {
        writer.write(line);
        writer.newLine();
    }
    reader.close();
}
writer.close();
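The exception handling that the answer leaves as an exercise could be sketched with try-with-resources (Java 7+), which closes each reader and the writer even when an exception is thrown. This is only one possible approach, not the answerer's code: the class name SafeCsvMerge and the method merge are hypothetical, the sketch writes to a separate target file instead of appending to the first file, it compares the raw header lines rather than the split arrays, and it throws a plain IOException in place of FileMergeException.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class SafeCsvMerge {

    // Hypothetical helper: merges the inputs into target, keeping a single header row.
    // Assumes target is not one of the input files.
    public static void merge(List<Path> inputs, Path target) throws IOException {
        String header = null;
        // try-with-resources closes the writer even if an exception is thrown
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (Path input : inputs) {
                try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
                    String firstLine = reader.readLine();
                    if (firstLine == null) {
                        continue; // empty file: nothing to merge
                    }
                    if (header == null) {
                        // First non-empty file: its header becomes the merged header
                        header = firstLine;
                        writer.write(header);
                        writer.newLine();
                    } else if (!header.equals(firstLine)) {
                        throw new IOException("Header mismatch in '" + input + "'");
                    }
                    // Copy the remaining (data) lines
                    String line;
                    while ((line = reader.readLine()) != null) {
                        writer.write(line);
                        writer.newLine();
                    }
                }
            }
        }
    }
}
```

A call such as SafeCsvMerge.merge(Arrays.asList(fileA, fileB), fileC) would then produce the merged file from the question, assuming fileA and fileB hold the two sample inputs.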
Answer by assylias
Here is an example:
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Both methods belong inside a class of your choice.
public static void main(String[] args) throws IOException {
    List<Path> paths = Arrays.asList(Paths.get("c:/temp/file1.csv"), Paths.get("c:/temp/file2.csv"));
    List<String> mergedLines = getMergedLines(paths);
    Path target = Paths.get("c:/temp/merged.csv");
    Files.write(target, mergedLines, StandardCharsets.UTF_8);
}

private static List<String> getMergedLines(List<Path> paths) throws IOException {
    List<String> mergedLines = new ArrayList<>();
    for (Path p : paths) {
        List<String> lines = Files.readAllLines(p, StandardCharsets.UTF_8);
        if (!lines.isEmpty()) {
            if (mergedLines.isEmpty()) {
                mergedLines.add(lines.get(0)); // add header only once
            }
            mergedLines.addAll(lines.subList(1, lines.size())); // skip this file's header
        }
    }
    return mergedLines;
}
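Files.readAllLines loads every line of every input into memory before anything is written, which is fine for small files. For large inputs, a streaming variant might look like the sketch below. It assumes Java 8 or later (for Files.lines), and the class name StreamingCsvMerge and method merge are hypothetical names chosen for illustration. Like the code above, it takes the header from the first non-empty file and does not compare headers between files.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

public class StreamingCsvMerge {

    // Hypothetical helper: copies the inputs to target line by line,
    // writing the header of the first non-empty file only once.
    public static void merge(List<Path> inputs, Path target) throws IOException {
        boolean headerWritten = false;
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (Path input : inputs) {
                // Files.lines reads lazily; the stream must be closed to release the file
                try (Stream<String> lines = Files.lines(input, StandardCharsets.UTF_8)) {
                    Iterator<String> it = lines.iterator();
                    if (!it.hasNext()) {
                        continue; // empty file
                    }
                    String header = it.next();
                    if (!headerWritten) {
                        writer.write(header);
                        writer.newLine();
                        headerWritten = true;
                    }
                    // Everything after the first line is data
                    while (it.hasNext()) {
                        writer.write(it.next());
                        writer.newLine();
                    }
                }
            }
        }
    }
}
```

Only one line per file is held in memory at a time, at the cost of a slightly longer method than the readAllLines version.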
Answer by Conor O'Neill
It seems a bit heavyweight to do this in Java. It's trivial in a Linux shell:
(cat FileA ; tail --lines=+2 FileB) > FileC
Answer by Sergio
Before:
idFile#x_y.csv
After:
idFile.csv
For example:
100#1_2.csv + 100#2_2.csv > 100.csv
100#1_2.csv contains:
"one","two","three"
"a","b","c"
"d","e","f"
100#2_2.csv contains:
"one","two","three"
"g","h","i"
"j","k","l"
100.csv contains:
"one","two","three"
"a","b","c"
"d","e","f"
"g","h","i"
"j","k","l"
Source:
// MergeDemo.java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class MergeDemo {

    public static void main(String[] args) {
        String idFile = "100";
        int numFiles = 2; // number of parts, matching the 100#x_2.csv example above
        try {
            mergeCsvFiles(idFile, numFiles);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private static void mergeCsvFiles(String idFile, int numFiles) throws IOException {
        // Output file; the C:\out directory must already exist
        String csvFinal = "C:\\out\\" + idFile + ".csv";

        // Input files: C:\in\<idFile>#<i>_<numFiles>.csv
        List<File> files = new ArrayList<File>();
        for (int i = 1; i <= numFiles; i++) {
            files.add(new File("C:\\in\\" + idFile + "#" + i + "_" + numFiles + ".csv"));
        }

        // Opening the FileWriter without append mode creates the output file
        // or truncates an existing one, so no explicit delete is needed.
        try (BufferedWriter fileWriter = new BufferedWriter(new FileWriter(csvFinal))) {
            boolean headerWritten = false;
            for (File file : files) {
                try (BufferedReader fileReader = new BufferedReader(new FileReader(file))) {
                    // The first line of every file is its header; write it only once
                    String line = fileReader.readLine();
                    if (line != null && !headerWritten) {
                        fileWriter.write(line);
                        fileWriter.newLine();
                        headerWritten = true;
                    }
                    // Copy the remaining (data) lines
                    while ((line = fileReader.readLine()) != null) {
                        fileWriter.write(line);
                        fileWriter.newLine();
                    }
                }
            }
        }
    }
}
Answer by user3863921
I'm late here, but Fuzzy-Csv (https://github.com/kayr/fuzzy-csv/) was designed for just this.
This is what the code would look like:
String csv1 = "NAME,SURNAME,AGE\n" +
        "Fred,Krueger,Unknown";

String csv2 = "NAME,MIDDLENAME,SURNAME,AGE\n" +
        "Jason,Noname,Scarry,16";

FuzzyCSVTable t1 = FuzzyCSVTable.parseCsv(csv1);
FuzzyCSVTable t2 = FuzzyCSVTable.parseCsv(csv2);

FuzzyCSVTable output = t1.mergeByColumn(t2);
output.printTable();
Output:
╔═══════╤═════════╤═════════╤════════════╗
║ NAME │ SURNAME │ AGE │ MIDDLENAME ║
╠═══════╪═════════╪═════════╪════════════╣
║ Fred │ Krueger │ Unknown │ - ║
╟───────┼─────────┼─────────┼────────────╢
║ Jason │ Scarry │ 16 │ Noname ║
╚═══════╧═════════╧═════════╧════════════╝
You can re-export your CSV using one of the helper methods:
output.write("FilePath.csv");
or
output.toCsvString()