Original question: http://stackoverflow.com/questions/24744670/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverFlow.
How to sort data in a CSV file using a particular field in Java?
Asked by Srikanth Kandalam
I want to read a CSV file in Java and sort it using a particular column. My CSV file looks like this:
ABC,DEF,11,GHI....
JKL,MNO,10,PQR....
STU,VWX,12,XYZ....
Considering I want to sort it using the third column, my output should look like:
JKL,MNO,10,PQR....
ABC,DEF,11,GHI....
STU,VWX,12,XYZ....
After some research on which data structure to use for holding the CSV data, people here suggested, in this question, using a Map with Integer keys and List values:
Map<Integer, List<String>>
where the value, List<String> = {[ABC,DEF,11,GHI....], [JKL,MNO,10,PQR....],[STU,VWX,12,XYZ....]...}
And the key will be an auto-incremented integer starting from 0.
So could anyone please suggest a way to sort this Map using an element in the 'List' in Java? Also if you think this choice of data structure is bad, please feel free to suggest an easier data structure to do this.
Thank you.
Accepted answer by AlexWien
I would use an ArrayList of ArrayList of String:
ArrayList<ArrayList<String>>
Each entry is one line, which is a list of strings. You initialize the list by:
List<ArrayList<String>> csvLines = new ArrayList<ArrayList<String>>();
To get the nth line:
List<String> line = csvLines.get(n);
To sort, write a custom Comparator. You can pass the field position used for sorting to the constructor of that comparator.
The compare method then reads the String value at the stored position and converts it to a primitive Java type depending on the position. For example, if you know that position 2 in the CSV holds an integer, convert the String to an int. This is necessary for correct sorting. You may also pass an ArrayList of Class to the constructor so that it knows which field has which type. Then use String.compareTo() or Integer.compare(), depending on the column position.
Edit: example of working code:
List<ArrayList<String>> csvLines = new ArrayList<ArrayList<String>>();
Comparator<ArrayList<String>> comp = new Comparator<ArrayList<String>>() {
    public int compare(ArrayList<String> csvLine1, ArrayList<String> csvLine2) {
        // TODO: convert to the appropriate type depending on the field.
        // This example is for the numeric field at index 2.
        return Integer.valueOf(csvLine1.get(2)).compareTo(Integer.valueOf(csvLine2.get(2)));
    }
};
Collections.sort(csvLines, comp);
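The constructor-based comparator described in this answer could be sketched roughly as follows. This is only an illustration under assumptions: the class name CsvColumnComparator and the boolean numeric flag are hypothetical (the answer suggests passing a list of Class objects instead; a flag keeps the sketch short), and the sample rows are the question's data with the elided trailing fields left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: compares CSV rows by one column chosen at construction time.
public class CsvColumnComparator implements Comparator<List<String>> {
    private final int column;      // zero-based index of the field to sort by
    private final boolean numeric; // true: compare as int, false: compare as text

    public CsvColumnComparator(int column, boolean numeric) {
        this.column = column;
        this.numeric = numeric;
    }

    @Override
    public int compare(List<String> a, List<String> b) {
        String left = a.get(column).trim();
        String right = b.get(column).trim();
        if (numeric) {
            // Throws NumberFormatException if a field is not a valid int.
            return Integer.compare(Integer.parseInt(left), Integer.parseInt(right));
        }
        return left.compareTo(right);
    }

    // Sorts the question's sample rows by the third column (index 2), numerically.
    public static List<List<String>> sortSample() {
        List<List<String>> rows = new ArrayList<>();
        rows.add(Arrays.asList("ABC", "DEF", "11", "GHI"));
        rows.add(Arrays.asList("JKL", "MNO", "10", "PQR"));
        rows.add(Arrays.asList("STU", "VWX", "12", "XYZ"));
        rows.sort(new CsvColumnComparator(2, true));
        return rows;
    }

    public static void main(String[] args) {
        // prints [[JKL, MNO, 10, PQR], [ABC, DEF, 11, GHI], [STU, VWX, 12, XYZ]]
        System.out.println(sortSample());
    }
}
```

Typing the comparator against List<String> rather than ArrayList<String> lets it work with rows built by Arrays.asList as well.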
Answer by ltalhouarne
You can also use a list of lists:
List<List<String>> Llp = new ArrayList<List<String>>();
Then call sort with a custom comparator that compares the third item of each list:
Collections.sort(Llp, new Comparator<List<String>>() {
    @Override
    public int compare(List<String> o1, List<String> o2) {
        try {
            // Compares the third field as text, not as a number.
            return o1.get(2).compareTo(o2.get(2));
        } catch (IndexOutOfBoundsException e) {
            // A row with fewer than three fields compares as equal.
            return 0;
        }
    }
});
Answer by Peter Lawrey
In Java 8 you can do
SortedMap<Integer, List<String>> collect = Files.lines(Paths.get(filename))
        .collect(Collectors.groupingBy(
                l -> Integer.valueOf(l.split(",", 4)[2]),
                TreeMap::new, Collectors.toList()));
Note: comparing numbers as Strings is a bad idea, because "100" < "2" in String order, which might not be what you expect.
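A small sketch of that pitfall, assuming the column holds values that parse as int. The class and method names here are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class: contrasts text order with numeric order.
public class StringVsNumericOrder {

    // Sorts a copy of the values in String (lexicographic) order.
    public static List<String> sortedAsText(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        Collections.sort(copy);
        return copy;
    }

    // Sorts a copy of the values by their parsed int value.
    public static List<String> sortedAsNumbers(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        copy.sort(Comparator.comparingInt((String s) -> Integer.parseInt(s)));
        return copy;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("100", "2", "11");
        System.out.println(sortedAsText(values));    // prints [100, 11, 2]
        System.out.println(sortedAsNumbers(values)); // prints [2, 11, 100]
    }
}
```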
I would use a sorted multi-map. If you don't have one handy you can do this.
SortedMap<Integer, List<String>> linesByKey = new TreeMap<>();

public void addLine(String line) {
    // The sort key is the third field (index 2).
    Integer key = Integer.valueOf(line.split(",", 4)[2]);
    List<String> lines = linesByKey.get(key);
    if (lines == null)
        linesByKey.put(key, lines = new ArrayList<>());
    lines.add(line);
}
This produces a collection of lines sorted by number, in which lines with duplicate numbers keep their original relative order. For example, if all the lines have the same number, the order is unchanged.
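To get back a flat, sorted list of lines from such a map, one possible approach is to walk the map's values in key order. The sketch below assumes every line has at least three comma-separated fields and that the third parses as an int. The class name SortedLines and the method sortedLines() are hypothetical, and computeIfAbsent (Java 8) stands in for the null check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical wrapper around the sorted multi-map idea.
public class SortedLines {
    private final SortedMap<Integer, List<String>> linesByKey = new TreeMap<>();

    // Files the line under the int value of its third field (index 2).
    public void addLine(String line) {
        Integer key = Integer.valueOf(line.split(",", 4)[2].trim());
        linesByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(line);
    }

    // Returns all lines in ascending key order; lines sharing a key keep insertion order.
    public List<String> sortedLines() {
        List<String> result = new ArrayList<>();
        for (List<String> group : linesByKey.values()) {
            result.addAll(group);
        }
        return result;
    }

    public static void main(String[] args) {
        SortedLines sorter = new SortedLines();
        sorter.addLine("ABC,DEF,11,GHI");
        sorter.addLine("JKL,MNO,10,PQR");
        sorter.addLine("XXX,YYY,11,ZZZ"); // same key as the first line
        sorter.addLine("STU,VWX,12,XYZ");
        for (String line : sorter.sortedLines()) {
            System.out.println(line);
        }
    }
}
```

In this run the line with key 10 comes first, and the two lines with key 11 appear in the order they were added.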
Answer by Anuj
In the code below, I sort the CSV file by the second column.
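import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// The class name is arbitrary; the original answer showed only the main method.
public class SortCsv {
    public static void main(String[] args) throws IOException {
        String csvFile = "file_1.csv";
        String line = "";
        String cvsSplitBy = ",";
        List<List<String>> llp = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new FileReader(csvFile))) {
            // Read each line and split it into its fields.
            while ((line = br.readLine()) != null) {
                llp.add(Arrays.asList(line.split(cvsSplitBy)));
            }
            // Sort rows by the second field (index 1), compared as text.
            llp.sort(new Comparator<List<String>>() {
                @Override
                public int compare(List<String> o1, List<String> o2) {
                    return o1.get(1).compareTo(o2.get(1));
                }
            });
            System.out.println(llp);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}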
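For the question's actual case (a numeric third column), a file-to-file variant might look like the sketch below. It is a sketch under assumptions: the class name SortCsvByColumn and the method sortByIntColumn are hypothetical, the input and output paths come from the command line, and fields are split on plain commas, so quoted fields containing commas are not handled.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical utility: sorts CSV lines by one int-valued column.
public class SortCsvByColumn {

    // Returns the lines ordered by the int value of the given zero-based column.
    public static List<String> sortByIntColumn(List<String> lines, int column) {
        return lines.stream()
                .sorted(Comparator.comparingInt(
                        (String line) -> Integer.parseInt(line.split(",")[column].trim())))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: SortCsvByColumn <input.csv> <output.csv>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        // Index 2 is the third column, as in the question.
        Files.write(Paths.get(args[1]), sortByIntColumn(lines, 2));
    }
}
```

Because sorting an ordered stream is stable, lines with equal keys keep their original relative order.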