Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/30028070/

Time: 2020-11-02 16:19:59  Source: igfitidea

Best data structure to "group by" and aggregate values in Java?

Tags: java, arraylist, data-structures, group-by

Asked by isaacniu

I created an ArrayList of arrays, as shown below:

ArrayList<Object[]> csvArray = new ArrayList<Object[]>();

As you can see, each element of the ArrayList is an array like {Country, City, Name, Age}.

Now I want to do a "group by" on Country and City (combined), and then take the average Age of the people for each Country+City.

What is the easiest way to achieve this? Or do you have suggestions for a data structure better than ArrayList for these "group by" and aggregation requirements?

Your answers are much appreciated.

Answer by Shineed Basheer

Java 8 gives you a lot of options.

Example:

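 // Requires: import java.util.*; import java.util.stream.Stream;
 //           import static java.util.stream.Collectors.*;
 // Person is assumed to be a simple class with accessible fields name and age.
 Stream<Person> people = Stream.of(new Person("Paul", 24), new Person("Mark", 30), new Person("Will", 28));
 Map<Integer, List<String>> peopleByAge = people
         .collect(groupingBy(p -> p.age, mapping((Person p) -> p.name, toList())));
 System.out.println(peopleByAge);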

If you can use Java 8 and have no specific reason to use a particular data structure, you can go through the tutorial below:

http://java.dzone.com/articles/java-8-group-collections

Answer by Manu

You can check the collections recommended by @duffy356. I can give you a standard solution based on java.util.

I'd use a common Map<Key,Value>, specifically a HashMap.
For the keys, as far as I can see, you'll need an extra plain object that relates country and city. The point is to create a working equals(Object) : boolean method (together with a matching hashCode()). I'd use the Eclipse auto-generator; it gives me the following:

class CountryCityKey {
  // package visibility
  String country;
  String city;

  @Override
  public int hashCode() {
    final int prime = 31;
    int result = 1;
    result = prime * result + ((country == null) ? 0 : country.hashCode());
    result = prime * result + ((city == null) ? 0 : city.hashCode());
    return result;
  }

  @Override
  public boolean equals(Object obj) {
    if (this == obj)
      return true;
    if (obj == null)
      return false;
    if (getClass() != obj.getClass())
      return false;
    CountryCityKey other = (CountryCityKey) obj;
    if (country == null) {
      if (other.country != null)
        return false;
    } else if (!country.equals(other.country))
      return false;
    if (city == null) {
      if (other.city != null)
        return false;
    } else if (!city.equals(other.city))
      return false;
    return true;
  }

}

Now we can group our objects in a HashMap<CountryCityKey, List<MySuperObject>>.

The code for that could be:

Map<CountryCityKey, List<MySuperObject>> group(List<MySuperObject> list) {
  Map<CountryCityKey, List<MySuperObject>> response = new HashMap<>(list.size());
  for (MySuperObject o : list) {
     CountryCityKey key = o.getKey(); // assumes MySuperObject already exposes its key
     List<MySuperObject> l;
     if (response.containsKey(key)) {
        l = response.get(key);
     } else {
        l = new ArrayList<MySuperObject>();
     }
     l.add(o);
     response.put(key, l);
  }
  return response;
}

And there you have it :)
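
If only the average is needed, the lists do not have to be kept at all. Below is a minimal sketch of that idea, assuming rows shaped like the question's {Country, City, Name, Age}. The class name RunningAverageSketch, the method averageAge, and the compact key class (written with java.util.Objects instead of the Eclipse-generated code) are illustrative assumptions, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class RunningAverageSketch {

    // Hypothetical composite key; equals/hashCode cover both fields.
    static final class CountryCityKey {
        final String country;
        final String city;

        CountryCityKey(String country, String city) {
            this.country = country;
            this.city = city;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof CountryCityKey)) return false;
            CountryCityKey other = (CountryCityKey) obj;
            return Objects.equals(country, other.country) && Objects.equals(city, other.city);
        }

        @Override
        public int hashCode() {
            return Objects.hash(country, city);
        }
    }

    // Returns the average age per key; each row is {Country, City, Name, Age}.
    static Map<CountryCityKey, Double> averageAge(List<Object[]> rows) {
        // totals[0] = sum of ages, totals[1] = number of rows seen for the key
        Map<CountryCityKey, long[]> totals = new HashMap<>();
        for (Object[] row : rows) {
            CountryCityKey key = new CountryCityKey((String) row[0], (String) row[1]);
            long[] t = totals.get(key);
            if (t == null) {
                t = new long[2];
                totals.put(key, t);
            }
            t[0] += (Integer) row[3];
            t[1]++;
        }
        Map<CountryCityKey, Double> averages = new HashMap<>();
        for (Map.Entry<CountryCityKey, long[]> e : totals.entrySet()) {
            averages.put(e.getKey(), (double) e.getValue()[0] / e.getValue()[1]);
        }
        return averages;
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {"NL", "Rotterdam", "Kees", 38});
        rows.add(new Object[] {"NL", "Rotterdam", "Peter", 54});
        rows.add(new Object[] {"NL", "Amsterdam", "Suzanne", 51});
        Map<CountryCityKey, Double> averages = averageAge(rows);
        System.out.println(averages.get(new CountryCityKey("NL", "Rotterdam"))); // prints 46.0
    }
}
```

Keeping a sum and a count per key uses constant memory per group, which may matter for a large CSV; the trade-off is that the individual rows are no longer available afterwards.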

Answer by Jesper

You could use Java 8 streams and Collectors.groupingBy for this. For example:

final List<Object[]> data = new ArrayList<>();
data.add(new Object[]{"NL", "Rotterdam", "Kees", 38});
data.add(new Object[]{"NL", "Rotterdam", "Peter", 54});
data.add(new Object[]{"NL", "Amsterdam", "Suzanne", 51});
data.add(new Object[]{"NL", "Rotterdam", "Tom", 17});

final Map<String, List<Object[]>> map = data.stream().collect(
        Collectors.groupingBy(row -> row[0].toString() + ":" + row[1].toString()));

for (final Map.Entry<String, List<Object[]>> entry : map.entrySet()) {
    final double average = entry.getValue().stream()
                                .mapToInt(row -> (int) row[3]).average().getAsDouble();
    System.out.println("Average age for " + entry.getKey() + " is " + average);
}
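
The grouping and the averaging can also be done in one pass by giving groupingBy a downstream collector. The following is a sketch of that variant, assuming the same {Country, City, Name, Age} row layout and the same "Country:City" key; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupAverageSketch {

    // Each row is {Country, City, Name, Age}; the map key is "Country:City".
    static Map<String, Double> averageAgeByCountryCity(List<Object[]> data) {
        return data.stream().collect(
                Collectors.groupingBy(
                        row -> row[0] + ":" + row[1],
                        // downstream collector: average the Age column within each group
                        Collectors.averagingInt(row -> (Integer) row[3])));
    }

    public static void main(String[] args) {
        List<Object[]> data = new ArrayList<>();
        data.add(new Object[] {"NL", "Rotterdam", "Kees", 38});
        data.add(new Object[] {"NL", "Rotterdam", "Peter", 54});
        data.add(new Object[] {"NL", "Amsterdam", "Suzanne", 51});
        data.add(new Object[] {"NL", "Rotterdam", "Tom", 17});

        averageAgeByCountryCity(data).forEach(
                (key, average) -> System.out.println("Average age for " + key + " is " + average));
    }
}
```

Note that a string key built with a separator is only unambiguous as long as the separator cannot appear in a country or city name.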

Answer by swinkler

I would recommend an additional step. You gather your data from the CSV as Object[]. If you wrap the data in a class containing these fields, the Java 8 stream collectors will help you easily. (It also works without a wrapper class, but the wrapper makes the code more readable and understandable.)

Here is an example. It introduces a class Information which contains your given data (country, city, name, age). The class has a constructor that initializes these fields from a given Object[] array, which might help you, BUT the order of the fields has to be fixed (which is usual for CSV):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class CSVExample {

  public static void main(String[] args) {
    ArrayList<Information> csvArray = new ArrayList<>();

    csvArray.add(new Information(new Object[] {"France", "Paris", "Pierre", 34}));
    csvArray.add(new Information(new Object[] {"France", "Paris", "Madeleine", 26}));
    csvArray.add(new Information(new Object[] {"France", "Toulouse", "Sam", 34}));
    csvArray.add(new Information(new Object[] {"Italy", "Rom", "Paul", 44}));

// combining country and city with whitespace delimiter to use it as the map key
    Map<String, List<Information>> collect = csvArray.stream().collect(Collectors.groupingBy(s -> (s.getCountry() + " " + s.getCity())));
//for each key (country and city) print the key and the average age
    collect.forEach((k, v) -> System.out.println(k + " " + v.stream().collect(Collectors.averagingInt(Information::getAge))));
  }
}

class Information {
  private String country;
  private String city;
  private String name;
  private int age;

  public Information(Object[] information) {
    this.country = (String) information[0];
    this.city = (String) information[1];
    this.name = (String) information[2];
    this.age = (Integer) information[3];

  }

  public Information(String country, String city, String name, int age) {
    super();
    this.country = country;
    this.city = city;
    this.name = name;
    this.age = age;
  }

  public String getCountry() {
    return country;
  }

  public String getCity() {
    return city;
  }

  public String getName() {
    return name;
  }

  public int getAge() {
    return age;
  }

  @Override
  public String toString() {
    return "Information [country=" + country + ", city=" + city + ", name=" + name + ", age=" + age + "]";
  }

}

The main method prints a simple output for your question.
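
If more than the average is of interest, the same grouping can collect count, minimum, maximum and average together. Here is a small sketch under the assumption that a cut-down stand-in for the Information class is enough; PersonRow, SummarySketch and ageStats are hypothetical names used only for this illustration.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class SummarySketch {

    // Cut-down stand-in for the Information class (hypothetical name).
    static final class PersonRow {
        final String country;
        final String city;
        final String name;
        final int age;

        PersonRow(String country, String city, String name, int age) {
            this.country = country;
            this.city = city;
            this.name = name;
            this.age = age;
        }
    }

    // Groups by "country city" and gathers count/min/max/average of age per group.
    static Map<String, IntSummaryStatistics> ageStats(List<PersonRow> rows) {
        return rows.stream().collect(
                Collectors.groupingBy(
                        r -> r.country + " " + r.city,
                        Collectors.summarizingInt(r -> r.age)));
    }

    public static void main(String[] args) {
        List<PersonRow> rows = Arrays.asList(
                new PersonRow("France", "Paris", "Pierre", 34),
                new PersonRow("France", "Paris", "Madeleine", 26),
                new PersonRow("France", "Toulouse", "Sam", 34));

        IntSummaryStatistics paris = ageStats(rows).get("France Paris");
        System.out.println(paris.getCount() + " people, average age " + paris.getAverage());
        // prints: 2 people, average age 30.0
    }
}
```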

Answer by MChaker

In Java 8, grouping objects in a collection based on the values of one or more of their properties is simplified by using a Collector.

First, I suggest you add a new class as follows:

class Info {

    private String country;
    private String city;
    private String name;
    private int age;

    public Info(String country,String city,String name,int age){
        this.country=country;
        this.city=city;
        this.name=name;
        this.age=age;
    }

    public String toString() {
         return "("+country+","+city+","+name+","+age+")";
    }

   // getters and setters       

}

Setting up infos:

   ArrayList<Info> infos = new ArrayList<>();


   infos.add(new Info("USA", "Florida", "John", 26));
   infos.add(new Info("USA", "Florida", "James", 18));
   infos.add(new Info("USA", "California", "Alan", 30));

Group by Country+City:

   Map<String, Map<String, List<Info>>> groupByCountryAndCity = infos.stream()
           .collect(Collectors.groupingBy(
                   Info::getCountry,
                   Collectors.groupingBy(Info::getCity)));

   System.out.println(groupByCountryAndCity.get("USA").get("California"));

Output:

[(USA,California,Alan,30)]

The average age of the people for each Country+City:

   Map<String, Map<String, Double>> averageAgeByCountryAndCity = infos.stream()
           .collect(Collectors.groupingBy(
                   Info::getCountry,
                   Collectors.groupingBy(
                           Info::getCity,
                           Collectors.averagingDouble(Info::getAge))));

   System.out.println(averageAgeByCountryAndCity.get("USA").get("Florida"));

Output:

22.0
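
To report every Country+City pair rather than look up one at a time, the nested map can be walked with two loops. The sketch below assumes a nested map of the shape produced above (country, then city, then average); FlattenSketch, flatten and the "/" separator are illustrative choices, and the map in main is filled by hand with the two averages from the example data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class FlattenSketch {

    // Turns country -> city -> average into a single sorted map keyed by "country/city".
    static Map<String, Double> flatten(Map<String, Map<String, Double>> nested) {
        Map<String, Double> flat = new TreeMap<>();
        for (Map.Entry<String, Map<String, Double>> byCountry : nested.entrySet()) {
            for (Map.Entry<String, Double> byCity : byCountry.getValue().entrySet()) {
                flat.put(byCountry.getKey() + "/" + byCity.getKey(), byCity.getValue());
            }
        }
        return flat;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for averageAgeByCountryAndCity
        Map<String, Double> usa = new HashMap<>();
        usa.put("Florida", 22.0);
        usa.put("California", 30.0);
        Map<String, Map<String, Double>> nested = new HashMap<>();
        nested.put("USA", usa);

        System.out.println(flatten(nested)); // prints {USA/California=30.0, USA/Florida=22.0}
    }
}
```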

Answer by Shrini Jaiswal

Please use the code below; I pasted it from my sample app. Happy coding!

// category -> list of cars
Map<String, List<JmCarDistance>> map = new HashMap<String, List<JmCarDistance>>();

for (JmCarDistance jmCarDistance : carDistanceArrayList) {
    String key = jmCarDistance.cartype;
    if (map.containsKey(key)) {
        List<JmCarDistance> list = map.get(key);
        list.add(jmCarDistance);
    } else {
        List<JmCarDistance> list = new ArrayList<JmCarDistance>();
        list.add(jmCarDistance);
        map.put(key, list);
    }
}
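
On Java 8 and later, the containsKey/else branching can be collapsed with Map.computeIfAbsent. The following is a generic sketch of that shortcut; GroupBySketch, groupBy and the {type, model} sample rows are assumptions for illustration and do not come from the sample app above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class GroupBySketch {

    // Groups items into lists keyed by whatever keyOf extracts from each item.
    static <T, K> Map<K, List<T>> groupBy(List<T> items, Function<T, K> keyOf) {
        Map<K, List<T>> map = new HashMap<>();
        for (T item : items) {
            // creates the list the first time a key is seen, then appends to it
            map.computeIfAbsent(keyOf.apply(item), k -> new ArrayList<>()).add(item);
        }
        return map;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: each entry is {car type, model}
        List<String[]> cars = Arrays.asList(
                new String[] {"suv", "model A"},
                new String[] {"sedan", "model B"},
                new String[] {"suv", "model C"});

        Map<String, List<String[]>> byType = groupBy(cars, car -> car[0]);
        System.out.println(byType.get("suv").size()); // prints 2
    }
}
```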

Answer by duffy356

You could use the brownies-collections library from magicwerk.org (http://www.magicwerk.org/page-collections-overview.html).

It offers key lists, which fit your requirements (http://www.magicwerk.org/page-collections-examples.html).