Group by in Java 8 on multiple fields with aggregations

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/32531517/

Time: 2020-11-02 20:19:43  Source: igfitidea

java, java-8

Asked by djhworld

I have a list of domain objects that relate to web access records. These domain objects can stretch into the thousands in number.

I don't have the resources or requirement to store them in a database in raw format, so instead I want to precompute aggregations and put the aggregated data in a database.

I need to aggregate the total bytes transferred in 5-minute windows, like the following SQL query:

select 
  round(request_timestamp, '5') as window, --round timestamp to the nearest 5 minute
  cdn, 
  isp, 
  http_result_code, 
  transaction_time, 
  sum(bytes_transferred)
from web_records
group by 
    round(request_timestamp, '5'), 
    cdn, 
    isp, 
    http_result_code, 
    transaction_time
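
On the Java side, the same 5-minute bucketing has to happen before grouping. One possible sketch, assuming timestamps are available as `java.time.Instant` (the question itself uses `Date`) and that flooring to the start of the window is acceptable instead of rounding to the nearest boundary; `FiveMinuteWindow` and `windowStart` are hypothetical names, not part of the question:

```java
import java.time.Instant;

class FiveMinuteWindow {
    private static final long WINDOW_MILLIS = 5 * 60 * 1000L;

    // Floors a timestamp to the start of its 5-minute window (assumption: floor, not nearest).
    static Instant windowStart(Instant timestamp) {
        long millis = timestamp.toEpochMilli();
        return Instant.ofEpochMilli(Math.floorDiv(millis, WINDOW_MILLIS) * WINDOW_MILLIS);
    }

    public static void main(String[] args) {
        Instant requestTime = Instant.parse("2015-09-11T20:13:42Z");
        System.out.println(windowStart(requestTime)); // prints 2015-09-11T20:10:00Z
    }
}
```

A `Date` can be converted with `date.toInstant()` and back with `Date.from(instant)`, so the same helper could sit behind a `Date`-based getter.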

In Java 8, my first stab looks like this. I am aware that this solution is similar to this response in "Group by multiple field names in java 8".

Map<Date, Map<String, Map<String, Map<String, Map<String, Integer>>>>> aggregatedData =
webRecords
    .stream()
    .collect(Collectors.groupingBy(WebRecord::getFiveMinuteWindow,
               Collectors.groupingBy(WebRecord::getCdn,
                 Collectors.groupingBy(WebRecord::getIsp,
                   Collectors.groupingBy(WebRecord::getResultCode,
                       Collectors.groupingBy(WebRecord::getTxnTime,
                         Collectors.reducing(0,
                                             WebRecord::getReqBytes,
                                             Integer::sum)))))));

This works, but it's ugly; all those nested maps are a nightmare! To "flatten" or "unroll" the map into rows, I have to do this:

for (Date window : aggregatedData.keySet()) {
  for (String cdn : aggregatedData.get(window).keySet()) {
    for (String isp : aggregatedData.get(window).get(cdn).keySet()) {
      for (String resultCode : aggregatedData.get(window).get(cdn).get(isp).keySet()) {
        for (String txnTime : aggregatedData.get(window).get(cdn).get(isp).get(resultCode).keySet()) {

           Integer bytesTransferred = aggregatedData.get(window).get(cdn).get(isp).get(resultCode).get(txnTime);
           AggregatedRow row = new AggregatedRow(window, cdn, distId...

As you can see this is pretty messy and difficult to maintain.

Anyone have any ideas of a better way to do this? Any help would be greatly appreciated.

I'm wondering if there is a nicer way to unroll the nested maps, or if there is a library that allows you to do a GROUP BY on a collection.

Answered by Tagir Valeev

You should create the custom key for your map. The simplest way is to use Arrays.asList:

Function<WebRecord, List<Object>> keyExtractor = wr ->
    Arrays.<Object>asList(wr.getFiveMinuteWindow(), wr.getCdn(), wr.getIsp(),
             wr.getResultCode(), wr.getTxnTime());
Map<List<Object>, Integer> aggregatedData = webRecords.stream().collect(
      Collectors.groupingBy(keyExtractor, Collectors.summingInt(WebRecord::getReqBytes)));
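
Because the result is a single flat map, unrolling it into rows takes one loop over the entries instead of five nested ones. A minimal, self-contained sketch follows. The `WebRecord` class here is a hypothetical stand-in with the getters used above (all fields simplified to `String`/`int`, which is an assumption), and rows are plain comma-separated strings because the `AggregatedRow` constructor is not shown in the question:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ListKeyGrouping {
    // Hypothetical stand-in for the question's WebRecord; all field types are assumptions.
    static class WebRecord {
        private final String window, cdn, isp, resultCode, txnTime;
        private final int reqBytes;

        WebRecord(String window, String cdn, String isp,
                  String resultCode, String txnTime, int reqBytes) {
            this.window = window; this.cdn = cdn; this.isp = isp;
            this.resultCode = resultCode; this.txnTime = txnTime; this.reqBytes = reqBytes;
        }
        String getFiveMinuteWindow() { return window; }
        String getCdn() { return cdn; }
        String getIsp() { return isp; }
        String getResultCode() { return resultCode; }
        String getTxnTime() { return txnTime; }
        int getReqBytes() { return reqBytes; }
    }

    // Same grouping as in the answer: one List key per combination of the five fields.
    static Map<List<Object>, Integer> aggregate(List<WebRecord> webRecords) {
        Function<WebRecord, List<Object>> keyExtractor = wr ->
            Arrays.<Object>asList(wr.getFiveMinuteWindow(), wr.getCdn(), wr.getIsp(),
                                  wr.getResultCode(), wr.getTxnTime());
        return webRecords.stream().collect(
            Collectors.groupingBy(keyExtractor, Collectors.summingInt(WebRecord::getReqBytes)));
    }

    // One pass over the entries: key fields first, summed bytes last.
    static List<String> toRows(Map<List<Object>, Integer> aggregatedData) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<List<Object>, Integer> entry : aggregatedData.entrySet()) {
            String keyPart = entry.getKey().stream()
                    .map(field -> String.valueOf(field))
                    .collect(Collectors.joining(","));
            rows.add(keyPart + "," + entry.getValue());
        }
        return rows;
    }

    public static void main(String[] args) {
        List<WebRecord> records = Arrays.asList(
            new WebRecord("20:10", "cdnA", "ispX", "200", "fast", 100),
            new WebRecord("20:10", "cdnA", "ispX", "200", "fast", 250),
            new WebRecord("20:15", "cdnB", "ispY", "404", "slow", 40));
        // Row order depends on HashMap iteration order and is not guaranteed.
        toRows(aggregate(records)).forEach(System.out::println);
    }
}
```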

In this case the keys are lists of 5 elements in a fixed order. Not quite object-oriented, but simple. Alternatively, you can define your own type that represents the custom key and give it proper hashCode/equals implementations.
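
A dedicated key type along those lines might look like the sketch below. `AggregationKey` is a hypothetical name, and `WebRecord` is again a stand-in whose field types are assumptions. The important part is that `equals` and `hashCode` cover exactly the five grouping fields, so records that agree on all five land in the same group:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

class CustomKeyGrouping {
    // Hypothetical stand-in for the question's WebRecord; all field types are assumptions.
    static class WebRecord {
        private final String window, cdn, isp, resultCode, txnTime;
        private final int reqBytes;

        WebRecord(String window, String cdn, String isp,
                  String resultCode, String txnTime, int reqBytes) {
            this.window = window; this.cdn = cdn; this.isp = isp;
            this.resultCode = resultCode; this.txnTime = txnTime; this.reqBytes = reqBytes;
        }
        String getFiveMinuteWindow() { return window; }
        String getCdn() { return cdn; }
        String getIsp() { return isp; }
        String getResultCode() { return resultCode; }
        String getTxnTime() { return txnTime; }
        int getReqBytes() { return reqBytes; }
    }

    // Hypothetical immutable key: equality is defined over all five grouping fields.
    static final class AggregationKey {
        final String window, cdn, isp, resultCode, txnTime;

        AggregationKey(WebRecord wr) {
            this.window = wr.getFiveMinuteWindow();
            this.cdn = wr.getCdn();
            this.isp = wr.getIsp();
            this.resultCode = wr.getResultCode();
            this.txnTime = wr.getTxnTime();
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof AggregationKey)) return false;
            AggregationKey other = (AggregationKey) o;
            return Objects.equals(window, other.window)
                && Objects.equals(cdn, other.cdn)
                && Objects.equals(isp, other.isp)
                && Objects.equals(resultCode, other.resultCode)
                && Objects.equals(txnTime, other.txnTime);
        }

        @Override
        public int hashCode() {
            return Objects.hash(window, cdn, isp, resultCode, txnTime);
        }
    }

    // Groups by the typed key and sums the bytes per group.
    static Map<AggregationKey, Integer> aggregate(List<WebRecord> webRecords) {
        return webRecords.stream().collect(
            Collectors.groupingBy(AggregationKey::new,
                                  Collectors.summingInt(WebRecord::getReqBytes)));
    }

    public static void main(String[] args) {
        List<WebRecord> records = Arrays.asList(
            new WebRecord("20:10", "cdnA", "ispX", "200", "fast", 100),
            new WebRecord("20:10", "cdnA", "ispX", "200", "fast", 250),
            new WebRecord("20:15", "cdnB", "ispY", "404", "slow", 40));
        // Typed fields are available on each key, with no casts from Object.
        aggregate(records).forEach((key, bytes) ->
            System.out.println(key.window + " " + key.cdn + " " + bytes));
    }
}
```

Compared with the list key, this costs more boilerplate but gives each field a name and a type when the rows are built.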
