Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/32071726/

Time: 2020-11-02 19:35:54  Source: igfitidea

Java 8 Stream: groupingBy with multiple Collectors

java, java-8, java-stream

Asked by PhilippS

I want to use a Java 8 Stream and group by one classifier, but with multiple Collector functions. So when grouping, for example, both the average and the sum of one field (or maybe another field) are calculated.

I'll try to simplify this a bit with an example:

public void test() {
    List<Person> persons = new ArrayList<>();
    persons.add(new Person("Person One", 1, 18));
    persons.add(new Person("Person Two", 1, 20));
    persons.add(new Person("Person Three", 1, 30));
    persons.add(new Person("Person Four", 2, 30));
    persons.add(new Person("Person Five", 2, 29));
    persons.add(new Person("Person Six", 3, 18));

    Map<Integer, Data> result = persons.stream().collect(
            groupingBy(person -> person.group, multiCollector)
    );
}

class Person {
    String name;
    int group;
    int age;

    // Constructor, getter and setter
}

class Data {
    long average;
    long sum;

    public Data(long average, long sum) {
        this.average = average;
        this.sum = sum;
    }

    // Getter and setter
}

The result should be a Map that associates each group with its aggregated result, like:

1 => Data(average(18, 20, 30), sum(18, 20, 30))
2 => Data(average(30, 29), sum(30, 29))
3 => ....

This works perfectly fine with one function like "Collectors.counting()", but I would like to chain more than one (ideally an arbitrary number, taken from a List).

List<Collector<Person, ?, ?>>

Is it possible to do something like this?

Answered by Tagir Valeev

For the concrete problem of summing and averaging, use collectingAndThen along with summarizingDouble:

Map<Integer, Data> result = persons.stream().collect(
        groupingBy(Person::getGroup, 
                collectingAndThen(summarizingDouble(Person::getAge), 
                        dss -> new Data((long)dss.getAverage(), (long)dss.getSum()))));
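
As a quick check of the snippet above against the question's sample data, here is a minimal self-contained sketch. The class name SummarizingExample, the helpers samplePersons and summarize, and the Person/Data constructors and getters are assumptions filled in for illustration; they are not part of the original answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SummarizingExample {
    // Assumed shape of the question's Person class (constructor and getters filled in).
    static class Person {
        final String name;
        final int group;
        final int age;
        Person(String name, int group, int age) {
            this.name = name;
            this.group = group;
            this.age = age;
        }
        int getGroup() { return group; }
        int getAge() { return age; }
    }

    // Assumed shape of the question's Data class.
    static class Data {
        final long average;
        final long sum;
        Data(long average, long sum) {
            this.average = average;
            this.sum = sum;
        }
    }

    // The sample data from the question.
    static List<Person> samplePersons() {
        return Arrays.asList(
                new Person("Person One", 1, 18),
                new Person("Person Two", 1, 20),
                new Person("Person Three", 1, 30),
                new Person("Person Four", 2, 30),
                new Person("Person Five", 2, 29),
                new Person("Person Six", 3, 18));
    }

    // Group by Person.group, then turn each group's DoubleSummaryStatistics into a Data.
    static Map<Integer, Data> summarize(List<Person> persons) {
        return persons.stream().collect(
                Collectors.groupingBy(Person::getGroup,
                        Collectors.collectingAndThen(
                                Collectors.summarizingDouble(Person::getAge),
                                dss -> new Data((long) dss.getAverage(), (long) dss.getSum()))));
    }

    public static void main(String[] args) {
        summarize(samplePersons()).forEach((group, data) ->
                System.out.println(group + " => average=" + data.average + ", sum=" + data.sum));
    }
}
```

Note that the (long) cast truncates: for group 1 the sum is 68 and the average 68 / 3 (about 22.67) is stored as 22.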

For the more generic problem (collect various things about your Persons), you can create a complex collector like this:

// Individual collectors are defined here
List<Collector<Person, ?, ?>> collectors = Arrays.asList(
        Collectors.averagingInt(Person::getAge),
        Collectors.summingInt(Person::getAge));

@SuppressWarnings("unchecked")
Collector<Person, List<Object>, List<Object>> complexCollector = Collector.of(
    () -> collectors.stream().map(Collector::supplier)
        .map(Supplier::get).collect(toList()),
    (list, e) -> IntStream.range(0, collectors.size()).forEach(
        i -> ((BiConsumer<Object, Person>) collectors.get(i).accumulator()).accept(list.get(i), e)),
    (l1, l2) -> {
        IntStream.range(0, collectors.size()).forEach(
            i -> l1.set(i, ((BinaryOperator<Object>) collectors.get(i).combiner()).apply(l1.get(i), l2.get(i))));
        return l1;
    },
    list -> {
        IntStream.range(0, collectors.size()).forEach(
            i -> list.set(i, ((Function<Object, Object>)collectors.get(i).finisher()).apply(list.get(i))));
        return list;
    });

Map<Integer, List<Object>> result = persons.stream().collect(
        groupingBy(Person::getGroup, complexCollector)); 

Map values are lists where the first element is the result of applying the first collector, and so on. You can add a custom finisher step using Collectors.collectingAndThen(complexCollector, list -> ...) to convert this list to something more appropriate.
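
The finisher step mentioned above could look roughly like the following sketch, which converts each group's List<Object> into a Data. The class name ListCollectorExample, the generic helper combine (a loop-based restatement of the complex collector), and the Person/Data members are assumptions for illustration. The casts rely on averagingInt producing a Double and summingInt producing an Integer, in the order the collectors were listed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class ListCollectorExample {
    static class Person {
        final String name;
        final int group;
        final int age;
        Person(String name, int group, int age) {
            this.name = name;
            this.group = group;
            this.age = age;
        }
        int getGroup() { return group; }
        int getAge() { return age; }
    }

    static class Data {
        final long average;
        final long sum;
        Data(long average, long sum) {
            this.average = average;
            this.sum = sum;
        }
    }

    // Hypothetical helper: runs every collector side by side, one list slot per collector.
    @SuppressWarnings("unchecked")
    static <T> Collector<T, List<Object>, List<Object>> combine(List<Collector<T, ?, ?>> collectors) {
        return Collector.of(
                () -> {
                    List<Object> containers = new ArrayList<>();
                    for (Collector<T, ?, ?> c : collectors) {
                        containers.add(c.supplier().get());
                    }
                    return containers;
                },
                (list, e) -> {
                    for (int i = 0; i < collectors.size(); i++) {
                        ((BiConsumer<Object, T>) collectors.get(i).accumulator()).accept(list.get(i), e);
                    }
                },
                (l1, l2) -> {
                    for (int i = 0; i < collectors.size(); i++) {
                        l1.set(i, ((BinaryOperator<Object>) collectors.get(i).combiner())
                                .apply(l1.get(i), l2.get(i)));
                    }
                    return l1;
                },
                list -> {
                    for (int i = 0; i < collectors.size(); i++) {
                        list.set(i, ((Function<Object, Object>) collectors.get(i).finisher())
                                .apply(list.get(i)));
                    }
                    return list;
                });
    }

    static Map<Integer, Data> summarize(List<Person> persons) {
        List<Collector<Person, ?, ?>> collectors = Arrays.asList(
                Collectors.averagingInt(Person::getAge),   // slot 0: Double
                Collectors.summingInt(Person::getAge));    // slot 1: Integer
        Collector<Person, List<Object>, List<Object>> combined = combine(collectors);
        // Custom finisher: unpack the list into a typed Data object.
        Collector<Person, ?, Data> toData = Collectors.collectingAndThen(combined,
                list -> new Data(((Double) list.get(0)).longValue(),
                                 ((Integer) list.get(1)).longValue()));
        return persons.stream().collect(Collectors.groupingBy(Person::getGroup, toData));
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Person One", 1, 18), new Person("Person Two", 1, 20),
                new Person("Person Three", 1, 30), new Person("Person Four", 2, 30),
                new Person("Person Five", 2, 29), new Person("Person Six", 3, 18));
        summarize(persons).forEach((g, d) ->
                System.out.println(g + " => average=" + d.average + ", sum=" + d.sum));
    }
}
```

The positional casts are the price of the untyped list: reordering the collectors list silently breaks the finisher, so keeping both in one method is a deliberate choice in this sketch.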

Answered by WillShackleford

By using a map as the output type, one could have a potentially unbounded list of reducers, each producing its own statistic and adding it to the map.

public static <K, V> Map<K, V> addMap(Map<K, V> map, K k, V v) {
    Map<K, V> mapout = new HashMap<K, V>();
    mapout.putAll(map);
    mapout.put(k, v);
    return mapout;
}

...

...

    List<Person> persons = new ArrayList<>();
    persons.add(new Person("Person One", 1, 18));
    persons.add(new Person("Person Two", 1, 20));
    persons.add(new Person("Person Three", 1, 30));
    persons.add(new Person("Person Four", 2, 30));
    persons.add(new Person("Person Five", 2, 29));
    persons.add(new Person("Person Six", 3, 18));

    List<BiFunction<Map<String, Integer>, Person, Map<String, Integer>>> listOfReducers = new ArrayList<>();

    listOfReducers.add((m, p) -> addMap(m, "Count", Optional.ofNullable(m.get("Count")).orElse(0) + 1));
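    // Note: p.i1 is a field of the answerer's own Person class (not shown here); the output
    // below (Sum=10) matches summing the second constructor argument of the sample data.
    listOfReducers.add((m, p) -> addMap(m, "Sum", Optional.ofNullable(m.get("Sum")).orElse(0) + p.i1));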

    BiFunction<Map<String, Integer>, Person, Map<String, Integer>> applyList
            = (mapin, p) -> {
                Map<String, Integer> mapout = mapin;
                for (BiFunction<Map<String, Integer>, Person, Map<String, Integer>> f : listOfReducers) {
                    mapout = f.apply(mapout, p);
                }
                return mapout;
            };
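    // Combiner (only used for parallel streams): add values per key instead of letting
    // map2 overwrite map1, so partial counts and sums are merged correctly.
    BinaryOperator<Map<String, Integer>> combineMaps
            = (map1, map2) -> {
                Map<String, Integer> mapout = new HashMap<>(map1);
                map2.forEach((k, v) -> mapout.merge(k, v, Integer::sum));
                return mapout;
            };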
    Map<String, Integer> map
            = persons
            .stream()
            .reduce(new HashMap<String, Integer>(),
                    applyList, combineMaps);
    System.out.println("map = " + map);

Produces:

map = {Sum=10, Count=6}

Answered by Peter Lawrey

You could chain them.

A collector can only produce one object, but this object can hold multiple values. For example, you could return a Map that has an entry for each collector whose result you are returning.

You can use Collector.of(HashMap::new, accumulator, combiner);

Your accumulator would have a Map of Collectors, where the keys of the Map produced match the names of the Collectors. The combiner would need a way to combine multiple results, especially when this is performed in parallel.
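
A simplified sketch of this idea follows. Instead of delegating to named Collectors, the accumulator here writes two fixed statistics ("Count" and "Sum") straight into the map; the key names, the class name MapCollectorExample, the helpers statsCollector and statsByGroup, and the Person members are all assumptions for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class MapCollectorExample {
    static class Person {
        final String name;
        final int group;
        final int age;
        Person(String name, int group, int age) {
            this.name = name;
            this.group = group;
            this.age = age;
        }
        int getGroup() { return group; }
        int getAge() { return age; }
    }

    // One collector, one result object (a Map) holding several values.
    static Collector<Person, Map<String, Integer>, Map<String, Integer>> statsCollector() {
        return Collector.of(
                HashMap::new,
                // Accumulator: update every statistic for each incoming person.
                (map, p) -> {
                    map.merge("Count", 1, Integer::sum);
                    map.merge("Sum", p.getAge(), Integer::sum);
                },
                // Combiner: both statistics are additive, so partial maps merge by summing.
                (left, right) -> {
                    right.forEach((k, v) -> left.merge(k, v, Integer::sum));
                    return left;
                });
    }

    static Map<Integer, Map<String, Integer>> statsByGroup(List<Person> persons) {
        return persons.stream().collect(
                Collectors.groupingBy(Person::getGroup, statsCollector()));
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Person One", 1, 18), new Person("Person Two", 1, 20),
                new Person("Person Three", 1, 30), new Person("Person Four", 2, 30),
                new Person("Person Five", 2, 29), new Person("Person Six", 3, 18));
        System.out.println(statsByGroup(persons));
    }
}
```

The combiner only works this simply because both statistics are additive; a statistic such as an average would need to keep its count and sum separately and divide in a finisher.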



Generally, the built-in collectors use a dedicated data type for complex results.

From Collectors:

public static <T>
Collector<T, ?, DoubleSummaryStatistics> summarizingDouble(ToDoubleFunction<? super T> mapper) {
    return new CollectorImpl<T, DoubleSummaryStatistics, DoubleSummaryStatistics>(
            DoubleSummaryStatistics::new,
            (r, t) -> r.accept(mapper.applyAsDouble(t)),
            (l, r) -> { l.combine(r); return l; }, CH_ID);
}

and in its own class:

public class DoubleSummaryStatistics implements DoubleConsumer {
    private long count;
    private double sum;
    private double sumCompensation; // Low order bits of sum
    private double simpleSum; // Used to compute right sum for non-finite inputs
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;

Answered by Marko Topolnik

Instead of chaining the collectors, you should build an abstraction which is an aggregator of collectors: implement the Collector interface with a class which accepts a list of collectors and delegates each method invocation to each of them. Then, in the end, you return new Data() with all the results the nested collectors produced.

You can avoid creating a custom class with all the method declarations by making use of Collector.of(supplier, accumulator, combiner, finisher, Collector.Characteristics... characteristics). The finisher lambda will call the finisher of each nested collector, then return the Data instance.
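
One way to sketch this aggregator for exactly two nested collectors is shown below. The helper name both, the local Pair holder, the class name AggregatorExample, and the Person/Data members are assumptions for illustration; a list-based variant would follow the same pattern with one slot per collector.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class AggregatorExample {
    static class Person {
        final String name;
        final int group;
        final int age;
        Person(String name, int group, int age) {
            this.name = name;
            this.group = group;
            this.age = age;
        }
        int getGroup() { return group; }
        int getAge() { return age; }
    }

    static class Data {
        final long average;
        final long sum;
        Data(long average, long sum) {
            this.average = average;
            this.sum = sum;
        }
    }

    // Hypothetical helper: delegates every Collector operation to both nested collectors.
    static <T, A1, A2, R1, R2, R> Collector<T, ?, R> both(
            Collector<T, A1, R1> first,
            Collector<T, A2, R2> second,
            BiFunction<R1, R2, R> merger) {
        // Holds the intermediate container of each nested collector.
        class Pair {
            A1 a1 = first.supplier().get();
            A2 a2 = second.supplier().get();
        }
        return Collector.<T, Pair, R>of(
                Pair::new,
                (pair, t) -> {
                    first.accumulator().accept(pair.a1, t);
                    second.accumulator().accept(pair.a2, t);
                },
                (p, q) -> {
                    p.a1 = first.combiner().apply(p.a1, q.a1);
                    p.a2 = second.combiner().apply(p.a2, q.a2);
                    return p;
                },
                // Finisher: finish each nested collector, then build the final result.
                pair -> merger.apply(first.finisher().apply(pair.a1),
                                     second.finisher().apply(pair.a2)));
    }

    static Map<Integer, Data> summarize(List<Person> persons) {
        Collector<Person, ?, Data> toData = both(
                Collectors.averagingInt(Person::getAge),
                Collectors.summingInt(Person::getAge),
                (avg, sum) -> new Data(avg.longValue(), sum.longValue()));
        return persons.stream().collect(Collectors.groupingBy(Person::getGroup, toData));
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Person One", 1, 18), new Person("Person Two", 1, 20),
                new Person("Person Three", 1, 30), new Person("Person Four", 2, 30),
                new Person("Person Five", 2, 29), new Person("Person Six", 3, 18));
        summarize(persons).forEach((g, d) ->
                System.out.println(g + " => average=" + d.average + ", sum=" + d.sum));
    }
}
```

Unlike an untyped list of results, this version keeps the result types of both nested collectors, so the merger needs no casts. On Java 12 and later, Collectors.teeing offers the same two-collector combination out of the box.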