Note: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/39495347/

Time: 2020-11-03 04:25:13  Source: igfitidea

What's the better way to add elements from a Stream to an existing List?

Tags: java, collections, java-8, java-stream

Asked by ahelix

I have to write some code that adds the content of a Java 8 Stream to a List multiple times, and I'm having trouble figuring out what's the best way to do it. Based on what I read on SO (mainly this question: How to add elements of a Java8 stream into an existing List) and elsewhere, I've narrowed it down to the following options:

import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class Accumulator<S, T> {

    private final Function<S, T> transformation;
    private final List<T> internalList = new ArrayList<>();

    public Accumulator(Function<S, T> transformation) {
        this.transformation = transformation;
    }

    public void option1(List<S> newBatch) {
        internalList.addAll(newBatch.stream().map(transformation).collect(Collectors.toList()));
    }

    public void option2(List<S> newBatch) {
        newBatch.stream().map(transformation).forEach(internalList::add);
    }
}

The idea is that the methods would be called multiple times for the same instance of Accumulator. The choice is between using an intermediate list and calling Collection.addAll() once outside of the stream, or calling Collection.add() from the stream for each element.

I would tend to prefer option 2, which is more in the spirit of functional programming and avoids creating an intermediate list. However, there might be benefits to calling addAll() once instead of calling add() n times when n is large.

Is one of the two options significantly better than the other?

EDIT: JB Nizet has a very cool answer that delays the transformation until all batches have been added. In my case, the transformation has to be performed straight away.

PS: In my example code, I've used transformation as a placeholder for whatever operations need to be performed on the stream.

Answered by JB Nizet

The best solution would be a third one, avoiding that internal list completely. Just let the stream create the final list for you:

Assuming you have a List<List<S>> containing your N batches, on which the same transformation must be applied, you would do:

List<T> result = 
    batches.stream()
           .flatMap(batch -> batch.stream())
           .map(transformation)
           .collect(Collectors.toList());
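
For illustration, a minimal self-contained sketch of this approach might look as follows. The class name FlatMapBatches, the helper transformAll, and the sample data are made up for the example and are not part of the original answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class FlatMapBatches {

    // Flattens all batches into one stream, transforms each element,
    // and collects the results into a single new list.
    static <S, T> List<T> transformAll(List<List<S>> batches, Function<S, T> transformation) {
        return batches.stream()
                      .flatMap(List::stream)
                      .map(transformation)
                      .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical sample data: two batches of strings, mapped to their lengths.
        List<List<String>> batches = Arrays.asList(
                Arrays.asList("a", "bb"),
                Arrays.asList("ccc"));
        List<Integer> lengths = transformAll(batches, String::length);
        System.out.println(lengths); // prints [1, 2, 3]
    }
}
```

Note that this only fits when all batches are available before the result is needed, since nothing is transformed until the terminal collect runs.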

Answered by Holger

First of all, your second variant should be:

public void option2(List<S> newBatch) {
  newBatch.stream().map(transformation).forEachOrdered(internalList::add);
}

to be correct.
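
The reason is presumably that forEach makes no ordering guarantee and may invoke the action concurrently on a parallel stream, whereas forEachOrdered processes elements one at a time in encounter order. A stream obtained from List.stream() is sequential, so both would likely behave the same here in practice; the sketch below uses a deliberately parallel stream to show that forEachOrdered still fills a plain ArrayList in order. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

class ForEachOrderedDemo {

    // Appends the squares of 0..n-1 to a plain ArrayList from a parallel stream.
    // forEachOrdered runs the action for one element at a time, in encounter
    // order, so the non-thread-safe ArrayList is filled safely and in order.
    static List<Integer> squaresInOrder(int n) {
        List<Integer> target = new ArrayList<>();
        IntStream.range(0, n)
                 .parallel()
                 .map(i -> i * i)
                 .boxed()
                 .forEachOrdered(target::add);
        return target;
    }

    public static void main(String[] args) {
        System.out.println(squaresInOrder(5)); // prints [0, 1, 4, 9, 16]
    }
}
```

Replacing forEachOrdered with forEach in this sketch would make the result unpredictable, because ArrayList.add is not safe for concurrent use.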

Besides that, the advantage of addAll in

public void option1(List<S> newBatch) {
  internalList.addAll(newBatch.stream().map(transformation).collect(Collectors.toList()));
}

is moot, as the Collector API does not allow the Stream to provide hints about the expected size to the Collector and requires the Stream to evaluate the accumulator function for every element, which is nothing other than ArrayList::add in the current implementation.

So before this approach could get any benefit from addAll, it has already filled an ArrayList by repeatedly calling add on an ArrayList, including potential capacity increase operations. So you can stay with option2 without regret.
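
If the repeated capacity increases are a concern, one possible variation (not from the original answer, and an unverified assumption that it helps at all) is to reserve room before each batch with ArrayList.ensureCapacity, since map produces exactly one output element per input element. The sketch below assumes the internal list is declared as ArrayList rather than List; the class name PresizedAccumulator and the contents method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class PresizedAccumulator<S, T> {
    private final Function<S, T> transformation;
    // Declared as ArrayList (not List) so that ensureCapacity is accessible.
    private final ArrayList<T> internalList = new ArrayList<>();

    public PresizedAccumulator(Function<S, T> transformation) {
        this.transformation = transformation;
    }

    public void addBatch(List<S> newBatch) {
        // map is one-to-one, so exactly newBatch.size() elements will be appended.
        internalList.ensureCapacity(internalList.size() + newBatch.size());
        newBatch.stream().map(transformation).forEachOrdered(internalList::add);
    }

    // Hypothetical accessor returning a copy of the accumulated elements.
    public List<T> contents() {
        return new ArrayList<>(internalList);
    }

    public static void main(String[] args) {
        PresizedAccumulator<String, Integer> acc = new PresizedAccumulator<>(String::length);
        acc.addBatch(Arrays.asList("a", "bb"));
        acc.addBatch(Arrays.asList("ccc"));
        System.out.println(acc.contents()); // prints [1, 2, 3]
    }
}
```

Whether this makes a measurable difference would need benchmarking.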

An alternative is to use a stream builder for temporary collections:

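import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class Accumulator<S, T> {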
    private final Function<S, T> transformation;
    private final Stream.Builder<T> internal = Stream.builder();

    public Accumulator(Function<S, T> transformation) {
        this.transformation = transformation;
    }

    public void addBatch(List<S> newBatch) {
        newBatch.stream().map(transformation).forEachOrdered(internal);
    }

    public List<T> finish() {
        return internal.build().collect(Collectors.toList());
    }
}

The stream builder uses a spined buffer, which does not require copying the contents when increasing its capacity, but the solution still suffers from the fact that the final collection step involves filling an ArrayList without an appropriate initial capacity (in the current implementation).

With the current implementation, it's far more efficient to implement the finishing step as:

public List<T> finish() {
    return Arrays.asList(internal.build().toArray(…));
}

But this requires either an IntFunction<T[]> provided by the caller (as we can't do that for a generic array type), or performing an unchecked operation (pretending an Object[] to be a T[], which would be OK here, but is still a nasty unchecked operation).
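
As a sketch of the first alternative, the accumulator could take the array factory as a second constructor argument. The names ArrayBackedAccumulator and arrayFactory are hypothetical, and callers would pass something like Integer[]::new.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.IntFunction;
import java.util.stream.Stream;

class ArrayBackedAccumulator<S, T> {
    private final Function<S, T> transformation;
    // Supplied by the caller because a generic T[] cannot be created here.
    private final IntFunction<T[]> arrayFactory;
    private final Stream.Builder<T> internal = Stream.builder();

    public ArrayBackedAccumulator(Function<S, T> transformation, IntFunction<T[]> arrayFactory) {
        this.transformation = transformation;
        this.arrayFactory = arrayFactory;
    }

    public void addBatch(List<S> newBatch) {
        // Stream.Builder is a Consumer, so it can be passed to forEachOrdered directly.
        newBatch.stream().map(transformation).forEachOrdered(internal);
    }

    public List<T> finish() {
        // Collect into an array created by the caller's factory,
        // then wrap it as a fixed-size list.
        return Arrays.asList(internal.build().toArray(arrayFactory));
    }

    public static void main(String[] args) {
        ArrayBackedAccumulator<String, Integer> acc =
                new ArrayBackedAccumulator<>(String::length, Integer[]::new);
        acc.addBatch(Arrays.asList("a", "bb"));
        acc.addBatch(Arrays.asList("ccc"));
        System.out.println(acc.finish()); // prints [1, 2, 3]
    }
}
```

Arrays.asList returns a fixed-size list backed by the array, and a Stream.Builder can be built only once, so finish() should be called a single time after the last batch.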
