Collecting from a parallel stream in Java 8
Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/44083445/
Asked by Vipul Goyal
I want to take an input, apply a parallel stream to it, and get the output as a list. The input could be any List or any other collection that supports streams.
My concern is that if we want the output as a map, Java gives us an option like
list.parallelStream().collect(Collectors.toConcurrentMap(args))
But I cannot see any option for collecting from a parallel stream in a thread-safe way that produces a list as output. I do see one more option:
list.parallelStream().collect(Collectors.toCollection(<Concurrent Implementation>))
This way we can pass various concurrent implementations to the collect method. But I think CopyOnWriteArrayList is the only List implementation in java.util.concurrent. We could use the various queue implementations here, but those do not behave like a list. What I mean is that we could only get a list through a workaround.
Could you please tell me the best approach if I want the output as a list?
Note: I could not find any other post related to this, so any reference would be helpful.
Answered by Andreas
The Collection object used to receive the data being collected does not need to be concurrent. You can give it a simple ArrayList.
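As a minimal sketch of that point, the following collects a parallel stream into a plain ArrayList through Collectors.toCollection. The class name PlainListCollect, the method name doubled, and the doubling step are illustrative assumptions, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PlainListCollect {
    // Collects a parallel stream into a plain, non-concurrent ArrayList.
    static List<Integer> doubled(List<Integer> input) {
        return input.parallelStream()
                .map(x -> x * 2)
                .collect(Collectors.toCollection(ArrayList::new));
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        // For an ordered source such as a List, the result keeps encounter order.
        System.out.println(doubled(input));
    }
}
```

No synchronization is written anywhere in this code; the stream framework takes care of giving each worker its own container.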
That is because the values from a parallel stream are not actually collected into a single Collection object. Each thread collects its own data, and then all sub-results are merged into a single final Collection object.
This is all well-documented in the Collector javadoc, and the Collector is the parameter you're giving to the collect() method:
<R,A> R collect(Collector<? super T,A,R> collector)
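To make the pieces visible, here is a rough sketch that uses the other collect overload on Stream, the one taking a supplier, an accumulator, and a combiner directly instead of a Collector. The class name ThreeArgCollect and the sample words are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ThreeArgCollect {
    // Spells out the three functions that a list-producing Collector bundles together.
    static List<String> toList(List<String> words) {
        return words.parallelStream().<ArrayList<String>>collect(
                ArrayList::new,      // supplier: a fresh container for each chunk of work
                ArrayList::add,      // accumulator: adds one element to a chunk's own container
                ArrayList::addAll);  // combiner: merges the second container into the first
    }

    public static void main(String[] args) {
        System.out.println(toList(Arrays.asList("a", "b", "c", "d")));
    }
}
```

Each ArrayList is only ever touched by one thread at a time, which is why a non-concurrent container is enough here.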
Answered by Eugene
"But there is no option that I can see to collect from parallel stream in thread safe way to provide list as output." This is entirely wrong.
The whole point of streams is that you can use a non-thread-safe Collection and still get perfectly valid, thread-safe results. This follows from how streams are implemented (and it was a key part of their design). You can see that a Collector defines a supplier method that creates a new container instance for each chunk of work. Those instances are then merged with each other.
So this is perfectly thread safe:
Stream.of(1,2,3,4).parallel()
.collect(Collectors.toList());
Since there are 4 elements in this stream, 4 instances of ArrayList will be created and then merged at the end into a single result (assuming at least 4 CPU cores).
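One way to observe this is to count how often the supplier runs. The sketch below builds a hypothetical collector with Collector.of whose supplier increments a counter. The names ContainerCount and collectCounting are assumptions, and the exact count depends on how the runtime splits the work, so no specific number is asserted here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collector;
import java.util.stream.Stream;

public class ContainerCount {
    // Collects into plain ArrayLists and counts how many containers the supplier creates.
    static List<Integer> collectCounting(Stream<Integer> stream, AtomicInteger created) {
        Collector<Integer, List<Integer>, List<Integer>> counting = Collector.of(
                () -> { created.incrementAndGet(); return new ArrayList<>(); },
                List::add,
                (left, right) -> { left.addAll(right); return left; });
        return stream.collect(counting);
    }

    public static void main(String[] args) {
        AtomicInteger created = new AtomicInteger();
        List<Integer> result = collectCounting(Stream.of(1, 2, 3, 4).parallel(), created);
        System.out.println(result + " built from " + created.get() + " container(s)");
    }
}
```

Whatever the count turns out to be, the merged list contains all elements in encounter order, because the combiner always appends the right-hand container to the left-hand one.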
On the other hand, the toConcurrent collectors (such as toConcurrentMap) generate a single result container, and all threads put their results into it.
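The difference shows up in the characteristics each collector reports. As a small sketch (the class name CollectorKinds, the helper isConcurrent, and the "v" + n value mapping are illustrative assumptions), toConcurrentMap declares itself CONCURRENT while toList does not:

```java
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CollectorKinds {
    // True if the collector declares that one shared container may be filled by many threads.
    static boolean isConcurrent(Collector<?, ?, ?> collector) {
        return collector.characteristics().contains(Collector.Characteristics.CONCURRENT);
    }

    // Collects into a single shared ConcurrentMap, keyed by the number itself.
    static ConcurrentMap<Integer, String> toSharedMap(Stream<Integer> numbers) {
        Collector<Integer, ?, ConcurrentMap<Integer, String>> toMap =
                Collectors.toConcurrentMap(n -> n, n -> "v" + n);
        return numbers.parallel().collect(toMap);
    }

    public static void main(String[] args) {
        System.out.println(isConcurrent(Collectors.toList()));                       // false
        System.out.println(isConcurrent(Collectors.toConcurrentMap(n -> n, n -> n))); // true
        System.out.println(toSharedMap(Stream.of(1, 2, 3, 4)).get(3));               // v3
    }
}
```

With a CONCURRENT collector the container itself must be thread-safe, which is why these collectors are built on ConcurrentMap rather than on a plain HashMap.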