Java: parallelStream vs stream.parallel
Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/43811182/
Asked by Nick Clark
I have been curious about the difference between Collection.parallelStream() and Collection.stream().parallel(). According to the Javadocs, parallelStream() tries to return a parallel stream, whereas stream().parallel() returns a parallel stream. In my own testing I have found no differences. Where does the difference between these two methods lie? Is one implementation more time-efficient than the other? Thanks.
Accepted answer by Eugene
Even if they act the same at the moment, there is a difference - at least in their documentation, as you correctly pointed out; that might be exploited in the future as far as I can tell.
At the moment the parallelStream method is defined in the Collection interface as:
default Stream<E> parallelStream() {
    return StreamSupport.stream(spliterator(), true);
}
Being a default method, it can be overridden in implementations (and that is what the inner classes of Collections actually do).
That hints that even if the default method returns a parallel Stream, there could be collections that override this method to return a non-parallel Stream. That is probably the reason the documentation is worded the way it is.
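To make the point about overriding concrete, here is a minimal sketch. The class SequentialOnlyCollection is hypothetical and invented purely for illustration; it is not taken from the JDK or from the answer. It overrides parallelStream() to hand back a sequential stream, which the Collection contract allows, and the demo then checks isParallel() on each variant.

```java
import java.util.AbstractCollection;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical collection (an assumption for illustration): it opts out of parallelism.
class SequentialOnlyCollection<E> extends AbstractCollection<E> {
    private final List<E> items;

    SequentialOnlyCollection(List<E> items) {
        this.items = items;
    }

    @Override
    public Iterator<E> iterator() {
        return items.iterator();
    }

    @Override
    public int size() {
        return items.size();
    }

    // Allowed by the Collection contract: parallelStream() may return a sequential stream.
    @Override
    public Stream<E> parallelStream() {
        return stream();
    }
}

class ParallelStreamOverrideDemo {
    public static void main(String[] args) {
        SequentialOnlyCollection<Integer> c =
                new SequentialOnlyCollection<>(Arrays.asList(1, 2, 3));

        // The override returns a sequential stream.
        System.out.println(c.parallelStream().isParallel());            // false
        // stream().parallel() always switches the pipeline to parallel mode.
        System.out.println(c.stream().parallel().isParallel());         // true
        // Calling parallel() on the result of the override forces parallel mode as well.
        System.out.println(c.parallelStream().parallel().isParallel()); // true
    }
}
```

For a collection like this one, the two calls are observably different, which is the situation the Javadoc wording leaves room for.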
At the same time, even if parallelStream returns a sequential stream, it is still a Stream, and you could easily call parallel on it:
Collections.some()        // placeholder for any collection
    .parallelStream()     // actually sequential
    .parallel()           // force it to be parallel
At least for me, this looks weird.
It seems that the documentation should somehow state that after calling parallelStream there should be no reason to call parallel again to force parallelism, since doing so might be useless or even bad for the processing.
EDIT
For anyone reading this: please also read the comments by Holger; they cover cases beyond what I said in this answer.
Answer by Joe C
There is no difference between Collection.parallelStream() and Collection.stream().parallel(). They will both divide the stream to the extent that the underlying spliterator will allow, and they will both run using the default ForkJoinPool (unless already running inside another one).
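As a rough check of this claim for a standard collection, the sketch below compares the two forms on an ArrayList. The class and helper names (ParallelEquivalenceDemo, numbers) are made up for this example. It only inspects the isParallel() flag and compares results, so it says nothing about thread scheduling or timing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.StreamSupport;

class ParallelEquivalenceDemo {
    // Hypothetical helper: builds the list 1..n.
    static List<Integer> numbers(int n) {
        List<Integer> list = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> list = numbers(1000);

        // Both forms report a parallel pipeline for ArrayList.
        System.out.println(list.parallelStream().isParallel());    // true
        System.out.println(list.stream().parallel().isParallel()); // true

        // This is what the default Collection.parallelStream() delegates to.
        System.out.println(StreamSupport.stream(list.spliterator(), true).isParallel()); // true

        // Both forms compute the same result.
        long a = list.parallelStream().mapToLong(Integer::longValue).sum();
        long b = list.stream().parallel().mapToLong(Integer::longValue).sum();
        System.out.println(a == b); // true
    }
}
```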
Answer by Sagar Gangwal
import java.util.ArrayList;
import java.util.List;

class Employee {
    String name;
    int salary;

    public int getSalary() {
        return salary;
    }

    public void setSalary(int salary) {
        this.salary = salary;
    }

    public Employee(String name, int salary) {
        this.name = name;
        this.salary = salary;
    }
}

class ParallelStream {
    public static void main(String[] args) {
        long t1, t2;
        List<Employee> eList = new ArrayList<>();
        // 100 iterations x 6 employees = 600 records, 300 of them with salary > 15000
        for (int i = 0; i < 100; i++) {
            eList.add(new Employee("A", 20000));
            eList.add(new Employee("B", 3000));
            eList.add(new Employee("C", 15002));
            eList.add(new Employee("D", 7856));
            eList.add(new Employee("E", 200));
            eList.add(new Employee("F", 50000));
        }

        /***** Here We Are Creating A 'Sequential Stream' & Displaying The Result *****/
        t1 = System.currentTimeMillis();
        System.out.println("Sequential Stream Count?= " + eList.stream().filter(e -> e.getSalary() > 15000).count());
        t2 = System.currentTimeMillis();
        System.out.println("Sequential Stream Time Taken?= " + (t2 - t1) + "\n");

        /***** Here We Are Creating A 'Parallel Stream' & Displaying The Result *****/
        t1 = System.currentTimeMillis();
        System.out.println("Parallel Stream Count?= " + eList.parallelStream().filter(e -> e.getSalary() > 15000).count());
        t2 = System.currentTimeMillis();
        System.out.println("Parallel Stream Time Taken?= " + (t2 - t1));

        /***** Here We Are Creating A 'Parallel Stream with Collection.stream.parallel' & Displaying The Result *****/
        t1 = System.currentTimeMillis();
        System.out.println("stream().parallel() Count?= " + eList.stream().parallel().filter(e -> e.getSalary() > 15000).count());
        t2 = System.currentTimeMillis();
        System.out.println("stream().parallel() Time Taken?= " + (t2 - t1));
    }
}
I tried all three approaches, .stream(), .parallelStream() and .stream().parallel(), with the same number of records, and measured the time taken by each.
Here is the output:
Sequential Stream Count?= 300
Sequential Stream Time Taken?= 18
Parallel Stream Count?= 300
Parallel Stream Time Taken?= 6
stream().parallel() Count?= 300
stream().parallel() Time Taken?= 1
I am not sure why, but according to the output, stream().parallel() took 1/6th of the time taken by parallelStream().
Suggestions from experts are still most welcome.
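One likely explanation for the gap is measurement order rather than the API: each variant above is timed once, in a fixed order, on a cold JVM, so the earlier measurements plausibly absorb class loading and JIT warm-up. The sketch below is one way to reduce that effect, assuming a plain warm-up loop is good enough for a rough comparison. The names StreamTimingSketch, bestOfNanos and salaries are hypothetical, and a dedicated harness such as JMH would be the more reliable tool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class StreamTimingSketch {
    // Accumulates results so the measured work cannot be discarded as unused.
    static long sink;

    // Hypothetical helper: runs the task untimed for warm-up,
    // then returns the best of several timed runs in nanoseconds.
    static long bestOfNanos(Supplier<Long> task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            sink += task.get();
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            sink += task.get();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    // Same six salary values as in the answer, repeated `repeats` times.
    static List<Integer> salaries(int repeats) {
        int[] values = {20000, 3000, 15002, 7856, 200, 50000};
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < repeats; i++) {
            for (int v : values) {
                list.add(v);
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> data = salaries(100);

        long sequential = bestOfNanos(
                () -> data.stream().filter(s -> s > 15000).count(), 1000, 20);
        long viaParallelStream = bestOfNanos(
                () -> data.parallelStream().filter(s -> s > 15000).count(), 1000, 20);
        long viaStreamParallel = bestOfNanos(
                () -> data.stream().parallel().filter(s -> s > 15000).count(), 1000, 20);

        System.out.println("stream()            best ns = " + sequential);
        System.out.println("parallelStream()    best ns = " + viaParallelStream);
        System.out.println("stream().parallel() best ns = " + viaStreamParallel);
    }
}
```

With only 600 elements, the cost of splitting and scheduling work can easily outweigh any gain from parallelism, so the numbers from either program should be read with caution.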