Why is a combiner needed for reduce method that converts type in Java 8
Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow.
Original question: http://stackoverflow.com/questions/24308146/
Asked by Louise Miller
I'm having trouble fully understanding the role that the combiner fulfils in the Streams reduce method.
For example, the following code doesn't compile:
int length = asList("str1", "str2").stream()
.reduce(0, (accumulatedInt, str) -> accumulatedInt + str.length());
The compile error says: (argument mismatch; int cannot be converted to java.lang.String)
But this code does compile:
int length = asList("str1", "str2").stream()
.reduce(0, (accumulatedInt, str ) -> accumulatedInt + str.length(),
(accumulatedInt, accumulatedInt2) -> accumulatedInt + accumulatedInt2);
I understand that the combiner method is used in parallel streams, so in my example it is adding together two intermediate accumulated ints.
But I don't understand why the first example doesn't compile without the combiner, or how the combiner solves the conversion of String to int, since it is just adding together two ints.
Can anyone shed light on this?
Accepted answer by Eran
The two and three argument versions of reduce which you tried to use don't accept the same type for the accumulator.
The two argument reduce is defined as:
T reduce(T identity,
BinaryOperator<T> accumulator)
In your case, T is String, so BinaryOperator<T> should accept two String arguments and return a String. But you pass to it an int and a String, which results in the compilation error you got: argument mismatch; int cannot be converted to java.lang.String. Actually, I think passing 0 as the identity value is also wrong here, since a String is expected (T).
Also note that this version of reduce processes a stream of Ts and returns a T, so you can't use it to reduce a stream of String to an int.
The three argument reduce is defined as:
<U> U reduce(U identity,
BiFunction<U,? super T,U> accumulator,
BinaryOperator<U> combiner)
In your case U is Integer and T is String, so this method will reduce a stream of String to an Integer.
For the BiFunction<U,? super T,U> accumulator you can pass parameters of two different types (U and ? super T), which in your case are Integer and String. In addition, the identity value U accepts an Integer in your case, so passing it 0 is fine.
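To make the type parameters concrete, here is a minimal sketch that spells out the functional interface types for both overloads. The class and method names (ReduceSignatures, concat, totalLength) are made up for illustration; only the reduce signatures themselves come from the Stream API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

// Hypothetical helper class, used only to illustrate the two overloads.
class ReduceSignatures {

    // Two-argument form: identity, both operands and the result are all T (String here).
    static String concat(List<String> words) {
        BinaryOperator<String> accumulator = (a, b) -> a + b;
        return words.stream().reduce("", accumulator);
    }

    // Three-argument form: U is Integer (the result type), T is String (the element type).
    static int totalLength(List<String> words) {
        BiFunction<Integer, String, Integer> accumulator = (sum, s) -> sum + s.length();
        BinaryOperator<Integer> combiner = Integer::sum;
        return words.stream().reduce(0, accumulator, combiner);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("str1", "str2");
        System.out.println(concat(words));      // str1str2
        System.out.println(totalLength(words)); // 8
    }
}
```

Writing the types out this way shows why the two-argument call in the question is rejected: its accumulator would have to be a BinaryOperator<String>.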
Another way to achieve what you want:
int length = asList("str1", "str2").stream().mapToInt (s -> s.length())
.reduce(0, (accumulatedInt, len) -> accumulatedInt + len);
Here the type of the stream matches the return type of reduce, so you can use the two parameter version of reduce.
Of course you don't have to use reduce at all:
int length = asList("str1", "str2").stream().mapToInt (s -> s.length())
.sum();
Answer by Stuart Marks
Eran's answer described the differences between the two-arg and three-arg versions of reduce in that the former reduces Stream<T> to T whereas the latter reduces Stream<T> to U. However, it didn't actually explain the need for the additional combiner function when reducing Stream<T> to U.
One of the design principles of the Streams API is that the API shouldn't differ between sequential and parallel streams, or put another way, a particular API shouldn't prevent a stream from running correctly either sequentially or in parallel. If your lambdas have the right properties (associative, non-interfering, etc.) a stream run sequentially or in parallel should give the same results.
Let's first consider the two-arg version of reduction:
T reduce(I, (T, T) -> T)
The sequential implementation is straightforward. The identity value I is "accumulated" with the zeroth stream element to give a result. This result is accumulated with the first stream element to give another result, which in turn is accumulated with the second stream element, and so forth. After the last element is accumulated, the final result is returned.
The parallel implementation starts off by splitting the stream into segments. Each segment is processed by its own thread in the sequential fashion I described above. Now, if we have N threads, we have N intermediate results. These need to be reduced down to one result. Since each intermediate result is of type T, and we have several, we can use the same accumulator function to reduce those N intermediate results down to a single result.
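As a rough illustration of that last point, the sketch below imitates a two-segment parallel run by hand: each half is reduced with the accumulator, and the same accumulator then merges the two partial results. This is only a model of the idea, not how the Streams library actually schedules work; the class and method names are hypothetical, and the two-way split is an arbitrary assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

// Hypothetical model of a same-type reduction split into two segments.
class SameTypeSegments {

    static int sumInTwoSegments(List<Integer> values) {
        BinaryOperator<Integer> accumulator = Integer::sum;
        int mid = values.size() / 2;

        // Each segment is reduced sequentially, starting from the identity value.
        int left = values.subList(0, mid).stream().reduce(0, accumulator);
        int right = values.subList(mid, values.size()).stream().reduce(0, accumulator);

        // The partial results are both Integers, so the accumulator can merge them too.
        return accumulator.apply(left, right);
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(sumInTwoSegments(values));                        // 15
        System.out.println(values.parallelStream().reduce(0, Integer::sum)); // 15
    }
}
```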
Now let's consider a hypothetical two-arg reduction operation that reduces Stream<T> to U. In other languages, this is called a "fold" or "fold-left" operation so that's what I'll call it here. Note this doesn't exist in Java.
U foldLeft(I, (U, T) -> U)
(Note that the identity value I is of type U.)
The sequential version of foldLeft is just like the sequential version of reduce except that the intermediate values are of type U instead of type T. But it's otherwise the same. (A hypothetical foldRight operation would be similar except that the operations would be performed right-to-left instead of left-to-right.)
Now consider the parallel version of foldLeft. Let's start off by splitting the stream into segments. We can then have each of the N threads reduce the T values in its segment, giving N intermediate values of type U. Now what? How do we get from N values of type U down to a single result of type U?
What's missing is another function that combines the multiple intermediate results of type U into a single result of type U. If we have a function that combines two U values into one, that's sufficient to reduce any number of values down to one, just like the original reduction above. Thus, the reduction operation that gives a result of a different type needs two functions:
U reduce(I, (U, T) -> U, (U, U) -> U)
Or, using Java syntax:
<U> U reduce(U identity, BiFunction<U,? super T,U> accumulator, BinaryOperator<U> combiner)
In summary, to do parallel reduction to a different result type, we need two functions: one that accumulates T elements to intermediate U values, and a second that combines the intermediate U values into a single U result. If we aren't switching types, it turns out that the accumulator function is the same as the combiner function. That's why reduction to the same type has only the accumulator function and reduction to a different type requires separate accumulator and combiner functions.
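The argument above can be sketched in code. The following is a hand-rolled model, not the library's implementation: foldSegment plays the role of the hypothetical sequential foldLeft, and foldInTwoSegments shows where the combiner becomes necessary once there is more than one partial result. All names here are invented for the example, and splitting into exactly two segments is an assumption made for brevity.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

// Hypothetical model of a type-changing reduction over two segments.
class ManualParallelFold {

    // Sequential fold-left over one segment: T elements are accumulated into a U.
    static <T, U> U foldSegment(List<T> segment, U identity,
                                BiFunction<U, ? super T, U> accumulator) {
        U result = identity;
        for (T element : segment) {
            result = accumulator.apply(result, element);
        }
        return result;
    }

    // Two segments yield two U values; the accumulator (U, T) -> U cannot merge
    // them, so a separate (U, U) -> U combiner is required.
    static <T, U> U foldInTwoSegments(List<T> items, U identity,
                                      BiFunction<U, ? super T, U> accumulator,
                                      BinaryOperator<U> combiner) {
        int mid = items.size() / 2;
        U left = foldSegment(items.subList(0, mid), identity, accumulator);
        U right = foldSegment(items.subList(mid, items.size()), identity, accumulator);
        return combiner.apply(left, right);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("one", "two", "three", "four");
        int total = foldInTwoSegments(words, 0, (sum, s) -> sum + s.length(), Integer::sum);
        System.out.println(total); // 15
    }
}
```

In this model the segments run one after the other; a real parallel stream would hand them to different threads, but the need for the combiner is the same.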
Finally, Java doesn't provide foldLeft and foldRight operations because they imply a particular ordering of operations that is inherently sequential. This clashes with the design principle stated above of providing APIs that support sequential and parallel operation equally.
Answer by quiz123
There is no reduce version that takes two different types without a combiner since it can't be executed in parallel (not sure why this is a requirement). The fact that accumulator must be associative makes this interface pretty much useless since:
list.stream().reduce(identity,
accumulator,
combiner);
Produces the same results as:
list.stream().map(i -> accumulator.apply(identity, i))
.reduce(identity,
combiner);
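A small sketch of that equivalence for the string-length example follows. The class and method names are hypothetical. It assumes the accumulator and combiner are compatible in the way the reduce documentation requires, i.e. combining u with accumulator(identity, t) gives the same result as accumulator(u, t).

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

// Hypothetical comparison of the two formulations.
class MapThenReduce {

    static final BiFunction<Integer, String, Integer> ACCUMULATOR = (sum, s) -> sum + s.length();
    static final BinaryOperator<Integer> COMBINER = Integer::sum;

    // Three-argument reduce, as in the question.
    static int viaThreeArgReduce(List<String> list) {
        return list.stream().reduce(0, ACCUMULATOR, COMBINER);
    }

    // Map each element to accumulator(identity, element), then reduce with the combiner.
    static int viaMapThenReduce(List<String> list) {
        return list.stream()
                   .map(s -> ACCUMULATOR.apply(0, s))
                   .reduce(0, COMBINER);
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("one", "two", "three", "four");
        System.out.println(viaThreeArgReduce(list)); // 15
        System.out.println(viaMapThenReduce(list));  // 15
    }
}
```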
Answer by Luigi Cortese
Since I like doodles and arrows to clarify concepts... let's start!
From String to String (sequential stream)
Suppose you have 4 strings: your goal is to concatenate them into one. You basically start with a type and finish with the same type.
You can achieve this with:
String res = Arrays.asList("one", "two","three","four")
.stream()
.reduce("",
(accumulatedStr, str) -> accumulatedStr + str); //accumulator
and this helps you to visualize what's happening:
The accumulator function converts, step by step, the elements in your (red) stream to the final reduced (green) value. The accumulator function simply transforms a String object into another String.
From String to int (parallel stream)
Suppose you have the same 4 strings: your new goal is to sum their lengths, and you want to parallelize your stream.
What you need is something like this:
int length = Arrays.asList("one", "two","three","four")
.parallelStream()
.reduce(0,
(accumulatedInt, str) -> accumulatedInt + str.length(), //accumulator
(accumulatedInt, accumulatedInt2) -> accumulatedInt + accumulatedInt2); //combiner
and this is a diagram of what's happening:
Here the accumulator function (a BiFunction) allows you to transform your String data into int data. Because the stream is parallel, it is split into two (red) parts, each of which is processed independently of the other, and together they produce just as many partial (orange) results. A combiner is needed to provide a rule for merging the partial int results into the final (green) int one.
From String to int (sequential stream)
What if you don't want to parallelize your stream? Well, a combiner needs to be provided anyway, but it will never be invoked, given that no partial results will be produced.
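One way to check this yourself is to count how often the combiner runs. The sketch below does that with an AtomicInteger; the class and method names are made up for the example. Treat the sequential count of zero as behaviour observed on OpenJDK rather than something the API documentation promises, and note that the parallel count depends on how the runtime happens to split the source.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Hypothetical probe that counts combiner invocations.
class CombinerCallCounter {

    static final List<String> WORDS = Arrays.asList("one", "two", "three", "four");

    // Sums the string lengths and records each combiner call in the given counter.
    static int sumLengths(boolean parallel, AtomicInteger combinerCalls) {
        Stream<String> stream = parallel ? WORDS.parallelStream() : WORDS.stream();
        return stream.reduce(0,
                (sum, s) -> sum + s.length(),       // accumulator
                (a, b) -> {                         // combiner, instrumented for the demo
                    combinerCalls.incrementAndGet();
                    return a + b;
                });
    }

    public static void main(String[] args) {
        AtomicInteger sequentialCalls = new AtomicInteger();
        AtomicInteger parallelCalls = new AtomicInteger();
        System.out.println("sequential result: " + sumLengths(false, sequentialCalls));
        System.out.println("sequential combiner calls: " + sequentialCalls.get());
        System.out.println("parallel result: " + sumLengths(true, parallelCalls));
        System.out.println("parallel combiner calls: " + parallelCalls.get());
    }
}
```

Both runs return the same sum; only the number of combiner calls differs.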