Conditionally add an operation to a Java 8 stream

Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/33746357/

Time: 2020-11-02 22:05:19  Source: igfitidea

Tags: java, java-8, limit, java-stream

Asked by Lani

I'm wondering if I can add an operation to a stream based on some condition set outside of the stream. For example, I want to add a limit operation to the stream if my limit variable is not equal to -1.

My code currently looks like this, but I have yet to see other examples of streams being used this way, where a Stream object is reassigned to the result of an intermediate operation applied on itself:

// Do some stream stuff
stream = stream.filter(e -> e.getTimestamp() < max);

// Limit the stream
if (limit != -1) {
   stream = stream.limit(limit);
}

// Collect stream to list
stream.collect(Collectors.toList());

As stated in this stackoverflow post, the filter isn't actually applied until a terminal operation is called. Since I'm reassigning the value of stream before a terminal operation is called, is the above code still a proper way to use Java 8 streams?

Answered by Holger

There is no semantic difference between a chained series of invocations and a series of invocations storing the intermediate return values. Thus, the following code fragments are equivalent:

a = object.foo();
b = a.bar();
c = b.baz();

and

c = object.foo().bar().baz();

In either case, each method is invoked on the result of the previous invocation. But in the latter case, the intermediate results are not stored; they are lost on the next invocation. In the case of the stream API, an intermediate result must not be used after you have called the next method on it. Chaining is therefore the natural way of using streams, as it intrinsically ensures that you don't invoke more than one method on a returned reference.

Still, it is not wrong to store the reference to a stream as long as you obey the contract of not using a returned reference more than once. By using it the way you do in your question, i.e. overwriting the variable with the result of the next invocation, you also ensure that you don't invoke more than one method on a returned reference, so it is a correct usage. Of course, this only works with intermediate results of the same type. When you use map or flatMap and get a stream of a different reference type, you can't overwrite the local variable. Then you have to be careful not to use the old local variable again, but, as said, as long as you are not using it after the next invocation, there is nothing wrong with the intermediate storage.
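To make the contract concrete, here is a minimal sketch. The class name ConditionalLimitDemo and the helpers select and reuseFails are made up for illustration, and plain integers stand in for the question's timestamped elements. The first helper reassigns the variable as in the question; the second shows what the JDK's stream implementation typically does when a reference is used twice (the documentation only says it may throw IllegalStateException when it detects reuse).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ConditionalLimitDemo {

    // Hypothetical helper: keeps values below max and applies a limit
    // only when limit is not -1, reassigning the same variable each time.
    static List<Integer> select(List<Integer> source, int max, long limit) {
        Stream<Integer> stream = source.stream();
        stream = stream.filter(e -> e < max);
        if (limit != -1) {
            stream = stream.limit(limit);
        }
        return stream.collect(Collectors.toList());
    }

    // Breaks the contract on purpose: a second operation is invoked
    // on a reference that already has a downstream operation attached.
    static boolean reuseFails() {
        Stream<Integer> original = Stream.of(1, 2, 3);
        Stream<Integer> filtered = original.filter(e -> e > 1);
        try {
            original.limit(1); // 'original' was already used by filter
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(5, 1, 9, 3, 7);
        System.out.println(select(data, 8, 2));  // [5, 1]
        System.out.println(select(data, 8, -1)); // [5, 1, 3, 7]
        System.out.println(reuseFails());        // true
    }
}
```

Because the variable is overwritten on every step, the old reference is simply no longer reachable, which is what keeps the reassignment style safe.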

Sometimes, you have to store it, e.g.

try(Stream<String> stream = Files.lines(Paths.get("myFile.txt"))) {
    stream.filter(s -> !s.isEmpty()).forEach(System.out::println);
}

Note that the code is equivalent to the following alternatives:

try(Stream<String> stream = Files.lines(Paths.get("myFile.txt")).filter(s->!s.isEmpty())) {
    stream.forEach(System.out::println);
}

and

try(Stream<String> srcStream = Files.lines(Paths.get("myFile.txt"))) {
    Stream<String> tmp = srcStream.filter(s -> !s.isEmpty());
    // must not use variable srcStream here:
    tmp.forEach(System.out::println);
}

They are equivalent because forEach is always invoked on the result of filter, which is always invoked on the result of Files.lines, and it doesn't matter on which result the final close() operation is invoked, as closing affects the entire stream pipeline.
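The claim about close() can be checked with a small sketch. It uses an in-memory stream with a close handler instead of a file so that it runs anywhere; the class name ClosePropagationDemo and the method name are made up for illustration. Only the derived stream is closed, yet the handler registered on the source runs.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

class ClosePropagationDemo {

    // Returns true if closing the filtered stream also ran the close
    // handler that was registered on the source stream.
    static boolean sourceClosedViaDerivedStream() {
        AtomicBoolean closed = new AtomicBoolean(false);
        Stream<String> source = Stream.of("a", "", "b")
                .onClose(() -> closed.set(true));
        // Only the result of filter is managed by try-with-resources.
        try (Stream<String> nonEmpty = source.filter(s -> !s.isEmpty())) {
            nonEmpty.forEach(System.out::println);
        }
        return closed.get();
    }

    public static void main(String[] args) {
        System.out.println(sourceClosedViaDerivedStream()); // prints a, b, then true
    }
}
```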



To put it in one sentence, the way you use it is correct.



I even prefer to do it that way, as not chaining a limit operation when you don't want to apply a limit is the cleanest way of expressing your intent. It's also worth noting that the suggested alternatives may work in a lot of cases, but they are not semantically equivalent:

.limit(condition? aLimit: Long.MAX_VALUE)

assumes that the maximum number of elements you can ever encounter is Long.MAX_VALUE, but streams can have more elements than that; they might even be infinite.

.limit(condition? aLimit: list.size())

when the stream source is list, breaks the lazy evaluation of a stream. In principle, a mutable stream source may legally be changed arbitrarily up to the point when the terminal operation commences. The result will reflect all modifications made up to that point. When you add an intermediate operation incorporating list.size(), i.e. the actual size of the list at that moment, subsequent modifications applied to the collection before the terminal operation may give this value a different meaning than the intended “actually no limit” semantic.
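A short sketch of that difference, assuming an ArrayList as the source (the class and method names are made up for illustration). The size is read when the pipeline is built, while the list contents are read only when collect runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LateBindingDemo {

    // limit(list.size()) freezes the size at pipeline construction time.
    static List<String> withSizeLimit() {
        List<String> list = new ArrayList<>(Arrays.asList("one", "two"));
        Stream<String> stream = list.stream().limit(list.size()); // limit is 2
        list.add("three"); // legal: the terminal operation has not started
        return stream.collect(Collectors.toList());
    }

    // Without a limit operation, the late modification is reflected.
    static List<String> withoutLimit() {
        List<String> list = new ArrayList<>(Arrays.asList("one", "two"));
        Stream<String> stream = list.stream();
        list.add("three");
        return stream.collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(withSizeLimit()); // [one, two]
        System.out.println(withoutLimit());  // [one, two, three]
    }
}
```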

Compare with the “Non-interference” section of the API documentation:

For well-behaved stream sources, the source can be modified before the terminal operation commences and those modifications will be reflected in the covered elements. For example, consider the following code:

List<String> l = new ArrayList(Arrays.asList("one", "two"));
Stream<String> sl = l.stream();
l.add("three");
String s = sl.collect(joining(" "));

First a list is created consisting of two strings: "one"; and "two". Then a stream is created from that list. Next the list is modified by adding a third string: "three". Finally the elements of the stream are collected and joined together. Since the list was modified before the terminal collect operation commenced the result will be a string of "one two three".

Of course, this is a rare corner case, as a programmer will normally formulate an entire stream pipeline without modifying the source collection in between. Still, the semantic difference remains, and it might turn into a very hard-to-find bug once you do run into such a corner case.

Further, since they are not equivalent, the stream API will never recognize these values as “actually no limit”. Even specifying Long.MAX_VALUE implies that the stream implementation has to track the number of processed elements to ensure that the limit has been obeyed. Thus, not adding a limit operation can have a significant performance advantage over adding a limit with a number that the programmer expects never to be exceeded.
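If the conditional limit is needed in several places, one option is to wrap it in a small helper that adds no operation at all in the unlimited case. This is only a sketch; the class OptionalLimit, the constant NO_LIMIT and the method limitIfRequested are hypothetical names, and -1 as the "no limit" marker follows the question.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class OptionalLimit {

    // Marker value taken from the question: -1 means "do not limit".
    static final long NO_LIMIT = -1;

    // Returns the stream itself when no limit is requested, so the
    // pipeline contains no limit stage and nothing has to be counted.
    static <T> Stream<T> limitIfRequested(Stream<T> stream, long limit) {
        return limit == NO_LIMIT ? stream : stream.limit(limit);
    }

    public static void main(String[] args) {
        // An infinite stream still terminates when a real limit is given.
        List<Integer> firstThree =
                limitIfRequested(Stream.iterate(1, i -> i + 1), 3)
                        .collect(Collectors.toList());
        System.out.println(firstThree); // [1, 2, 3]

        List<String> all =
                limitIfRequested(Stream.of("a", "b", "c"), NO_LIMIT)
                        .collect(Collectors.toList());
        System.out.println(all); // [a, b, c]
    }
}
```

The caller passes the stream in and uses only the returned reference afterwards, which keeps to the use-once contract described above.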

Answered by Peter Lawrey

There are two ways you can do this:

// Do some stream stuff
List<E> results = list.stream()
                  .filter(e -> e.getTimestamp() < max)
                  .limit(limit > 0 ? limit : list.size())
                  .collect(Collectors.toList());

OR

// Do some stream stuff
stream = stream.filter(e -> e.getTimestamp() < max);

// Limit the stream
if (limit != -1) {
   stream = stream.limit(limit);
}

// Collect stream to list
List<E> results = stream.collect(Collectors.toList());

As this is functional programming, you should always work on the result of each function. You should specifically avoid modifying anything in this style of programming and, if possible, treat everything as if it were immutable.

Since I'm reassigning the value of stream before a terminal operation is called, is the above code still a proper way to use Java 8 streams?

It should work; however, it reads as a mix of imperative and functional coding. I suggest writing it as a fixed stream, as in my first option.

Answered by WillShackleford

I think your first line needs to be:

stream = stream.filter(e -> e.getTimestamp() < max);

so that you're using the stream returned by filter in subsequent operations rather than the original stream.