Java: When should I use streams?

Notice: This page is a Chinese-English side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/42486428/

Date: 2020-08-12 00:11:04  Source: igfitidea

When should I use streams?

Tags: java, java-8, java-stream

Asked by mcuenez

I just came across a question when using a List and its stream() method. While I know how to use them, I'm not quite sure about when to use them.

For example, I have a list containing various paths to different locations. Now, I'd like to check whether a single, given path contains any of the paths specified in the list. I'd like to return a boolean based on whether or not the condition was met.

This, of course, is not a hard task per se. But I wonder whether I should use streams or a for(-each) loop.

The List

private static final List<String> EXCLUDE_PATHS = Arrays.asList(new String[]{
    "my/path/one",
    "my/path/two"
});

Example - Stream

private boolean isExcluded(String path){
    return EXCLUDE_PATHS.stream()
                        .map(String::toLowerCase)
                        .filter(path::contains)
                        .collect(Collectors.toList())
                        .size() > 0;
}

Example - For-Each Loop

private boolean isExcluded(String path){
    for (String excludePath : EXCLUDE_PATHS) {
        if(path.contains(excludePath.toLowerCase())){
            return true;
        }
    }
    return false;
}

Note that the path parameter is always lowercase.

My first guess is that the for-each approach is faster, because the loop returns immediately once the condition is met, whereas the stream would still loop over all list entries in order to complete the filtering.

Is my assumption correct? If so, why (or rather when) would I use stream() then?

Accepted answer by Stefan Pries

Your assumption is correct. Your stream implementation is slower than the for-loop.

This stream usage should be as fast as the for-loop though:

EXCLUDE_PATHS.stream()
             .map(String::toLowerCase)
             .anyMatch(path::contains);

This iterates through the items, applying String::toLowerCase and the filter to the items one by one and terminating at the first item that matches.

Both collect() and anyMatch() are terminal operations. However, anyMatch() exits at the first matching item, while collect() requires all items to be processed.
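The difference between the two terminal operations can be made visible by counting how many list entries each pipeline touches. The following is only a sketch: the class name ShortCircuitDemo, the counting helpers and the three-entry list are assumptions made for illustration, not part of the original question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

class ShortCircuitDemo {
    // Hypothetical three-entry list, so the difference is visible.
    static final List<String> EXCLUDE_PATHS =
        Arrays.asList("my/path/one", "my/path/two", "my/path/three");

    // Returns how many list entries the collect()-based pipeline visits.
    static int visitedByCollect(String path) {
        AtomicInteger visited = new AtomicInteger();
        EXCLUDE_PATHS.stream()
                     .peek(p -> visited.incrementAndGet())
                     .map(String::toLowerCase)
                     .filter(path::contains)
                     .collect(Collectors.toList());
        return visited.get();
    }

    // Returns how many list entries the anyMatch()-based pipeline visits.
    static int visitedByAnyMatch(String path) {
        AtomicInteger visited = new AtomicInteger();
        EXCLUDE_PATHS.stream()
                     .peek(p -> visited.incrementAndGet())
                     .map(String::toLowerCase)
                     .anyMatch(path::contains);
        return visited.get();
    }

    public static void main(String[] args) {
        String path = "my/path/one/file.txt"; // matches the first entry
        System.out.println("collect visited:  " + visitedByCollect(path));
        System.out.println("anyMatch visited: " + visitedByAnyMatch(path));
    }
}
```

With a sequential stream and a path that matches the first entry, the collect() variant visits all three entries, while the anyMatch() variant stops after the first one.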

Answer by rvit34

Yes, you are right. Your stream approach will have some overhead. But you can use a construction like this:

private boolean isExcluded(String path) {
    return EXCLUDE_PATHS.stream().map(String::toLowerCase).anyMatch(path::contains);
}

The main reason to use streams is that they make your code simpler and easier to read.

Answer by Paulo Ricardo Almeida

The goal of streams in Java is to simplify the complexity of writing parallel code. It's inspired by functional programming. The serial stream is just to make the code cleaner.

If we want performance, we should use parallelStream, which was designed for it. The serial one is, in general, slower.

There is a good article to read about ForLoop, Stream and ParallelStream performance.

In your code, we can use short-circuiting terminal operations to stop the search at the first match (anyMatch...).
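Besides anyMatch, the Stream API offers other short-circuiting terminal operations, such as noneMatch and findFirst. The sketch below is illustrative only: the class FirstMatchDemo and the methods firstExcludedMatch and isAllowed are hypothetical names that do not appear in the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class FirstMatchDemo {
    static final List<String> EXCLUDE_PATHS =
        Arrays.asList("my/path/one", "my/path/two");

    // Returns the first excluded entry contained in the given path, if any.
    // findFirst() stops pulling elements once a match has been found.
    static Optional<String> firstExcludedMatch(String path) {
        return EXCLUDE_PATHS.stream()
                            .map(String::toLowerCase)
                            .filter(path::contains)
                            .findFirst();
    }

    // True when no excluded entry is contained in the path.
    // noneMatch() stops at the first entry that does match.
    static boolean isAllowed(String path) {
        return EXCLUDE_PATHS.stream()
                            .map(String::toLowerCase)
                            .noneMatch(path::contains);
    }
}
```

findFirst is useful when the caller needs to know which entry matched, not just whether one did.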

Answer by Holger

The decision whether to use Streams or not should not be driven by performance consideration, but rather by readability. When it really comes to performance, there are other considerations.

With your .filter(path::contains).collect(Collectors.toList()).size() > 0 approach, you are processing all elements and collecting them into a temporary List before comparing the size. Still, this hardly ever matters for a Stream consisting of two elements.

Using .map(String::toLowerCase).anyMatch(path::contains) can save CPU cycles and memory if you have a substantially larger number of elements. Still, this converts each String to its lowercase representation until a match is found. Obviously, there is a point in using

private static final List<String> EXCLUDE_PATHS =
    Stream.of("my/path/one", "my/path/two").map(String::toLowerCase)
          .collect(Collectors.toList());

private boolean isExcluded(String path) {
    return EXCLUDE_PATHS.stream().anyMatch(path::contains);
}

instead, so you don't have to repeat the conversion to lowercase in every invocation of isExcluded. If the number of elements in EXCLUDE_PATHS or the lengths of the strings become really large, you may consider using

private static final List<Predicate<String>> EXCLUDE_PATHS =
    Stream.of("my/path/one", "my/path/two").map(String::toLowerCase)
          .map(s -> Pattern.compile(s, Pattern.LITERAL).asPredicate())
          .collect(Collectors.toList());

private boolean isExcluded(String path){
    return EXCLUDE_PATHS.stream().anyMatch(p -> p.test(path));
}

Compiling a string as a regex pattern with the LITERAL flag makes it behave just like ordinary string operations, but allows the engine to spend some time in preparation, e.g. using the Boyer-Moore algorithm, to be more efficient when it comes to the actual comparison.
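To illustrate what the LITERAL flag changes, here is a small sketch. The class LiteralPatternDemo and the helper containsLiteral are hypothetical names, and the sample strings are made up. It only shows the matching semantics and makes no claim about speed.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

class LiteralPatternDemo {
    // Builds a substring test. With LITERAL, regex metacharacters such as
    // '.' are matched as plain characters, like String.contains does.
    static Predicate<String> containsLiteral(String needle) {
        return Pattern.compile(needle, Pattern.LITERAL).asPredicate();
    }

    public static void main(String[] args) {
        Predicate<String> p = containsLiteral("a.b");
        System.out.println(p.test("dir/a.b/c")); // the dot is present literally
        System.out.println(p.test("dir/axb/c")); // '.' is not a wildcard here
    }
}
```

Because asPredicate() tests whether the pattern is found anywhere in the input, the predicate gives the same answer as String.contains for the same needle.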

Of course, this only pays off if there are enough subsequent tests to compensate for the time spent in preparation. Determining whether this will be the case is one of the actual performance considerations, besides the first question of whether this operation will ever be performance critical at all. The question of whether to use Streams or for loops is not.

By the way, the code examples above keep the logic of your original code, which looks questionable to me. Your isExcluded method returns true if the specified path contains any of the elements in the list, so it returns true for /some/prefix/to/my/path/one, as well as my/path/one/and/some/suffix or even /some/prefix/to/my/path/one/and/some/suffix.

Even dummy/path/onerous is considered as fulfilling the criteria, as it contains the string my/path/one...
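If the intent is "the path lies inside one of the listed directories", one possible alternative is to compare path elements instead of raw substrings. This is only a sketch under that assumption; the question does not state the intended semantics, and the class name PathPrefixDemo is hypothetical. It relies on java.nio.file.Path.startsWith, which compares whole name elements.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PathPrefixDemo {
    // Excluded directories, parsed once into Path objects.
    static final List<Path> EXCLUDE_PATHS =
        Stream.of("my/path/one", "my/path/two")
              .map(Paths::get)
              .collect(Collectors.toList());

    // True only if the given path starts with an excluded directory,
    // compared element by element rather than character by character.
    static boolean isExcluded(String path) {
        Path candidate = Paths.get(path);
        return EXCLUDE_PATHS.stream().anyMatch(candidate::startsWith);
    }
}
```

With this variant, my/path/one/and/some/suffix is still excluded, while dummy/path/onerous and my/path/onerous are not, because "onerous" is a different path element than "one".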

Answer by Kaicheng Hu

Others have already mentioned many good points, but I just want to mention lazy evaluation in stream evaluation. When we call map() to create a stream of lowercase paths, we are not creating the whole stream immediately; instead, the stream is constructed lazily, which is why the performance should be equivalent to the traditional for loop. It does not do a full scan: map() and anyMatch() are executed at the same time, element by element. As soon as an element matches, anyMatch() short-circuits and returns true.
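The interleaving of map() and anyMatch() can be observed by logging each step. The sketch below is illustrative: the class LazyEvaluationDemo, the trace method and the upper-case sample entries are assumptions, chosen so that the effect of map() shows up in the log.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LazyEvaluationDemo {
    // Hypothetical upper-case entries, so the lower-casing step is visible.
    static final List<String> EXCLUDE_PATHS =
        Arrays.asList("MY/PATH/ONE", "MY/PATH/TWO", "MY/PATH/THREE");

    // Records each map and test step in the order the pipeline runs them.
    static List<String> trace(String path) {
        List<String> log = new ArrayList<>();
        EXCLUDE_PATHS.stream()
                     .map(s -> { log.add("map " + s); return s.toLowerCase(); })
                     .anyMatch(s -> { log.add("test " + s); return path.contains(s); });
        return log;
    }

    public static void main(String[] args) {
        trace("my/path/two/file.txt").forEach(System.out::println);
    }
}
```

For a path that matches the second entry, the log alternates between "map" and "test" for the first two entries, and the third entry is never touched.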