Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/24474838/

Time: 2020-08-14 12:24:54  Source: igfitidea

Can I duplicate a Stream in Java 8?

Tags: java, java-8, java-stream

Asked by necromancer

Sometimes I want to perform a set of operations on a stream, and then process the resulting stream two different ways with other operations.

Can I do this without having to specify the common initial operations twice?

For example, I am hoping a dup() method such as the following exists:

// Hypothetical: dup() is the wished-for method, it does not exist on IntStream.
IntStream[] desired_streams = IntStream.range(1, 100).filter(n -> n % 2 == 0).dup();
IntStream stream14 = desired_streams[0].filter(n -> n % 7 == 0); // multiples of 14
IntStream stream10 = desired_streams[1].filter(n -> n % 5 == 0); // multiples of 10

Accepted answer by Elazar

It is not possible in general.

If you want to duplicate an input stream, or input iterator, you have two options:

A. Keep everything in a collection, say a List<>

Suppose you duplicate a stream into two streams, s1 and s2. If you have advanced n1 elements in s1 and n2 elements in s2, you must keep |n2 - n1| elements in memory just to keep pace. If your stream is infinite, there may be no upper bound on the storage required.

Take a look at Python's tee() to see what it takes:

This itertool may require significant auxiliary storage (depending on how much temporary data needs to be stored). In general, if one iterator uses most or all of the data before another iterator starts, it is faster to use list() instead of tee().

B. When possible: Copy the state of the generator that creates the elements

For this option to work, you'll probably need access to the inner workings of the stream. In other words, the generator - the part that creates the elements - should support copying in the first place. [OP: See this great answer, as an example of how this can be done for the example in the question]

It will not work on input from the user, since you would have to copy the state of the whole "outside world". Java's Stream does not support copying, since it is designed to be as general as possible, specifically to work with files, network, keyboard, sensors, randomness, etc. [OP: Another example is a stream that reads a temperature sensor on demand. It cannot be duplicated without storing a copy of the readings]
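
When the generator's state is plain data, option B can be sketched as below. This is only an illustration under stated assumptions: the Counter class, its copy() method, and the CopyableGeneratorDemo wrapper are hypothetical names invented here, not part of the JDK or of any answer in this thread.

```java
import java.util.Arrays;
import java.util.function.IntSupplier;
import java.util.stream.IntStream;

public class CopyableGeneratorDemo {

    /** Hypothetical generator whose whole state is one int, so copying it is trivial. */
    static final class Counter implements IntSupplier {
        private int next;

        Counter(int start) {
            this.next = start;
        }

        /** Returns an independent generator that continues from the same position. */
        Counter copy() {
            return new Counter(next);
        }

        @Override
        public int getAsInt() {
            return next++;
        }
    }

    /** Builds an infinite stream on top of the generator and takes the first matches. */
    static int[] firstMultiples(Counter generator, int divisor, int count) {
        return IntStream.generate(generator)
                .filter(n -> n % divisor == 0)
                .limit(count)
                .toArray();
    }

    static int[][] demo() {
        Counter original = new Counter(1);
        Counter duplicate = original.copy(); // copy the state before either one is consumed
        int[] by7 = firstMultiples(original, 7, 3);
        int[] by5 = firstMultiples(duplicate, 5, 3);
        return new int[][] { by7, by5 };
    }

    public static void main(String[] args) {
        int[][] result = demo();
        System.out.println(Arrays.toString(result[0])); // [7, 14, 21]
        System.out.println(Arrays.toString(result[1])); // [5, 10, 15]
    }
}
```

Nothing is buffered here because each copy regenerates its own elements. That only works because the generator is deterministic and self-contained, which is exactly what a sensor or keyboard source is not.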

This is not only the case in Java; this is a general rule. You can see that std::istream in C++ only supports move semantics, not copy semantics ("copy constructor (deleted)"), for this reason (and others).

Answer by nosid

It is not possible to duplicate a stream in this way. However, you can avoid the code duplication by moving the common part into a method or lambda expression.

Supplier<IntStream> supplier = () ->
    IntStream.range(1, 100).filter(n -> n % 2 == 0);
supplier.get().filter(...);
supplier.get().filter(...);
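
A self-contained sketch of the same idea follows. It assumes the two follow-up filters from the question (multiples of 7 and of 5) in place of the elided filter arguments; the class name SupplierDemo and its method names are illustrative only.

```java
import java.util.Arrays;
import java.util.function.Supplier;
import java.util.stream.IntStream;

public class SupplierDemo {

    // The common pipeline is written once; every get() builds a fresh, unconsumed stream.
    static final Supplier<IntStream> EVENS =
            () -> IntStream.range(1, 100).filter(n -> n % 2 == 0);

    static int[] multiplesOf14() {
        return EVENS.get().filter(n -> n % 7 == 0).toArray();
    }

    static int[] multiplesOf10() {
        return EVENS.get().filter(n -> n % 5 == 0).toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(multiplesOf14())); // [14, 28, 42, 56, 70, 84, 98]
        System.out.println(Arrays.toString(multiplesOf10())); // [10, 20, 30, 40, 50, 60, 70, 80, 90]
    }
}
```

The source is traversed once per get() call, which is fine for a cheap, repeatable source such as a range but not for one-shot input.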

Answer by necromancer

Update: This doesn't work. See the explanation below, after the text of the original answer.

How silly of me. All that I need to do is:

IntStream desired_stream = IntStream.range(1, 100).filter(n -> n % 2 == 0);
IntStream stream14 = desired_stream.filter(n -> n % 7 == 0); // multiples of 14
IntStream stream10 = desired_stream.filter(n -> n % 5 == 0); // multiples of 10

Explanation why this does not work:

If you code it up and try to collect both streams, the first one will collect fine, but trying to stream the second one will throw the exception: java.lang.IllegalStateException: stream has already been operated upon or closed.

To elaborate, streams are stateful objects (which, by the way, cannot be reset or rewound). You can think of them as iterators, which in turn are like pointers. So stream14 and stream10 can be thought of as references to the same pointer. Consuming the first stream all the way will cause the pointer to go "past the end." Trying to consume the second stream is like trying to access a pointer that is already "past the end," which naturally is an illegal operation.

As the accepted answer shows, the code to create the stream must be executed twice, but it can be compartmentalized into a Supplier lambda or a similar construct.

Full test code: save into Foo.java, then javac Foo.java, then java Foo

import java.util.stream.IntStream;

public class Foo {
  public static void main (String [] args) {
    IntStream s = IntStream.range(0, 100).filter(n -> n % 2 == 0);
    IntStream s1 = s.filter(n -> n % 5 == 0);
    s1.forEach(n -> System.out.println(n));
    IntStream s2 = s.filter(n -> n % 7 == 0);
    s2.forEach(n -> System.out.println(n));
  }
}

Output:

$ javac Foo.java
$ java Foo
0
10
20
30
40
50
60
70
80
90
Exception in thread "main" java.lang.IllegalStateException: stream has already been operated upon or closed
    at java.util.stream.AbstractPipeline.<init>(AbstractPipeline.java:203)
    at java.util.stream.IntPipeline.<init>(IntPipeline.java:91)
    at java.util.stream.IntPipeline$StatelessOp.<init>(IntPipeline.java:592)
    at java.util.stream.IntPipeline.<init>(IntPipeline.java:332)
    at java.util.stream.IntPipeline.filter(IntPipeline.java:331)
    at Foo.main(Foo.java:8)

Answer by Boris the Spider

Either,

  • Move the initialisation into a method, and simply call the method again

This has the advantage of being explicit about what you are doing, and also works for infinite streams.

  • Collect the stream and then re-stream it

In your example:

final int[] arr = IntStream.range(1, 100).filter(n -> n % 2 == 0).toArray();

Then

final IntStream s = IntStream.of(arr);
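
Put together, the collect-and-re-stream option might look like the sketch below. The class name RestreamDemo and its method names are made up for illustration; the filters are the ones from the question.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class RestreamDemo {

    // The common pipeline runs exactly once; its results are stored in an array.
    static final int[] EVENS =
            IntStream.range(1, 100).filter(n -> n % 2 == 0).toArray();

    static int[] multiplesOf14() {
        // IntStream.of(EVENS) creates a new stream over the stored elements each time.
        return IntStream.of(EVENS).filter(n -> n % 7 == 0).toArray();
    }

    static int[] multiplesOf10() {
        return IntStream.of(EVENS).filter(n -> n % 5 == 0).toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(multiplesOf14())); // [14, 28, 42, 56, 70, 84, 98]
        System.out.println(Arrays.toString(multiplesOf10())); // [10, 20, 30, 40, 50, 60, 70, 80, 90]
    }
}
```

Unlike calling a method again, this evaluates the shared filter only once, at the cost of holding every element in memory, so it cannot be used with an infinite stream.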

Answer by Lukas Eder

It's possible if you're buffering elements that you've consumed in one duplicate, but not in the other yet.

We've implemented a duplicate() method for streams in jOOλ, an open-source library that we created to improve integration testing for jOOQ. Essentially, you can just write:

Tuple2<Seq<Integer>, Seq<Integer>> desired_streams = Seq.seq(
    IntStream.range(1, 100).filter(n -> n % 2 == 0).boxed()
).duplicate();

(Note: we currently need to box the stream, as we haven't implemented an IntSeq yet.)

Internally, there is a LinkedList buffer storing all values that have been consumed from one stream but not from the other. That's probably as efficient as it gets if your two streams are consumed at about the same rate.

Here's how the algorithm works:

static <T> Tuple2<Seq<T>, Seq<T>> duplicate(Stream<T> stream) {
    final LinkedList<T> gap = new LinkedList<>();
    final Iterator<T> it = stream.iterator();

    @SuppressWarnings("unchecked")
    final Iterator<T>[] ahead = new Iterator[] { null };

    class Duplicate implements Iterator<T> {
        @Override
        public boolean hasNext() {
            if (ahead[0] == null || ahead[0] == this)
                return it.hasNext();

            return !gap.isEmpty();
        }

        @Override
        public T next() {
            if (ahead[0] == null)
                ahead[0] = this;

            if (ahead[0] == this) {
                T value = it.next();
                gap.offer(value);
                return value;
            }

            return gap.poll();
        }
    }

    return tuple(seq(new Duplicate()), seq(new Duplicate()));
}
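
For readers without jOOλ on the classpath, the buffering idea can also be sketched with the JDK alone. This is not jOOλ's implementation: the TeeSketch class, the tee() and branch() helpers, and the choice of one queue per branch are assumptions made for this illustration. It is not thread-safe, and because it uses ArrayDeque it does not accept null elements.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class TeeSketch {

    /** Splits one stream into two branches that share a single pass over the source. */
    static <T> List<Stream<T>> tee(Stream<T> source) {
        Iterator<T> it = source.iterator();
        Deque<T> pendingForFirst = new ArrayDeque<>();
        Deque<T> pendingForSecond = new ArrayDeque<>();
        List<Stream<T>> branches = new ArrayList<>();
        branches.add(branch(it, pendingForFirst, pendingForSecond));
        branches.add(branch(it, pendingForSecond, pendingForFirst));
        return branches;
    }

    private static <T> Stream<T> branch(Iterator<T> it, Deque<T> mine, Deque<T> other) {
        Iterator<T> branchIterator = new Iterator<T>() {
            @Override
            public boolean hasNext() {
                // Either the other branch left something for us, or the source has more.
                return !mine.isEmpty() || it.hasNext();
            }

            @Override
            public T next() {
                if (!mine.isEmpty()) {
                    return mine.poll(); // catch up on elements the other branch already pulled
                }
                T value = it.next();
                other.offer(value); // remember it for the branch that is behind
                return value;
            }
        };
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(branchIterator, Spliterator.ORDERED), false);
    }

    static List<List<Integer>> demo() {
        List<Stream<Integer>> copies =
                tee(IntStream.range(1, 100).filter(n -> n % 2 == 0).boxed());
        List<Integer> by14 = copies.get(0).filter(n -> n % 7 == 0).collect(Collectors.toList());
        List<Integer> by10 = copies.get(1).filter(n -> n % 5 == 0).collect(Collectors.toList());
        List<List<Integer>> result = new ArrayList<>();
        result.add(by14);
        result.add(by10);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // two lists: multiples of 14, then multiples of 10
    }
}
```

In demo() the first branch is drained completely before the second starts, so all 49 even numbers sit in the second branch's queue at the peak. That is the worst case for memory, and the same trade-off that the Python tee() documentation warns about.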

More source code here

In fact, using jOOλ, you'll be able to write a complete one-liner like so:

Tuple2<Seq<Integer>, Seq<Integer>> desired_streams = Seq.seq(
    IntStream.range(1, 100).filter(n -> n % 2 == 0).boxed()
).duplicate()
 .map1(s -> s.filter(n -> n % 7 == 0))
 .map2(s -> s.filter(n -> n % 5 == 0));

// This will yield 14, 28, 42, 56...
desired_streams.v1.forEach(System.out::println);

// This will yield 10, 20, 30, 40...
desired_streams.v2.forEach(System.out::println);

Answer by Tomasz Górka

You can also move the stream generation into a separate method or function that returns the stream, and call it twice.
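
A minimal sketch of that approach follows. The MethodDemo class and the evens() method are hypothetical names, and an infinite source is used here on purpose (an assumption that differs from the bounded range in the question) to show that calling a method twice needs no buffering.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class MethodDemo {

    /** Returns a brand-new infinite stream of even numbers on every call. */
    static IntStream evens() {
        return IntStream.iterate(2, n -> n + 2);
    }

    static int[] firstMultiplesOf14(int count) {
        return evens().filter(n -> n % 7 == 0).limit(count).toArray();
    }

    static int[] firstMultiplesOf10(int count) {
        return evens().filter(n -> n % 5 == 0).limit(count).toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(firstMultiplesOf14(3))); // [14, 28, 42]
        System.out.println(Arrays.toString(firstMultiplesOf10(3))); // [10, 20, 30]
    }
}
```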

Answer by Blundell

For non-infinite streams, if you have access to the source, it's straightforward:

@Test
public void testName() throws Exception {
    List<Integer> integers = Arrays.asList(1, 2, 4, 5, 6, 7, 8, 9, 10);
    Stream<Integer> stream1 = integers.stream();
    Stream<Integer> stream2 = integers.stream();

    stream1.forEach(System.out::println);
    stream2.forEach(System.out::println);
}

prints

1 2 4 5 6 7 8 9 10

1 2 4 5 6 7 8 9 10

For your case:

// boxed() is needed because IntStream has no collect(Collector) overload.
Stream<Integer> originalStream = IntStream.range(1, 100).filter(n -> n % 2 == 0).boxed();

List<Integer> listOf = originalStream.collect(Collectors.toList());

Stream<Integer> stream14 = listOf.stream().filter(n -> n % 7 == 0);
Stream<Integer> stream10 = listOf.stream().filter(n -> n % 5 == 0);

For performance etc, read someone else's answer ;)
