zipWith (mapping over multiple Seq) in Scala

Note: this page is a translation of a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. If you reuse or share it, you must attribute the original authors. Original question: http://stackoverflow.com/questions/1157564/

Tags: scala, functional-programming, higher-order-functions

Asked by bsdfish

Suppose I have

val foo : Seq[Double] = ...
val bar : Seq[Double] = ...

and I wish to produce a seq where baz(i) = foo(i) + bar(i). One way I can think of to do this is

val baz : Seq[Double] = (foo.toList zip bar.toList) map { case (f, b) => f + b }

However, this feels both ugly and inefficient -- I have to convert both seqs to lists (which explodes with lazy lists) and create a temporary list of tuples, only to map over it and let it be GCed. Maybe streams solve the lazy problem, but in any case this feels unnecessarily ugly. In Lisp, the map function can map over multiple sequences. I would write

(mapcar (lambda (f b) (+ f b)) foo bar)

And no temporary lists would get created anywhere. Is there a map-over-multiple-lists function in Scala, or is zip combined with destructuring really the 'right' way to do this?

Accepted answer by Alexey Romanov

The function you want is called zipWith, but it isn't part of the standard library. It was supposed to be in 2.8 (UPDATE: apparently not, see the comments).

foo zipWith((f: Double, b : Double) => f+b) bar

See this Trac ticket.

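For illustration only (this is not part of the answer, and zipWith is not in the standard library), a zipWith of that shape can be hand-rolled as an extension method; the object and method names below are hypothetical:

object ZipWithSyntax {
  // Minimal hand-rolled zipWith: pairs elements up to the shorter Seq,
  // then combines each pair with f.
  implicit class ZipWithOps[A](as: Seq[A]) {
    def zipWith[B, C](bs: Seq[B])(f: (A, B) => C): Seq[C] =
      as.zip(bs).map { case (a, b) => f(a, b) }
  }
}

// Usage, assuming foo and bar from the question:
//   import ZipWithSyntax._
//   val baz: Seq[Double] = foo.zipWith(bar)(_ + _)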

Answer by Martin Odersky

In Scala 2.8:

val baz = (foo, bar).zipped map (_ + _)

And it works for more than two operands in the same way. I.e. you could then follow this up with:

(foo, bar, baz).zipped map (_ * _ * _)

Answer by Daniel C. Sobral

Well, that (the lack of zip) is a deficiency of Seq in Scala 2.7. Scala 2.8 has a well-thought-out collection design that replaces the ad hoc way the collections present in 2.7 came to be (note that they weren't all created at once, with a unified design).

Now, when you want to avoid creating a temporary collection, you should use a "projection" on Scala 2.7, or a "view" on Scala 2.8. This gives you a collection type for which certain operations, particularly map, flatMap and filter, are non-strict. On Scala 2.7, the projection of a List is a Stream. On Scala 2.8, there is a SequenceView of a Sequence, but since there is a zipWith right there on Sequence, you wouldn't even need it.

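As a minimal sketch of the view-based approach (my own, written against a modern Scala rather than 2.7/2.8, with made-up sample values), the view keeps zip and map non-strict, so no intermediate collection of tuples is materialised before the final toSeq:

// Sample inputs; in the question foo and bar are arbitrary Seq[Double].
val foo: Seq[Double] = Seq(1.0, 2.0, 3.0)
val bar: Seq[Double] = Seq(10.0, 20.0, 30.0)

// view makes zip/map lazy; toSeq materialises the result once at the end.
val baz: Seq[Double] = foo.view.zip(bar.view).map { case (f, b) => f + b }.toSeq
// baz == Seq(11.0, 22.0, 33.0)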

Having said that, as mentioned, the JVM is optimized to handle temporary object allocations, and, when running in server mode, the run-time optimization can do wonders. So, do not optimize prematurely. Test the code under the conditions in which it will be run -- and if you haven't planned to run it in server mode, rethink that if the code is expected to be long-running, and optimize when/where/if necessary.

EDIT

What is actually going to be available on Scala 2.8 is this:

(foo,bar).zipped.map(_+_)

Answer by Daniel Earwicker

A lazy list isn't a copy of a list - it's more like a single object. In the case of a lazy zip implementation, each time it is asked for the next item, it grabs an item from each of the two input lists and creates a tuple from them, and you then break the tuple apart with the pattern-matching in your lambda.

So there's never a need to create a complete copy of the whole input list(s) before starting to operate on them. It boils down to a very similar allocation pattern to any application running on the JVM - lots of very short-lived but small allocations, which the JVM is optimised to deal with.

Update: to be clear, you need to be using Streams (lazy lists), not Lists. Scala's streams have a zip that works the lazy way, so you shouldn't be converting things into lists.

Ideally your algorithm should be capable of working on two infinite streams without blowing up (assuming it doesn't do any folding, of course, but just reads and generates streams).

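As a concrete sketch of that point (my own example, using the LazyList of Scala 2.13; on older versions Stream plays the same role), zip and map stay lazy, so two infinite inputs are fine as long as only finitely many results are demanded:

// Two infinite lazy sequences; nothing is computed until elements are demanded.
val nats: LazyList[Double] = LazyList.from(0).map(_.toDouble)
val doubled: LazyList[Double] = nats.map(_ * 2)

// zip + map remain lazy, so this terminates even though both inputs are infinite.
val sums: LazyList[Double] = nats.zip(doubled).map { case (a, b) => a + b }
println(sums.take(5).toList) // List(0.0, 3.0, 6.0, 9.0, 12.0)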

Answer by Daniel Spiewak

UPDATE: It has been pointed out (in the comments) that this "answer" doesn't actually address the question being asked. This answer will map over every combination of foo and bar, producing N x M elements, instead of the min(M, N) elements requested. So, this is wrong, but it is left for posterity since it contains good information.

The best way to do this is with flatMap combined with map. Code speaks louder than words:

foo flatMap { f => bar map { b => f + b } }

This will produce a single Seq[Double], exactly as you would expect. This pattern is so common that Scala actually includes some syntactic magic which implements it:

for {
  f <- foo
  b <- bar
} yield f + b

Or, alternatively:

for (f <- foo; b <- bar) yield f + b

The for { ... } syntax is really the most idiomatic way to do this. You can continue to add generator clauses (e.g. b <- bar) as necessary. Thus, if it suddenly becomes three Seqs that you must map over, you can easily scale your syntax along with your requirements (to coin a phrase).

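For instance (my own illustration, with a hypothetical third sequence qux and made-up values), a third Seq is just one more generator clause; note that, per the update above, this is still a cross product of N x M x K elements rather than an element-wise zip:

val foo: Seq[Double] = Seq(1.0, 2.0)
val bar: Seq[Double] = Seq(10.0, 20.0)
val qux: Seq[Double] = Seq(100.0, 200.0) // hypothetical third sequence

// Produces all 2 * 2 * 2 = 8 combinations, not a 2-element zip.
val combos: Seq[Double] = for {
  f <- foo
  b <- bar
  q <- qux
} yield f + b + q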

Answer by Tvaroh

When faced with a similar task, I added the following pimp to Iterable:

implicit class IterableOfIterablePimps[T](collOfColls: Iterable[Iterable[T]]) {
  def mapZipped[V](f: Iterable[T] => V): Iterable[V] = new Iterable[V] {
    override def iterator: Iterator[V] = new Iterator[V] {
      override def next(): V = {
        val v = f(itemsLeft.map(_.head))
        itemsLeft = itemsLeft.map(_.tail)
        v
      }

      override def hasNext: Boolean = itemsLeft.exists(_.nonEmpty)

      private var itemsLeft = collOfColls
    }
  }
}

Having this, one can do something like:

val collOfColls = List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 9))
collOfColls.mapZipped { group =>
  group // List(1, 4, 7), then List(2, 5, 8), then List(3, 6, 9)
}

Notice that you should carefully consider the collection type passed as the nested Iterable, since tail and head will be called on it repeatedly. So, ideally you should pass Iterable[List] or another collection with fast tail and head.

Also, this code expects nested collections of the same size. That was my use case, but I suspect this can be improved, if needed.

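One possible improvement along those lines (my own sketch, not from the answer) is to stop at the shortest nested collection by making hasNext require that every collection still has elements:

// Hypothetical variant that zips only as far as the shortest nested collection.
implicit class IterableOfIterableZip[T](collOfColls: Iterable[Iterable[T]]) {
  def mapZippedShortest[V](f: Iterable[T] => V): Iterable[V] = new Iterable[V] {
    override def iterator: Iterator[V] = new Iterator[V] {
      private var itemsLeft = collOfColls
      // Continue only while every nested collection still has an element.
      override def hasNext: Boolean =
        itemsLeft.nonEmpty && itemsLeft.forall(_.nonEmpty)
      override def next(): V = {
        val v = f(itemsLeft.map(_.head))
        itemsLeft = itemsLeft.map(_.tail)
        v
      }
    }
  }
}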