Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/4784051/

Time: 2020-09-11 01:35:26  Source: igfitidea

Scala Partition/Collect Usage

Tags: list, scala, collect

Asked by Adrian Modliszewski

Is it possible to use one call to collect to make two new lists? If not, how can I do this using partition?

Answer by Kevin Wright

collect (defined on TraversableLike and available in all subclasses) works with a collection and a PartialFunction. It also just so happens that a bunch of case clauses defined inside braces are a partial function (see section 8.5 of the Scala Language Specification [warning - PDF]).

As in exception handling:

try {
  ... do something risky ...
} catch {
  //The contents of this catch block are a partial function
  case e: IOException => ...
  case e: OtherException => ...
}

It's a handy way to define a function that will only accept some values of a given type.
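
To make the "only accepts some values" point concrete, here is a minimal sketch (not part of the original answer) that gives a block of case clauses an explicit PartialFunction type and probes it with isDefinedAt and lift. The object name PartialFunctionSketch and the value names are made up for illustration.

```scala
object PartialFunctionSketch {
  // A block of case clauses, given an explicit PartialFunction type.
  val describe: PartialFunction[Any, String] = {
    case s: String => "String:" + s
    case i: Int    => "Int:" + i
  }

  // isDefinedAt reports whether some case clause matches the argument.
  val definedForString: Boolean = describe.isDefinedAt("a")
  val definedForDouble: Boolean = describe.isDefinedAt(42.0)

  // lift turns the partial function into a total one that returns an Option.
  val lifted: Option[String] = describe.lift(42.0)

  // Applying it to a value it is defined for works like a normal function.
  val applied: String = describe(1)
}
```

collect performs essentially this check for each element and keeps only the results for which the function is defined.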

Consider using it on a list of mixed values:

val mixedList = List("a", 1, 2, "b", 19, 42.0) //this is a List[Any]
val results = mixedList collect {
  case s: String => "String:" + s
  case i: Int => "Int:" + i.toString
}

The argument to the collect method is a PartialFunction[Any,String]: PartialFunction because it's not defined for all possible inputs of type Any (that being the element type of the List), and String because that's what all the clauses return.

If you tried to use map instead of collect, the double value at the end of mixedList would cause a MatchError. Using collect just discards this, as well as any other value for which the PartialFunction is not defined.
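
The difference can be checked directly. The following is a small sketch rather than code from the original answer: it wraps the map call in scala.util.Try so the MatchError can be inspected instead of crashing the program. The object name MapVersusCollect is hypothetical.

```scala
import scala.util.Try

object MapVersusCollect {
  val mixedList = List("a", 1, 2, "b", 19, 42.0) // a List[Any]

  // map applies the case clauses to every element, so 42.0 has no matching
  // clause and a MatchError is thrown; Try captures it as a Failure.
  val mapped: Try[List[String]] = Try(mixedList map {
    case s: String => "String:" + s
    case i: Int    => "Int:" + i
  })

  // collect first checks whether the partial function is defined for the
  // element, and silently skips 42.0.
  val collected: List[String] = mixedList collect {
    case s: String => "String:" + s
    case i: Int    => "Int:" + i
  }
}
```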

One possible use would be to apply different logic to elements of the list:

var strings = List.empty[String]
var ints = List.empty[Int]
mixedList collect {
  case s: String => strings :+= s
  case i: Int => ints :+= i
}

Although this is just an example, using mutable variables like this is considered by many to be a war crime - So please don't do it!

A much better solution is to use collect twice:

val strings = mixedList collect { case s: String => s }
val ints = mixedList collect { case i: Int => i }

Or if you know for certain that the list only contains two types of values, you can use partition, which splits a collection into two collections depending on whether or not each element matches some predicate:

//if the list only contains Strings and Ints:
val (strings, ints) = mixedList partition { case s: String => true; case _ => false }

The catch here is that both strings and ints are of type List[Any], though you can easily coerce them back to something more typesafe (perhaps by using collect...)
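
One way to do that coercion, sketched under the assumption that the list really holds only Strings and Ints, is to run collect over each half of the partition. The names PartitionThenCollect, anyStrings and anyInts are illustrative and do not appear in the original answer.

```scala
object PartitionThenCollect {
  val mixedList: List[Any] = List("a", 1, 2, "b", 19)

  // partition keeps the element type of the input, so both halves are List[Any].
  val (anyStrings, anyInts) = mixedList partition {
    case _: String => true
    case _         => false
  }

  // collect with a type pattern narrows each half to a precise element type.
  val strings: List[String] = anyStrings collect { case s: String => s }
  val ints: List[Int]       = anyInts collect { case i: Int => i }
}
```

This costs three traversals in total, so when only the typed lists are needed, calling collect twice on the original list is simpler.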

If you already have a type-safe collection and want to split on some other property of the elements, then things are a bit easier for you:

val intList = List(2,7,9,1,6,5,8,2,4,6,2,9,8)
val (big,small) = intList partition (_ > 5)
//big and small are both now List[Int]s

Hope that sums up how the two methods can help you out here!

Answer by Adam Rabung

Not sure how to do it with collect without using mutable lists, but partition can use pattern matching as well (just a little more verbose):

List("a", 1, 2, "b", 19).partition { 
  case s:String => true
  case _ => false 
}

Answer by Rex Kerr

The signature of the normally-used collect on, say, Seq, is

collect[B](pf: PartialFunction[A,B]): Seq[B]

which is really a particular case of

collect[B, That](pf: PartialFunction[A,B])(
  implicit bf: CanBuildFrom[Seq[A], B, That]
): That

So if you use it in default mode, the answer is no, assuredly not: you get exactly one sequence out from it. If you follow CanBuildFrom through Builder, you see that it would be possible to make That actually be two sequences, but it would have no way of being told which sequence an item should go into, since the partial function can only say "yes, I belong" or "no, I do not belong".
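
Since a single collect can only yield one sequence, a one-pass split has to come from a different operation. As a hedged illustration (this is a swapped-in technique, not something the answer proposes), a foldRight can route each element into one of two typed lists, because here the function itself decides where the element goes. The object name SinglePassSplit is hypothetical.

```scala
object SinglePassSplit {
  val mixedList: List[Any] = List("a", 1, 2, "b", 19, 42.0)

  // The accumulator is a pair of typed lists; each case clause decides which
  // list the current element is prepended to. Folding from the right keeps
  // the original element order without a final reverse.
  val (strings, ints) =
    mixedList.foldRight((List.empty[String], List.empty[Int])) {
      case (s: String, (strs, nums)) => (s :: strs, nums)
      case (i: Int, (strs, nums))    => (strs, i :: nums)
      case (_, acc)                  => acc // anything else is dropped, as collect would do
    }
}
```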

So what do you do if you want to have multiple conditions that result in your list being split into a bunch of different pieces? One way is to create an indicator function A => Int, where your A is mapped into a numbered class, and then use groupBy. For example:

def optionClass(a: Any) = a match {
  case None => 0
  case Some(x) => 1
  case _ => 2
}
scala> List(None,3,Some(2),5,None).groupBy(optionClass)
res11: scala.collection.immutable.Map[Int,List[Any]] = 
  Map((2,List(3, 5)), (1,List(Some(2))), (0,List(None, None)))

Now you can look up your sub-lists by class (0, 1, and 2 in this case). Unfortunately, if you want to ignore some inputs, you still have to put them in a class (e.g. you probably don't care about the multiple copies of None in this case).
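
A short sketch of the lookup step, reusing optionClass from above: getOrElse guards against a class that happens to be empty (groupBy creates no entry for it), and removing a key afterwards is one way to discard a class you do not care about. The names GroupByLookup, somes, others and wanted are illustrative.

```scala
object GroupByLookup {
  def optionClass(a: Any): Int = a match {
    case None    => 0
    case Some(_) => 1
    case _       => 2
  }

  val groups: Map[Int, List[Any]] =
    List(None, 3, Some(2), 5, None).groupBy(optionClass)

  // A class with no members has no key in the map, so supply a default.
  val somes: List[Any]  = groups.getOrElse(1, Nil)
  val others: List[Any] = groups.getOrElse(2, Nil)

  // Drop the class that was only there to absorb unwanted inputs.
  val wanted: Map[Int, List[Any]] = groups - 0
}
```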

Answer by Alex Cruise

I use this. One nice thing about it is it combines partitioning and mapping in one iteration. One drawback is that it does allocate a bunch of temporary objects (the Either.Left and Either.Right instances).

import scala.annotation.tailrec

/**
 * Splits the input list into a list of B's and a list of C's, depending on which type of value the mapper function returns.
 */
def mapSplit[A,B,C](in: List[A])(mapper: (A) => Either[B,C]): (List[B], List[C]) = {
  @tailrec
  def mapSplit0(in: List[A], bs: List[B], cs: List[C]): (List[B], List[C]) = {
    in match {
      case a :: as =>
        mapper(a) match {
          case Left(b)  => mapSplit0(as, b :: bs, cs     )
          case Right(c) => mapSplit0(as, bs,      c :: cs)
        }
      case Nil =>
        (bs.reverse, cs.reverse)
    }
  }

  mapSplit0(in, Nil, Nil)
}

val got = mapSplit(List(1,2,3,4,5)) {
  case x if x % 2 == 0 => Left(x)
  case y               => Right(y.toString * y)
}

assertEquals((List(2,4),List("1","333","55555")), got)

Answer by Xavier Guihot

Starting with Scala 2.13, most collections provide a partitionMap method, which partitions elements based on a function that returns either Right or Left.

That allows us to pattern match based on the type (which, like collect, gives the partitioned lists specific element types) or on any other pattern:

 val (strings, ints) =
   List("a", 1, 2, "b", 19).partitionMap {
     case s: String => Left(s)
     case x: Int    => Right(x)
   }
 // strings: List[String] = List("a", "b")
 // ints: List[Int] = List(1, 2, 19)
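
The "any other pattern" case does not have to involve types at all. As a minimal sketch (Scala 2.13 or later, with made-up names PartitionMapSketch, raw, invalid and numbers), the function below separates strings that parse as integers from those that do not, using the standard toIntOption and Option.toRight methods.

```scala
object PartitionMapSketch {
  val raw: List[String] = List("1", "x", "42", "", "7")

  // toIntOption yields Some(n) for a valid integer and None otherwise;
  // toRight(s) turns Some(n) into Right(n) and None into Left(s).
  val (invalid, numbers) = raw.partitionMap { s =>
    s.toIntOption.toRight(s)
  }
}
```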

Answer by LP_

I could not find a satisfying solution to this basic problem here. I don't need a lecture on collect and don't care if this is someone's homework. Also, I don't want something that works only for List.

So here is my stab at it. Efficient and compatible with any TraversableOnce, even strings:

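// Requires Scala 2.12 or earlier: CanBuildFrom was removed in Scala 2.13.
import scala.collection.generic.CanBuildFrom

implicit class TraversableOnceHelper[A,Repr](private val repr: Repr)(implicit isTrav: Repr => TraversableOnce[A]) {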

  def collectPartition[B,Left](pf: PartialFunction[A, B])
  (implicit bfLeft: CanBuildFrom[Repr, B, Left], bfRight: CanBuildFrom[Repr, A, Repr]): (Left, Repr) = {
    val left = bfLeft(repr)
    val right = bfRight(repr)
    val it = repr.toIterator
    while (it.hasNext) {
      val next = it.next
      if (!pf.runWith(left += _)(next)) right += next
    }
    left.result -> right.result
  }

  def mapSplit[B,C,Left,Right](f: A => Either[B,C])
  (implicit bfLeft: CanBuildFrom[Repr, B, Left], bfRight: CanBuildFrom[Repr, C, Right]): (Left, Right) = {
    val left = bfLeft(repr)
    val right = bfRight(repr)
    val it = repr.toIterator
    while (it.hasNext) {
      f(it.next) match {
        case Left(next) => left += next
        case Right(next) => right += next
      }
    }
    left.result -> right.result
  }
}

Example usages:

val (syms, ints) =
  Seq(Left('ok), Right(42), Right(666), Left('ko), Right(-1)) mapSplit identity

val ctx = Map('a -> 1, 'b -> 2) map {case(n,v) => n->(n,v)}
val (bound, unbound) = Vector('a, 'a, 'c, 'b) collectPartition ctx
println(bound: Vector[(Symbol, Int)], unbound: Vector[Symbol])

Answer by m3th0dman

Something like this could help

def partitionMap[IN, A, B](seq: Seq[IN])(function: IN => Either[A, B]): (Seq[A], Seq[B]) = {
  val (eitherLeft, eitherRight) = seq.map(function).partition(_.isLeft)
  eitherLeft.map(_.left.get) -> eitherRight.map(_.right.get)
}

To call it:

val seq: Seq[Any] = Seq(1, "A", 2, "B")
val (ints, strings) = CollectionUtils.partitionMap(seq) {
  case int: Int    => Left(int)
  case str: String => Right(str)
}
ints shouldBe Seq(1, 2)
strings shouldBe Seq("A", "B")

Advantage: a simple API, similar to the partitionMap method provided by Scala 2.13.

Disadvantage: the collection is traversed more than once, and there is no support for CanBuildFrom.
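
If the repeated traversal matters, the same signature can be kept while visiting each element once. This is only a sketch of one possible variant, not the answer author's code: it feeds two builders obtained from Seq.newBuilder inside a single foreach. The enclosing object SinglePassPartitionMap is a hypothetical name.

```scala
object SinglePassPartitionMap {
  def partitionMap[IN, A, B](seq: Seq[IN])(function: IN => Either[A, B]): (Seq[A], Seq[B]) = {
    val lefts  = Seq.newBuilder[A]
    val rights = Seq.newBuilder[B]
    // One traversal: apply the function once per element and route the result.
    seq.foreach { in =>
      function(in) match {
        case Left(a)  => lefts += a
        case Right(b) => rights += b
      }
    }
    (lefts.result(), rights.result())
  }

  val (ints, strings) = partitionMap(Seq[Any](1, "A", 2, "B")) {
    case int: Int    => Left(int)
    case str: String => Right(str)
  }
}
```

The result type is still fixed to Seq, so the missing CanBuildFrom support noted above is not addressed by this variant.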