scala 在折叠中提前中止
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/12892701/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Abort early in a fold
提问by Heptic
What's the best way to terminate a fold early? As a simplified example, imagine I want to sum up the numbers in an Iterable, but if I encounter something I'm not expecting (say an odd number) I might want to terminate. This is a first approximation
提前终止折叠的最佳方法是什么?作为一个简化的例子,假设我想对 中的数字求和Iterable,但是如果我遇到我不期望的东西(比如奇数),我可能想终止。这是第一个近似值
def sumEvenNumbers(nums: Iterable[Int]): Option[Int] = {
nums.foldLeft (Some(0): Option[Int]) {
case (Some(s), n) if n % 2 == 0 => Some(s + n)
case _ => None
}
}
However, this solution is pretty ugly (as in, if I did a .foreach and a return -- it'd be much cleaner and clearer) and worst of all, it traverses the entire iterable even if it encounters a non-even number.
然而,这个解决方案非常丑陋(例如,如果我做了一个 .foreach 和一个返回——它会更清晰和更清晰)而且最糟糕的是,它遍历整个迭代,即使它遇到一个非偶数.
So what would be the best way to write a fold like this, that terminates early? Should I just go and write this recursively, or is there a more accepted way?
那么写这样一个提前终止的折叠的最好方法是什么?我应该递归地写这个,还是有更可接受的方式?
采纳答案by Rex Kerr
My first choice would usually be to use recursion. It is only moderately less compact, is potentially faster (certainly no slower), and in early termination can make the logic more clear. In this case you need nested defs which is a little awkward:
我的第一选择通常是使用递归。它只是稍微不那么紧凑,可能更快(当然不会更慢),并且在早期终止可以使逻辑更加清晰。在这种情况下,您需要嵌套的 defs,这有点尴尬:
def sumEvenNumbers(nums: Iterable[Int]) = {
def sumEven(it: Iterator[Int], n: Int): Option[Int] = {
if (it.hasNext) {
val x = it.next
if ((x % 2) == 0) sumEven(it, n+x) else None
}
else Some(n)
}
sumEven(nums.iterator, 0)
}
My second choice would be to use return, as it keeps everything else intact and you only need to wrap the fold in a defso you have something to return from--in this case, you already have a method, so:
我的第二个选择是使用return,因为它保持其他一切完好无损,您只需要将折叠包裹在 a 中,def这样您就可以返回一些东西——在这种情况下,您已经有了一个方法,所以:
def sumEvenNumbers(nums: Iterable[Int]): Option[Int] = {
Some(nums.foldLeft(0){ (n,x) =>
if ((n % 2) != 0) return None
n+x
})
}
which in this particular case is a lot more compact than recursion (though we got especially unlucky with recursion since we had to do an iterable/iterator transformation). The jumpy control flow is something to avoid when all else is equal, but here it's not. No harm in using it in cases where it's valuable.
在这种特殊情况下,它比递归更紧凑(尽管我们对递归特别不走运,因为我们必须进行可迭代/迭代器转换)。当所有其他条件都相同时,应该避免跳跃式控制流,但这里并非如此。在它有价值的情况下使用它没有坏处。
If I was doing this often and wanted it within the middle of a method somewhere (so I couldn't just use return), I would probably use exception-handling to generate non-local control flow. That is, after all, what it is good at, and error handling is not the only time it's useful. The only trick is to avoid generating a stack trace (which is really slow), and that's easy because the trait NoStackTraceand its child trait ControlThrowablealready do that for you. Scala already uses this internally (in fact, that's how it implements the return from inside the fold!). Let's make our own (can't be nested, though one could fix that):
如果我经常这样做并希望它在某个方法的中间某个地方(所以我不能只使用 return),我可能会使用异常处理来生成非本地控制流。毕竟,这就是它擅长的领域,而且错误处理并不是它唯一有用的地方。唯一的技巧是避免生成堆栈跟踪(这真的很慢),这很容易,因为特征NoStackTrace及其子特征ControlThrowable已经为您做到了。Scala 已经在内部使用了它(实际上,这就是它实现折叠内部返回的方式!)。让我们自己制作(不能嵌套,但可以解决这个问题):
import scala.util.control.ControlThrowable
case class Returned[A](value: A) extends ControlThrowable {}
def shortcut[A](a: => A) = try { a } catch { case Returned(v) => v }
def sumEvenNumbers(nums: Iterable[Int]) = shortcut{
Option(nums.foldLeft(0){ (n,x) =>
if ((x % 2) != 0) throw Returned(None)
n+x
})
}
Here of course using returnis better, but note that you could put shortcutanywhere, not just wrapping an entire method.
这里当然使用return更好,但请注意,您可以放在shortcut任何地方,而不仅仅是包装整个方法。
Next in line for me would be to re-implement fold (either myself or to find a library that does it) so that it could signal early termination. The two natural ways of doing this are to not propagate the value but an Optioncontaining the value, where Nonesignifies termination; or to use a second indicator function that signals completion. The Scalaz lazy fold shown by Kim Stebel already covers the first case, so I'll show the second (with a mutable implementation):
对我来说,接下来是重新实现 fold (无论是我自己还是找到一个这样做的库),以便它可以发出提前终止的信号。这样做的两种自然方式是不传播值而是Option包含值,其中None表示终止;或使用指示完成的第二个指示符函数。Kim Stebel 展示的 Scalaz 懒惰折叠已经涵盖了第一种情况,所以我将展示第二种情况(使用可变实现):
def foldOrFail[A,B](it: Iterable[A])(zero: B)(fail: A => Boolean)(f: (B,A) => B): Option[B] = {
val ii = it.iterator
var b = zero
while (ii.hasNext) {
val x = ii.next
if (fail(x)) return None
b = f(b,x)
}
Some(b)
}
def sumEvenNumbers(nums: Iterable[Int]) = foldOrFail(nums)(0)(_ % 2 != 0)(_ + _)
(Whether you implement the termination by recursion, return, laziness, etc. is up to you.)
(是否通过递归、返回、懒惰等方式实现终止由你决定。)
I think that covers the main reasonable variants; there are some other options also, but I'm not sure why one would use them in this case. (Iteratoritself would work well if it had a findOrPrevious, but it doesn't, and the extra work it takes to do that by hand makes it a silly option to use here.)
我认为这涵盖了主要的合理变体;还有一些其他选择,但我不确定为什么在这种情况下会使用它们。(Iterator如果它有一个findOrPrevious,它本身会工作得很好,但它没有,并且手动执行此操作所需的额外工作使其在这里使用成为一个愚蠢的选择。)
回答by Dylan
The scenario you describe (exit upon some unwanted condition) seems like a good use case for the takeWhilemethod. It is essentially filter, but should end upon encountering an element that doesn't meet the condition.
您描述的场景(在一些不需要的情况下退出)似乎是该takeWhile方法的一个很好的用例。它本质上是filter,但应该在遇到不满足条件的元素时结束。
For example:
例如:
val list = List(2,4,6,8,6,4,2,5,3,2)
list.takeWhile(_ % 2 == 0) //result is List(2,4,6,8,6,4,2)
This will work just fine for Iterators/Iterables too. The solution I suggest for your "sum of even numbers, but break on odd" is:
这也适用于Iterators/ Iterables。我为您的“偶数总和,但奇数中断”建议的解决方案是:
list.iterator.takeWhile(_ % 2 == 0).foldLeft(...)
And just to prove that it's not wasting your time once it hits an odd number...
并且只是为了证明它一旦达到奇数就不会浪费您的时间......
scala> val list = List(2,4,5,6,8)
list: List[Int] = List(2, 4, 5, 6, 8)
scala> def condition(i: Int) = {
| println("processing " + i)
| i % 2 == 0
| }
condition: (i: Int)Boolean
scala> list.iterator.takeWhile(condition _).sum
processing 2
processing 4
processing 5
res4: Int = 6
回答by Kim Stebel
You can do what you want in a functional style using the lazy version of foldRight in scalaz. For a more in depth explanation, see this blog post. While this solution uses a Stream, you can convert an Iterableinto a Streamefficiently with iterable.toStream.
您可以使用 scalaz 中 foldRight 的惰性版本以函数式风格做您想做的事。有关更深入的解释,请参阅此博客文章。虽然此解决方案使用Stream,但您可以Iterable使用Stream有效地将 转换为iterable.toStream。
import scalaz._
import Scalaz._
val str = Stream(2,1,2,2,2,2,2,2,2)
var i = 0 //only here for testing
val r = str.foldr(Some(0):Option[Int])((n,s) => {
println(i)
i+=1
if (n % 2 == 0) s.map(n+) else None
})
This only prints
这仅打印
0
1
which clearly shows that the anonymous function is only called twice (i.e. until it encounters the odd number). That is due to the definition of foldr, whose signature (in case of Stream) is def foldr[B](b: B)(f: (Int, => B) => B)(implicit r: scalaz.Foldable[Stream]): B. Note that the anonymous function takes a by name parameter as its second argument, so it need no be evaluated.
这清楚地表明匿名函数只被调用两次(即直到它遇到奇数)。这是由于 foldr 的定义,其签名(在 的情况下Stream)是def foldr[B](b: B)(f: (Int, => B) => B)(implicit r: scalaz.Foldable[Stream]): B。请注意,匿名函数将按名称参数作为其第二个参数,因此不需要对其进行评估。
Btw, you can still write this with the OP's pattern matching solution, but I find if/else and map more elegant.
顺便说一句,您仍然可以使用 OP 的模式匹配解决方案来编写它,但我发现 if/else 和 map 更优雅。
回答by missingfaktor
Well, Scala does allow non local returns. There are differing opinions on whether or not this is a good style.
好吧,Scala 确实允许非本地返回。对于这种风格是否好,众说纷纭。
scala> def sumEvenNumbers(nums: Iterable[Int]): Option[Int] = {
| nums.foldLeft (Some(0): Option[Int]) {
| case (None, _) => return None
| case (Some(s), n) if n % 2 == 0 => Some(s + n)
| case (Some(_), _) => None
| }
| }
sumEvenNumbers: (nums: Iterable[Int])Option[Int]
scala> sumEvenNumbers(2 to 10)
res8: Option[Int] = None
scala> sumEvenNumbers(2 to 10 by 2)
res9: Option[Int] = Some(30)
EDIT:
编辑:
In this particular case, as @Arjan suggested, you can also do:
在这种特殊情况下,正如@Arjan 建议的那样,您还可以执行以下操作:
def sumEvenNumbers(nums: Iterable[Int]): Option[Int] = {
nums.foldLeft (Some(0): Option[Int]) {
case (Some(s), n) if n % 2 == 0 => Some(s + n)
case _ => return None
}
}
回答by Didac Montero
Catshas a method called foldMwhich does short-circuiting (for Vector, List, Stream, ...).
Cats有一种称为foldM的方法可以进行短路(对于Vector, List, Stream, ...)。
It works as follows:
它的工作原理如下:
def sumEvenNumbers(nums: Stream[Int]): Option[Long] = {
import cats.implicits._
nums.foldM(0L) {
case (acc, c) if c % 2 == 0 => Some(acc + c)
case _ => None
}
}
As soon as one of the elements on the collection is not even, it returns.
只要集合中的元素之一不偶数,它就会返回。
回答by rozky
You can use foldMfrom cats lib (as suggested by @Didac) but I suggest to use Eitherinstead of Optionif you want to get actual sum out.
您可以使用foldM来自cats lib(如@Didac 的建议),但如果您想获得实际总和,我建议使用Either而不是使用Option。
bifoldMapis used to extract the result from Either.
bifoldMap用于从 中提取结果Either。
import cats.implicits._
def sumEven(nums: Stream[Int]): Either[Int, Int] = {
nums.foldM(0) {
case (acc, n) if n % 2 == 0 => Either.right(acc + n)
case (acc, n) => {
println(s"Stopping on number: $n")
Either.left(acc)
}
}
}
examples:
例子:
println("Result: " + sumEven(Stream(2, 2, 3, 11)).bifoldMap(identity, identity))
> Stopping on number: 3
> Result: 4
println("Result: " + sumEven(Stream(2, 7, 2, 3)).bifoldMap(identity, identity))
> Stopping on number: 7
> Result: 2
回答by seagull1089
You could try using a temporary var and using takeWhile. Here is a version.
您可以尝试使用临时变量并使用 takeWhile。这是一个版本。
var continue = true
// sample stream of 2's and then a stream of 3's.
val evenSum = (Stream.fill(10)(2) ++ Stream.fill(10)(3)).takeWhile(_ => continue)
.foldLeft(Option[Int](0)){
case (result,i) if i%2 != 0 =>
continue = false;
// return whatever is appropriate either the accumulated sum or None.
result
case (optionSum,i) => optionSum.map( _ + i)
}
The evenSumshould be Some(20)in this case.
本evenSum应Some(20)在这种情况下。
回答by Core
@Rex Kerr your answer helped me, but I needed to tweak it to use Either
@Rex Kerr 你的回答对我有帮助,但我需要调整它才能使用
def foldOrFail[A,B,C,D](map: B => Either[D, C])(merge: (A, C) => A)(initial: A)(it: Iterable[B]): Either[D, A] = {
val ii= it.iterator
var b= initial
while (ii.hasNext) {
val x= ii.next
map(x) match {
case Left(error) => return Left(error)
case Right(d) => b= merge(b, d)
}
}
Right(b)
}
回答by ozma
Just for an "academic" reasons (:
只是出于“学术”原因(:
var headers = Source.fromFile(file).getLines().next().split(",")
var closeHeaderIdx = headers.takeWhile { s => !"Close".equals(s) }.foldLeft(0)((i, S) => i+1)
Takes twice then it should but it is a nice one liner. If "Close" not found it will return
需要两次,但它应该是一个很好的单衬。如果未找到“关闭”,它将返回
headers.size
Another (better) is this one:
另一个(更好)是这个:
var headers = Source.fromFile(file).getLines().next().split(",").toList
var closeHeaderIdx = headers.indexOf("Close")
回答by Arjan
A more beutiful solution would be using span:
一个更漂亮的解决方案是使用 span:
val (l, r) = numbers.span(_ % 2 == 0)
if(r.isEmpty) Some(l.sum)
else None
... but it traverses the list two times if all the numbers are even
...但如果所有数字都是偶数,它会遍历列表两次

