Notice: this page is based on a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/3498784/

Time: 2020-10-22 02:21:25  Source: igfitidea

Scala - calculate average of SomeObj.double in a List[SomeObj]

scala

Asked by whaley

I'm on my second evening of Scala, and I'm resisting the urge to write things the way I used to in Java while I try to learn the idioms. In this case I just want to compute an average using things like closures, mapping, and perhaps a list comprehension. Whether or not this is the best way to compute an average, I want to know how to do these things in Scala for learning purposes.

Here's an example: the average method below is left pretty much unimplemented. I've got a couple of other methods that look up the rating an individual user ID gave, using the find method of TraversableLike (I think), but nothing else that is really Scala-specific. Given a List[RatingEvent], where RatingEvent.rating is a Double, how would I compute the average of rating across all elements of that List in a Scala-like manner?

package com.brinksys.liftnex.model

class Movie(val id: Int, val ratingEvents: List[RatingEvent]) {

    // rating is a Double, so the return type is Double rather than Int
    def getRatingByUser(userId: Int): Double =
        getRatingEventByUserId(userId).rating

    // find returns an Option; .get throws if no event matches userId
    def getRatingEventByUserId(userId: Int): RatingEvent = {
        val result = ratingEvents find { e => e.userId == userId }
        result.get
    }

    def average(): Double = {
        /*
         fill in the blanks where an average of all ratingEvent.rating values is expected
        */
        3.8
    }
}

How would a seasoned Scala pro fill in that method and use the features of Scala to make it as concise as possible? I know how I would do it in Java, which is what I want to avoid.

If I were doing it in Python, I assume the most Pythonic way would be:

sum([re.rating for re in ratingEvents]) / len(ratingEvents)

or, if I were forcing myself to use a closure (which is something I at least want to learn in Scala):

reduce(lambda x, y : x + y, [re.rating for re in ratingEvents]) / len(ratingEvents)

It's how to use these kinds of constructs that I want to learn in Scala.

Your suggestions? Any pointers to good tutorials/reference material relevant to this are welcome :D

Answered by Rex Kerr

If you're going to be doing math on things, using List is not always the fastest way to go because List has no idea how long it is, so ratingEvents.length takes time proportional to the length. (Not very much time, granted, but it does have to traverse the whole list to tell.) But if you're mostly manipulating data structures and only occasionally need to compute a sum or whatever, so it's not the time-critical core of your code, then using List is dandy.

Anyway, the canonical way to do it would be with a fold to compute the sum:

(0.0 /: ratingEvents){_ + _.rating} / ratingEvents.length

// Equivalently, though more verbosely:
// ratingEvents.foldLeft(0.0)(_ + _.rating) / ratingEvents.length

or by mapping and then summing (requires Scala 2.8 or later):

ratingEvents.map(_.rating).sum / ratingEvents.length

For more information on maps and folds, see this question on that topic.
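As a minimal sketch, the fold and the map-then-sum approaches could be put side by side like this. The RatingEvent case class here is a hypothetical stand-in (the question never shows the real one), and the object and method names are made up for illustration. The sketch uses foldLeft rather than the symbolic /: form, which, as far as I know, is deprecated from Scala 2.13 onward.

```scala
// Hypothetical stand-in for the RatingEvent class, which the question does not show.
final case class RatingEvent(userId: Int, rating: Double)

object Averages {
  // Fold: thread a running sum through the list, then divide by the length.
  def foldAverage(events: List[RatingEvent]): Double =
    events.foldLeft(0.0)(_ + _.rating) / events.length

  // Map then sum: extract the ratings first, then add them up.
  def mapAverage(events: List[RatingEvent]): Double =
    events.map(_.rating).sum / events.length
}

object AveragesDemo extends App {
  val events = List(RatingEvent(1, 3.0), RatingEvent(2, 4.0), RatingEvent(3, 5.0))
  println(Averages.foldAverage(events)) // 4.0
  println(Averages.mapAverage(events))  // 4.0
}
```

Note that for an empty list both versions divide 0.0 by 0 and return NaN rather than throwing, so callers may want to check for that case.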

Answered by Landei

You might calculate sum and length in one go, but I doubt that this helps except for very long lists. It would look like this:

val (s,l) = ratingEvents.foldLeft((0.0, 0))((t, r)=>(t._1 + r.rating, t._2 + 1)) 
val avg = s / l

I think for this example Rex's solution is much better, but in other use cases the "fold-over-tuple trick" can be essential.
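One case where the tuple fold seems hard to avoid is a source that can only be traversed once, such as an Iterator. The sketch below assumes that scenario; the SinglePass object and its average method are hypothetical names, and returning an Option for empty input is a design choice of this sketch, not something from the answer.

```scala
object SinglePass {
  // One traversal: the accumulator carries (running sum, count so far).
  def average(values: Iterator[Double]): Option[Double] = {
    val (sum, count) = values.foldLeft((0.0, 0)) {
      case ((s, n), v) => (s + v, n + 1)
    }
    // Avoid dividing by zero when the iterator was empty.
    if (count == 0) None else Some(sum / count)
  }
}

object SinglePassDemo extends App {
  println(SinglePass.average(Iterator(3.0, 4.0, 5.0))) // Some(4.0)
  println(SinglePass.average(Iterator.empty))          // None
}
```

Calling sum and then length on an Iterator would not work here, because the first call consumes it.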

Answered by Holger Brandl

Since mean and other descriptive statistics like standard deviation or median are needed in different contexts, you could also use a small reusable implicit helper class to allow for more streamlined chained commands:

  implicit class ImplDoubleVecUtils(values: Seq[Double]) {

    def mean = values.sum / values.length
  }

  val meanRating = ratingEvents.map(_.rating).mean

It even seems to be possible to write this in a generic fashion for all number types.
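One way the generic version might look, sketched with the standard library's Numeric type class. The names GenericStats and MeanOps are hypothetical, and converting the sum to Double is an assumption of this sketch (it keeps integer inputs from truncating, at the cost of precision for very large BigDecimal or Long sums).

```scala
object GenericStats {
  // Works for any element type T that has a Numeric instance (Int, Long, Double, ...).
  implicit class MeanOps[T](private val values: Seq[T])(implicit num: Numeric[T]) {
    // Convert the sum to Double so that integer inputs do not truncate.
    def mean: Double = num.toDouble(values.sum) / values.length
  }
}

object GenericStatsDemo extends App {
  import GenericStats._
  println(List(1, 2).mean)          // 1.5
  println(Seq(3.0, 4.0, 5.0).mean)  // 4.0
}
```

As with the non-generic helper, an empty sequence yields NaN.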

Answered by Artem Vlasenko

I can suggest 2 ways:

def average(x: Array[Double]): Double = x.foldLeft(0.0)(_ + _) / x.length

def average(x: Array[Double]): Double = x.sum / x.length

Both are fine, but the fold version is more general: you are not limited to "+" and can replace it with another operation (- or *, for example).
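To illustrate that point, here is a small sketch of the same foldLeft shape with different operators. The FoldVariants object and its method names are hypothetical; the only thing that changes between them is the operator and, where needed, the start value.

```scala
object FoldVariants {
  // Same shape as the average's numerator: start at 0.0 and add.
  def sum(xs: Array[Double]): Double = xs.foldLeft(0.0)(_ + _)

  // Swapping in * also means swapping the start value to 1.0.
  def product(xs: Array[Double]): Double = xs.foldLeft(1.0)(_ * _)

  // Subtracting each element in turn from 0.0.
  def negatedSum(xs: Array[Double]): Double = xs.foldLeft(0.0)(_ - _)
}

object FoldVariantsDemo extends App {
  val xs = Array(1.0, 2.0, 4.0)
  println(FoldVariants.sum(xs))        // 7.0
  println(FoldVariants.product(xs))    // 8.0
  println(FoldVariants.negatedSum(xs)) // -7.0
}
```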
