Scala: How to merge a collection of Maps

Question

Asked by Jeff

I have a List of Map[String, Double], and I'd like to merge their contents into a single Map[String, Double]. How should I do this in an idiomatic way? I imagine that I should be able to do this with a fold. Something like:

val newMap = Map[String, Double]() /: listOfMaps { (accumulator, m) => ... }

Furthermore, I'd like to handle key collisions in a generic way. That is, if I add a key to the map that already exists, I should be able to specify a function that returns a Double (in this case) and takes the existing value for that key, plus the value I'm trying to add. If the key does not yet exist in the map, then just add it and its value unaltered.

In my specific case I'd like to build a single Map[String, Double] such that if the map already contains a key, then the Double will be added to the existing map value.

I'm working with mutable maps in my specific code, but I'm interested in more generic solutions, if possible.

Answer 1

Accepted answer by Walter Chang

How about this one:

def mergeMap[A, B](ms: List[Map[A, B]])(f: (B, B) => B): Map[A, B] =
  (Map[A, B]() /: (for (m <- ms; kv <- m) yield kv)) { (a, kv) =>
    a + (if (a.contains(kv._1)) kv._1 -> f(a(kv._1), kv._2) else kv)
  }

val ms = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
val mm = mergeMap(ms)((v1, v2) => v1 + v2)

println(mm) // prints Map(hello -> 5.5, world -> 2.2, goodbye -> 3.3)

And it works in both 2.7.5 and 2.8.0.

Answer 2

Answered by Daniel C. Sobral

Well, you could do:

mapList reduce (_ ++ _)

except for the special requirement for collision.
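
A small sketch of why plain ++ is not enough here, assuming Scala 2.13 or later. The object name PlusPlusSketch and the sample values are made up for illustration:

```scala
object PlusPlusSketch {
  // Illustration values only (not the data from the question)
  val mapList = List(
    Map("hello" -> 1.0, "world" -> 2.0),
    Map("goodbye" -> 3.0, "hello" -> 4.0)
  )

  // ++ keeps the right-hand value when a key appears on both sides,
  // so "hello" ends up as 4.0 rather than 1.0 + 4.0
  val merged: Map[String, Double] = mapList.reduce(_ ++ _)
}
```

The colliding key is silently overwritten, which is why a custom combining step is needed.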

Since you do have that special requirement, perhaps the best would be doing something like this (2.8):

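// Typed for Map[Symbol, Double] to match the CombiningMap class below
def combine(m1: Map[Symbol, Double], m2: Map[Symbol, Double]): Map[Symbol, Double] = {
  val k1 = Set(m1.keysIterator.toList: _*)
  val k2 = Set(m2.keysIterator.toList: _*)
  val intersection = k1 & k2

  // Keys present in both maps: add the two values
  val r1 = for(key <- intersection) yield (key -> (m1(key) + m2(key)))
  // Keys present in only one map: keep them unchanged
  val r2 = m1.filterKeys(!intersection.contains(_)) ++ m2.filterKeys(!intersection.contains(_))
  r2 ++ r1
}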

You can then add this method to the map class through the Pimp My Library pattern, and use it in the original example instead of "++":

class CombiningMap(m1: Map[Symbol, Double]) {
  def combine(m2: Map[Symbol, Double]) = {
    val k1 = Set(m1.keysIterator.toList: _*)
    val k2 = Set(m2.keysIterator.toList: _*)
    val intersection = k1 & k2
    val r1 = for(key <- intersection) yield (key -> (m1(key) + m2(key)))
    val r2 = m1.filterKeys(!intersection.contains(_)) ++ m2.filterKeys(!intersection.contains(_))
    r2 ++ r1
  }
}

// Then use this:
implicit def toCombining(m: Map[Symbol, Double]) = new CombiningMap(m)

// And finish with:
mapList reduce (_ combine _)

While this was written in 2.8 (so keysIterator becomes keys for 2.7, filterKeys might need to be written in terms of filter and map, & becomes **, and so on), it shouldn't be too different in 2.7.

Answer 3

Answered by Electric Coffee

I'm surprised no one's come up with this solution yet:

myListOfMaps.flatten.toMap

Does exactly what you need:

Merges the list to a single Map
Weeds out any duplicate keys

Example:

scala> List(Map('a -> 1), Map('b -> 2), Map('c -> 3), Map('a -> 4, 'b -> 5)).flatten.toMap
res7: scala.collection.immutable.Map[Symbol,Int] = Map('a -> 4, 'b -> 5, 'c -> 3)

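flatten turns the list of maps into a flat list of tuples, and toMap turns the list of tuples into a map with all the duplicate keys removed. For a duplicated key, the value from the last map wins (see 'a -> 4 in the example above), so colliding values are replaced rather than combined.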

Answer 4

Answered by huynhjl

I'm reading this question quickly, so I'm not sure if I'm missing something (like it has to work for 2.7.x or without scalaz):

import scalaz._
import Scalaz._
val ms = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
ms.reduceLeft(_ |+| _)
// returns Map(goodbye -> 3.3, hello -> 5.5, world -> 2.2)

You can change the monoid definition for Double and get another way to accumulate the values, here getting the max:

implicit val dbsg: Semigroup[Double] = semigroup((a,b) => math.max(a,b))
ms.reduceLeft(_ |+| _)
// returns Map(goodbye -> 3.3, hello -> 4.4, world -> 2.2)
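
For readers without scalaz, the idea behind |+| on maps can be approximated with the standard library alone. This is a rough sketch for Scala 2.13 or later, not scalaz's implementation: the Semigroup trait, mergeWith, and the sample values are all hypothetical names introduced here, and the instance is passed explicitly rather than resolved implicitly as scalaz does.

```scala
object SemigroupSketch {
  // Hypothetical minimal type class; scalaz's real Semigroup is richer
  trait Semigroup[A] { def append(x: A, y: A): A }

  // Merge two maps; a key found in both has its values combined by sg
  def mergeWith[K, V](m1: Map[K, V], m2: Map[K, V])(sg: Semigroup[V]): Map[K, V] =
    (m1.keySet ++ m2.keySet).iterator.map { k =>
      // Every key in the union has one or two values, so reduce is safe
      k -> (m1.get(k).toList ++ m2.get(k).toList).reduce((x, y) => sg.append(x, y))
    }.toMap

  val sum: Semigroup[Double] = new Semigroup[Double] {
    def append(x: Double, y: Double): Double = x + y
  }
  val max: Semigroup[Double] = new Semigroup[Double] {
    def append(x: Double, y: Double): Double = math.max(x, y)
  }

  // Illustration values only
  val ms = List(Map("hello" -> 1.0, "world" -> 2.0), Map("goodbye" -> 3.0, "hello" -> 4.0))

  val summed = ms.reduceLeft((a, b) => mergeWith(a, b)(sum)) // "hello" -> 5.0
  val maxed  = ms.reduceLeft((a, b) => mergeWith(a, b)(max)) // "hello" -> 4.0
}
```

Swapping the instance changes how collisions are resolved without touching the merge logic, which is the point of the semigroup approach.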

Answer 5

Answered by Jeff

Interesting, noodling around with this a bit, I got the following (on 2.7.5):

General Maps:

def mergeMaps[A,B](collisionFunc: (B,B) => B)(listOfMaps: Seq[scala.collection.Map[A,B]]): Map[A, B] = {
  listOfMaps.foldLeft(Map[A, B]()) { (m, s) =>
    // m ++ keeps keys already in the accumulator that s does not contain;
    // without it, each step would return only the entries of s
    m ++ Map(
      s.projection.map { pair =>
        if (m contains pair._1)
          (pair._1, collisionFunc(m(pair._1), pair._2))
        else
          pair
      }.force.toList:_*)
  }
}

But man, that is hideous with the projection and forcing and toList and whatnot. Separate question: what's a better way to deal with that within the fold?
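
One possible answer to the "within the fold" question, as a sketch for Scala 2.13 or later rather than 2.7.5: fold each incoming map into the accumulator entry by entry, which avoids projection, force, and toList. FoldSketch and the sample data are illustrative names only.

```scala
object FoldSketch {
  // Nested folds over immutable maps: the outer fold walks the list,
  // the inner fold adds one map's entries to the accumulator
  def mergeMaps[A, B](collisionFunc: (B, B) => B)(listOfMaps: Seq[Map[A, B]]): Map[A, B] =
    listOfMaps.foldLeft(Map.empty[A, B]) { (acc, m) =>
      m.foldLeft(acc) { case (inner, (k, v)) =>
        // Existing key: combine old and new; new key: store the value as-is
        val merged = inner.get(k).map(collisionFunc(_, v)).getOrElse(v)
        inner.updated(k, merged)
      }
    }

  // Illustration values only
  val result: Map[String, Double] =
    mergeMaps[String, Double](_ + _)(Seq(Map("a" -> 1.0, "b" -> 2.0), Map("a" -> 3.0)))
}
```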

For mutable Maps, which is what I was dealing with in my code, and with a less general solution, I got this:

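import scala.collection.mutable

def mergeMaps[A,B](collisionFunc: (B,B) => B)(listOfMaps: List[mutable.Map[A,B]]): mutable.Map[A, B] = {
  listOfMaps.foldLeft(mutable.Map[A,B]()) { (m, s) =>
    for (k <- s.keys) {
      if (m contains k)
        m(k) = collisionFunc(m(k), s(k))  // key collision: combine old and new values
      else
        m(k) = s(k)                       // new key: copy the value unchanged
    }
    m  // the same mutable accumulator is returned at every step
  }
}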

That seems a little bit cleaner, but will only work with mutable Maps as it's written. Interestingly, I first tried the above (before I asked the question) using /: instead of foldLeft, but I was getting type errors. I thought /: and foldLeft were basically equivalent, but the compiler kept complaining that I needed explicit types for (m, s). What's up with that?

Answer 6

Answered by Xavier Guihot

Starting with Scala 2.13, another solution that handles duplicate keys and relies only on the standard library consists of merging the Maps as sequences (flatten) before applying the new groupMapReduce operator, which (as its name suggests) is equivalent to a groupBy followed by a mapping step and a reduce step over the grouped values:

List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
  .flatten
  .groupMapReduce(_._1)(_._2)(_ + _)
// Map("world" -> 2.2, "goodbye" -> 3.3, "hello" -> 5.5)

This:

flattens (concatenates) the maps as a sequence of tuples (List(("hello", 1.1), ("world", 2.2), ("goodbye", 3.3), ("hello", 4.4))), which keeps all key/values (even duplicate keys)
groups elements based on their first tuple part (_._1) (group part of groupMapReduce)
maps grouped values to their second tuple part (_._2) (map part of groupMapReduce)
reduces mapped grouped values (_ + _) by taking their sum (but it can be any reduce: (T, T) => T function) (reduce part of groupMapReduce)

The groupMapReduce step can be seen as a one-pass equivalent of:

list.groupBy(_._1).mapValues(_.map(_._2).reduce(_ + _))
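
To check that equivalence on Scala 2.13 or later, here is a small sketch. Note that it swaps in .view.mapValues(...).toMap for the plain mapValues call, since Map#mapValues is deprecated in 2.13 and returns a lazy view there. GroupMapReduceSketch and the sample values are made up for illustration.

```scala
object GroupMapReduceSketch {
  // Flattened key/value pairs, duplicates included (illustration values only)
  val list: List[(String, Double)] =
    List(Map("hello" -> 1.0, "world" -> 2.0), Map("goodbye" -> 3.0, "hello" -> 4.0)).flatten

  // Single pass: group by key, keep the value, sum values per key
  val onePass: Map[String, Double] =
    list.groupMapReduce(_._1)(_._2)(_ + _)

  // Two steps: build the groups first, then map and reduce each group
  val twoStep: Map[String, Double] =
    list.groupBy(_._1).view.mapValues(_.map(_._2).reduce(_ + _)).toMap
}
```

Both values should hold the same map; the one-pass form just skips building the intermediate groups.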

Answer 7

Answered by Nimrod007

I wrote a blog post about this; check it out:

http://www.nimrodstech.com/scala-map-merge/

Basically, using the scalaz Semigroup you can achieve this pretty easily.

It would look something like:

  import scalaz.Scalaz._
  listOfMaps reduce(_ |+| _)

Answer 8

Answered by bernstein

A one-liner helper function whose usage reads almost as cleanly as using scalaz:

def mergeMaps[K,V](m1: Map[K,V], m2: Map[K,V])(f: (V,V) => V): Map[K,V] =
    (m1 -- m2.keySet) ++ (m2 -- m1.keySet) ++ (for (k <- m1.keySet & m2.keySet) yield { k -> f(m1(k), m2(k)) })

val ms = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
ms.reduceLeft(mergeMaps(_,_)(_ + _))
// returns Map(goodbye -> 3.3, hello -> 5.5, world -> 2.2)

For ultimate readability, wrap it in an implicit custom type:

class MyMap[K,V](m1: Map[K,V]) {
    def merge(m2: Map[K,V])(f: (V,V) => V) =
    (m1 -- m2.keySet) ++ (m2 -- m1.keySet) ++ (for (k <- m1.keySet & m2.keySet) yield { k -> f(m1(k), m2(k)) })
}
implicit def toMyMap[K,V](m: Map[K,V]) = new MyMap(m)

val ms = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
ms reduceLeft { _.merge(_)(_ + _) }
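
On Scala 2.10 and later, the class plus implicit def pair can be collapsed into an implicit class. A sketch under that assumption, tried here against 2.13 semantics only; MergeSyntax and MergeOps are hypothetical names, and the sample values are for illustration:

```scala
object MergeSyntax {
  // In Scala 2 an implicit class must be nested in an object, class, or trait
  implicit class MergeOps[K, V](private val m1: Map[K, V]) extends AnyVal {
    // Same set arithmetic as the helper above: entries unique to either
    // side are kept, entries on both sides are combined with f
    def merge(m2: Map[K, V])(f: (V, V) => V): Map[K, V] =
      (m1 -- m2.keySet) ++ (m2 -- m1.keySet) ++
        (for (k <- m1.keySet & m2.keySet) yield k -> f(m1(k), m2(k)))
  }

  val ms = List(Map("hello" -> 1.0, "world" -> 2.0), Map("goodbye" -> 3.0, "hello" -> 4.0))
  val merged: Map[String, Double] = ms.reduceLeft((a, b) => a.merge(b)(_ + _))
}
```

Outside the object, an import of MergeSyntax._ would be needed to bring the merge method into scope.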
