Scala: best way to merge two maps and sum the values of the same key?

Question

Asked by Freewind

val map1 = Map(1 -> 9 , 2 -> 20)
val map2 = Map(1 -> 100, 3 -> 300)

I want to merge them, and sum the values of same keys. So the result will be:

Map(2->20, 1->109, 3->300)

Now I have 2 solutions:

val list = map1.toList ++ map2.toList
val merged = list.groupBy ( _._1) .map { case (k,v) => k -> v.map(_._2).sum }

and

val merged = (map1 /: map2) { case (map, (k,v)) =>
    map + ( k -> (v + map.getOrElse(k, 0)) )
}

But I want to know if there are any better solutions.

Answer 1

Accepted answer by Andrzej Doyle

Scalaz has the concept of a Semigroup, which captures what you want to do here and leads to arguably the shortest and cleanest solution:

scala> import scalaz._
import scalaz._

scala> import Scalaz._
import Scalaz._

scala> val map1 = Map(1 -> 9 , 2 -> 20)
map1: scala.collection.immutable.Map[Int,Int] = Map(1 -> 9, 2 -> 20)

scala> val map2 = Map(1 -> 100, 3 -> 300)
map2: scala.collection.immutable.Map[Int,Int] = Map(1 -> 100, 3 -> 300)

scala> map1 |+| map2
res2: scala.collection.immutable.Map[Int,Int] = Map(1 -> 109, 3 -> 300, 2 -> 20)

Specifically, the binary operator for Map[K, V] combines the keys of the maps, folding V's semigroup operator over any duplicate values. The standard semigroup for Int uses the addition operator, so you get the sum of values for each duplicate key.

Edit: A little more detail, as per user482745's request.

Mathematically, a semigroup is just a set of values together with an associative operator that takes two values from that set and produces another value from that set. So integers under addition are a semigroup, for example - the + operator combines two ints to make another int.

You can also define a semigroup over the set of "all maps with a given key type and value type", so long as you can come up with some operation that combines two maps to produce a new one which is somehow the combination of the two inputs.

If there are no keys that appear in both maps, this is trivial. If the same key exists in both maps, then we need to combine the two values that the key maps to. Hmm, haven't we just described an operator which combines two entities of the same type? This is why in Scalaz a semigroup for Map[K, V] exists if and only if a Semigroup for V exists - V's semigroup is used to combine the values from two maps which are assigned to the same key.

So because Int is the value type here, the "collision" on the 1 key is resolved by integer addition of the two mapped values (as that's what Int's semigroup operator does), hence 100 + 9. If the values had been Strings, a collision would have resulted in string concatenation of the two mapped values (again, because that's what the semigroup operator for String does).

(And interestingly, because string concatenation is not commutative - that is, "a" + "b" != "b" + "a" - the resulting semigroup operation isn't either. So map1 |+| map2 is different from map2 |+| map1 in the String case, but not in the Int case.)
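
To make the reasoning above concrete, here is a minimal hand-rolled sketch in plain Scala. It is not Scalaz's actual implementation: the names SemigroupSketch, combine, mapSemigroup and merge are made up for illustration, and only the idea (a Map semigroup derived from the value type's semigroup) comes from the answer.

```scala
// A hand-rolled sketch of the idea; NOT Scalaz's real code. All names are hypothetical.
object SemigroupSketch {
  trait Semigroup[A] {
    def combine(x: A, y: A): A
  }

  // Int combines by addition.
  implicit val intSemigroup: Semigroup[Int] = new Semigroup[Int] {
    def combine(x: Int, y: Int): Int = x + y
  }

  // String combines by concatenation, which is not commutative.
  implicit val stringSemigroup: Semigroup[String] = new Semigroup[String] {
    def combine(x: String, y: String): String = x + y
  }

  // A Semigroup[Map[K, V]] is available whenever a Semigroup[V] is.
  implicit def mapSemigroup[K, V](implicit sv: Semigroup[V]): Semigroup[Map[K, V]] =
    new Semigroup[Map[K, V]] {
      def combine(x: Map[K, V], y: Map[K, V]): Map[K, V] =
        y.foldLeft(x) { case (acc, (k, v)) =>
          // On a collision the left map's value is the first operand, so order matters.
          acc.updated(k, acc.get(k).map(sv.combine(_, v)).getOrElse(v))
        }
    }

  def merge[A](x: A, y: A)(implicit s: Semigroup[A]): A = s.combine(x, y)

  val ints = merge(Map(1 -> 9, 2 -> 20), Map(1 -> 100, 3 -> 300))
  val ab   = merge(Map(1 -> "a"), Map(1 -> "b"))
  val ba   = merge(Map(1 -> "b"), Map(1 -> "a"))
}
```

Here ints holds the summed map, while ab and ba differ from each other because the String operation depends on operand order.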

Answer 2

Answered by Rex Kerr

The shortest answer I know of that uses only the standard library is

map1 ++ map2.map{ case (k,v) => k -> (v + map1.getOrElse(k,0)) }
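
For readers who want to see why the one-liner works, it can be split into two steps. This is only an illustrative sketch: the object name OverrideSketch and the intermediate value adjusted are not part of the original answer.

```scala
// Illustrative breakdown of the one-liner; names are hypothetical.
object OverrideSketch {
  val map1 = Map(1 -> 9, 2 -> 20)
  val map2 = Map(1 -> 100, 3 -> 300)

  // Step 1: rewrite each entry of map2 so that it already includes
  // map1's value for the same key (0 when map1 has no such key).
  val adjusted = map2.map { case (k, v) => k -> (v + map1.getOrElse(k, 0)) }

  // Step 2: ++ keeps map1's entries, and the right-hand side wins on shared keys,
  // so the pre-summed values replace map1's originals.
  val merged = map1 ++ adjusted
}
```

Keys that exist only in map1 survive untouched, which is why map1 does not need to be transformed at all.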

Answer 3

Answered by Matthew Farwell

Quick solution:

(map1.keySet ++ map2.keySet).map {i=> (i,map1.getOrElse(i,0) + map2.getOrElse(i,0))}.toMap

Answer 4

Answered by Mikhail Golubtsov

Well, the Scala library (at least as of 2.10) now has what you wanted: the merged function. But it is available only on HashMap, not on Map, which is somewhat confusing. The signature is also cumbersome: I can't imagine why I'd need the key twice, or when I'd need to produce a pair with a different key. Nevertheless, it works and is much cleaner than the previous "native" solutions.

val map1 = collection.immutable.HashMap(1 -> 11 , 2 -> 12)
val map2 = collection.immutable.HashMap(1 -> 11 , 2 -> 12)
map1.merged(map2)({ case ((k,v1),(_,v2)) => (k,v1+v2) })

The scaladoc also mentions that:

The merged method is on average more performant than doing a traversal and reconstructing a new immutable hash map from scratch, or ++.
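
Since the example above merges two identical maps, here is the same call applied to the maps from the question, as a small sketch. The object name MergedSketch is made up; the merged call itself is the one shown in this answer.

```scala
import scala.collection.immutable.HashMap

// Sketch: HashMap.merged applied to the question's data. Object name is hypothetical.
object MergedSketch {
  val map1 = HashMap(1 -> 9, 2 -> 20)
  val map2 = HashMap(1 -> 100, 3 -> 300)

  // The merge function is only consulted for keys present in both maps;
  // here it keeps the key and sums the two values.
  val merged = map1.merged(map2) { case ((k, v1), (_, v2)) => (k, v1 + v2) }
}
```

Keys 2 and 3 appear in only one map each, so they pass through unchanged and only key 1 goes through the merge function.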

Answer 5

Answered by Jegan

This can be implemented as a Monoid in plain Scala. Here is a sample implementation. With this approach, we can merge not just two maps but a whole list of maps.

// Monoid trait

trait Monoid[M] {
  def zero: M
  def op(a: M, b: M): M
}

Here is a Map-based implementation of the Monoid trait that merges two maps:

val mapMonoid = new Monoid[Map[Int, Int]] {
  override def zero: Map[Int, Int] = Map()

  override def op(a: Map[Int, Int], b: Map[Int, Int]): Map[Int, Int] =
    // Use explicit method calls; the postfix `toMap` form needs scala.language.postfixOps.
    (a.keySet ++ b.keySet).map { k =>
      k -> (a.getOrElse(k, 0) + b.getOrElse(k, 0))
    }.toMap
}

Now, if you have a list of maps that needs to be merged (in this case, only 2), it can be done like below.

val map1 = Map(1 -> 9 , 2 -> 20)
val map2 = Map(1 -> 100, 3 -> 300)

val maps = List(map1, map2) // The list can have more maps.

val merged = maps.foldLeft(mapMonoid.zero)(mapMonoid.op)
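
The monoid above is fixed to Map[Int, Int]. As a sketch of one possible generalisation (not part of the original answer), the same idea can be written for any key type and any value type that has a standard-library Numeric instance. The object name GenericMonoidSketch and the generic mapMonoid method are assumptions made for this example.

```scala
// Hypothetical generalisation of the answer's monoid; names are illustrative only.
object GenericMonoidSketch {
  trait Monoid[M] {
    def zero: M
    def op(a: M, b: M): M
  }

  // Works for any key type K and any value type V with a Numeric instance.
  def mapMonoid[K, V](implicit num: Numeric[V]): Monoid[Map[K, V]] =
    new Monoid[Map[K, V]] {
      def zero: Map[K, V] = Map.empty
      def op(a: Map[K, V], b: Map[K, V]): Map[K, V] =
        (a.keySet ++ b.keySet).map { k =>
          // Missing keys contribute the numeric zero of V.
          k -> num.plus(a.getOrElse(k, num.zero), b.getOrElse(k, num.zero))
        }.toMap
    }

  val maps = List(Map("a" -> 1.5, "b" -> 2.0), Map("a" -> 0.5), Map("c" -> 4.0))

  val m = mapMonoid[String, Double]
  val merged = maps.foldLeft(m.zero)(m.op)
}
```

Folding from zero also means an empty list of maps yields an empty map rather than an error.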

Answer 6

Answered by Artsiom Miklushou

You can also do that with Cats.

import cats.implicits._

val map1 = Map(1 -> 9 , 2 -> 20)
val map2 = Map(1 -> 100, 3 -> 300)

map1 combine map2 // Map(2 -> 20, 1 -> 109, 3 -> 300)

Answer 7

Answered by AmigoNico

map1 ++ ( for ( (k,v) <- map2 ) yield ( k -> ( v + map1.getOrElse(k,0) ) ) )

Answer 8

Answered by Nimrod007

I wrote a blog post about this, check it out:

http://www.nimrodstech.com/scala-map-merge/

Basically, using a Scalaz semigroup you can achieve this pretty easily.

It would look something like this:

  import scalaz.Scalaz._
  map1 |+| map2

Answer 9

Answered by Xavier Guihot

Starting with Scala 2.13, another solution based only on the standard library consists of replacing the groupBy part of your solution with groupMapReduce, which (as its name suggests) is equivalent to a groupBy followed by mapValues and a reduce step:

// val map1 = Map(1 -> 9, 2 -> 20)
// val map2 = Map(1 -> 100, 3 -> 300)
(map1.toSeq ++ map2).groupMapReduce(_._1)(_._2)(_+_)
// Map[Int,Int] = Map(2 -> 20, 1 -> 109, 3 -> 300)

This:

Concatenates the two maps as a sequence of tuples (List((1,9), (2,20), (1,100), (3,300))). For conciseness, map2 is passed as-is, since ++ accepts any collection of pairs and the result takes the type of map1.toSeq; you could make the conversion explicit by using map2.toSeq,
groups elements based on their first tuple part (group part of groupMapReduce),
maps grouped values to their second tuple part (map part of groupMapReduce),
reduces mapped values (_+_) by summing them (reduce part of groupMapReduce).
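
The steps listed above can be sketched side by side with their older equivalents. This assumes Scala 2.13 or later for groupMapReduce; the object name GroupMapReduceSketch and the value names are made up for illustration.

```scala
// Sketch comparing groupMapReduce with the spelled-out stages (Scala 2.13+).
// Object and value names are hypothetical.
object GroupMapReduceSketch {
  val map1 = Map(1 -> 9, 2 -> 20)
  val map2 = Map(1 -> 100, 3 -> 300)

  // Concatenate both maps into one sequence of (key, value) pairs.
  val pairs: Seq[(Int, Int)] = map1.toSeq ++ map2.toSeq

  // Group by key, map each pair to its value, reduce each group by summing.
  val viaGroupMapReduce: Map[Int, Int] = pairs.groupMapReduce(_._1)(_._2)(_ + _)

  // The same three stages written with groupBy, map and reduce.
  val viaGroupBy: Map[Int, Int] =
    pairs
      .groupBy(_._1)
      .map { case (k, kvs) => k -> kvs.map(_._2).reduce(_ + _) }
}
```

Both values hold the same map; groupMapReduce simply avoids building the intermediate per-key sequences.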

Answer 10

Answered by user1084563

Here's what I ended up using:

(a.toSeq ++ b.toSeq).groupBy(_._1).mapValues(_.map(_._2).sum)
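
One caveat for newer compilers: in Scala 2.13, calling mapValues directly on a Map is deprecated in favour of going through .view, which yields a lazy MapView. A sketch of the same expression adapted for 2.13 or later, producing a strict Map, might look like this (the object name MapValuesSketch is made up, and a and b are filled in with the question's data):

```scala
// Sketch for Scala 2.13+: same pipeline, with an explicit view and a strict result.
// Object name is hypothetical; a and b reuse the question's maps.
object MapValuesSketch {
  val a = Map(1 -> 9, 2 -> 20)
  val b = Map(1 -> 100, 3 -> 300)

  val merged: Map[Int, Int] =
    (a.toSeq ++ b.toSeq)
      .groupBy(_._1)               // Map(key -> Seq of (key, value) pairs)
      .view
      .mapValues(_.map(_._2).sum)  // lazily sum the values of each group
      .toMap                       // force the view into a strict Map
}
```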
