Scala: convert a list of tuples to a map (and deal with duplicate keys?)
Disclaimer: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/8016750/
Convert a list of tuples to a map (and deal with duplicate keys?)
Asked by Tg.
I was thinking about a nice way to convert a List of tuples with duplicate keys [("a","b"),("c","d"),("a","f")] into a map ("a" -> ["b", "f"], "c" -> ["d"]). Normally (in Python), I'd create an empty map, for-loop over the list, and check for duplicate keys. But I am looking for a more Scala-ish, clever solution here.
By the way, the actual key-value types I use here are (Int, Node), and I want to turn the list into a map of (Int -> NodeSeq).
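For reference, a direct Scala transliteration of the imperative approach described above (a sketch, not the asker's code) might look like this; the answers below show more idiomatic alternatives:

import scala.collection.mutable

val pairs = List(("a", "b"), ("c", "d"), ("a", "f"))
val acc = mutable.Map.empty[String, List[String]]
// Append each value to the list already stored for its key (empty if new)
for ((k, v) <- pairs)
  acc(k) = acc.getOrElse(k, Nil) :+ v
// acc: Map(a -> List(b, f), c -> List(d))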
Accepted answer by om-nom-nom
Group and then project:
scala> val x = List("a" -> "b", "c" -> "d", "a" -> "f")
//x: List[(java.lang.String, java.lang.String)] = List((a,b), (c,d), (a,f))
scala> x.groupBy(_._1).map { case (k,v) => (k,v.map(_._2))}
//res1: scala.collection.immutable.Map[java.lang.String,List[java.lang.String]] = Map(c -> List(d), a -> List(b, f))
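The same pattern applied to the question's actual types, as a minimal sketch (assuming scala.xml's Node and NodeSeq, with made-up input data):

import scala.xml.{Node, NodeSeq}

val nodes: List[(Int, Node)] = List(1 -> <a/>, 2 -> <b/>, 1 -> <c/>)
// Group by the Int key, then project each group's Nodes into a NodeSeq
val byId: Map[Int, NodeSeq] =
  nodes.groupBy(_._1).map { case (k, v) => k -> NodeSeq.fromSeq(v.map(_._2)) }
// byId: Map(1 -> NodeSeq(<a/>, <c/>), 2 -> NodeSeq(<b/>))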
A more Scala-ish way is to use a fold, along the lines of the answer there (skipping the map f step).
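Since the referenced fold example isn't reproduced here, a minimal sketch of a fold-based version (one possible reading, not the linked answer's exact code):

x.foldLeft(Map.empty[String, List[String]]) { case (acc, (k, v)) =>
  // Prepend each value to the list already stored for its key
  acc.updated(k, v :: acc.getOrElse(k, Nil))
}.map { case (k, vs) => k -> vs.reverse } // reverse to restore insertion order
// Map(a -> List(b, f), c -> List(d))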
Answered by Cory Klein
For Googlers that don't expect duplicates or are fine with the default duplicate handling policy:
List("a" -> 1, "b" -> 2).toMap
// Result: Map(a -> 1, b -> 2)
As of Scala 2.12, the default policy reads:
Duplicate keys will be overwritten by later keys: if this is an unordered collection, which key is in the resulting map is undefined.
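A quick illustration of that policy on an ordered List:

List("a" -> 1, "a" -> 2, "b" -> 3).toMap
// Result: Map(a -> 2, b -> 3) (the later value for "a" wins)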
Answered by Daniel C. Sobral
Here's another alternative:
x.groupBy(_._1).mapValues(_.map(_._2))
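Note that on Scala 2.13, mapValues on a Map is deprecated and returns a lazy MapView; to get a strict Map back, the usual form is:

x.groupBy(_._1).view.mapValues(_.map(_._2)).toMap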
Answered by pathikrit
For Googlers that do care about duplicates:
implicit class Pairs[A, B](p: List[(A, B)]) {
def toMultiMap: Map[A, List[B]] = p.groupBy(_._1).mapValues(_.map(_._2))
}
> List("a" -> "b", "a" -> "c", "d" -> "e").toMultiMap
> Map("a" -> List("b", "c"), "d" -> List("e"))
Answered by Xavier Guihot
Starting in Scala 2.13, most collections are provided with the groupMap method, which is (as its name suggests) an equivalent (more efficient) of a groupBy followed by mapValues:
List("a" -> "b", "c" -> "d", "a" -> "f").groupMap(_._1)(_._2)
// Map[String,List[String]] = Map(a -> List(b, f), c -> List(d))
This:
- groups elements based on the first part of tuples (group part of groupMap)
- maps grouped values by taking their second tuple part (map part of groupMap)
This is an equivalent of list.groupBy(_._1).mapValues(_.map(_._2)) but performed in one pass through the List.
Answered by Melcom van Eeden
Below you can find a few solutions. (GroupBy, FoldLeft, Aggregate, Spark)
val list: List[(String, String)] = List(("a","b"),("c","d"),("a","f"))
GroupBy variation
list.groupBy(_._1).map(v => (v._1, v._2.map(_._2)))
FoldLeft variation
list.foldLeft[Map[String, List[String]]](Map())((acc, value) => {
acc.get(value._1).fold(acc ++ Map(value._1 -> List(value._2))){ v =>
acc ++ Map(value._1 -> (value._2 :: v))
}
})
Aggregate variation - similar to foldLeft
list.aggregate[Map[String, List[String]]](Map())(
(acc, value) => acc.get(value._1).fold(acc ++ Map(value._1 ->
List(value._2))){ v =>
acc ++ Map(value._1 -> (value._2 :: v))
},
(l, r) => l ++ r
)
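Note: as of Scala 2.13, aggregate is deprecated for sequential collections (foldLeft(z)(seqop) is suggested instead), so the foldLeft variation above is the forward-compatible choice.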
Spark variation - for big data sets (conversion to an RDD and back to a plain Map)
import org.apache.spark.rdd._
import org.apache.spark.{SparkContext, SparkConf}
val conf: SparkConf = new SparkConf().setAppName("Spark").setMaster("local")
val sc: SparkContext = new SparkContext(conf)
// This gives you a rdd of the same result
val rdd: RDD[(String, List[String])] = sc.parallelize(list).combineByKey(
(value: String) => List(value),
(acc: List[String], value) => value :: acc,
(accLeft: List[String], accRight: List[String]) => accLeft ::: accRight
)
// To convert this RDD back to a Map[String, List[String]] you can do the following
rdd.collect().toMap
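If this runs as a standalone snippet, stopping the context afterwards is good practice (assuming no further Spark work follows):

sc.stop()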
Answered by cevaris
Here is a more idiomatic Scala way to convert a list of tuples to a map, handling duplicate keys. You want to use a fold.
val x = List("a" -> "b", "c" -> "d", "a" -> "f")
x.foldLeft(Map.empty[String, Seq[String]]) { case (acc, (k, v)) =>
acc.updated(k, acc.getOrElse(k, Seq.empty[String]) ++ Seq(v))
}
res0: scala.collection.immutable.Map[String,Seq[String]] = Map(a -> List(b, f), c -> List(d))
Answered by frankfzw
You can try this:
scala> val b = Array(1, 2, 3)
// b: Array[Int] = Array(1, 2, 3)
scala> val c = b.map(x => (x -> x * 2))
// c: Array[(Int, Int)] = Array((1,2), (2,4), (3,6))
scala> val d = Map(c : _*)
// d: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 2 -> 4, 3 -> 6)
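As an aside, Map(c : _*) is equivalent to calling toMap on the array of pairs:

scala> val d2 = c.toMap
// d2: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 2 -> 4, 3 -> 6)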

