
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must do so under the same license and attribute the original authors (not the translator). Original: http://stackoverflow.com/questions/39905701/

Date: 2020-10-22 08:42:01  Source: igfitidea

How to do OUTER JOIN in scala

Tags: scala, join, apache-spark, dataframe

Asked by Newbie

I have two data frames: df1 and df2

df1

|--- id---|---value---|
|    1    |    23     |
|    2    |    23     |
|    3    |    23     |
|    2    |    25     |
|    5    |    25     |

df2

|-idValue-|---count---|
|    1    |    33     |
|    2    |    23     |
|    3    |    34     |
|    13   |    34     |
|    23   |    34     |

How do I get this result?

|--- id---|---value---|---count---|
|    1    |    23     |    33     |
|    2    |    23     |    23     |
|    3    |    23     |    34     |
|    2    |    25     |    23     |
|    5    |    25     |    null   |

I am doing:

 val groupedData =  df1.join(df2, $"id" === $"idValue", "outer") 

But I don't see the last column in groupedData. Is this the correct way to do it, or am I doing something wrong?

Answered by KiranM

From your expected output, you need a LEFT OUTER JOIN.

 // Join on df1.id == df2.idValue, keeping every row of df1;
 // select df1's own columns plus df2's count (null where there is no match).
 val groupedData = df1.join(df2, df1("id") === df2("idValue"), "left_outer")
   .select(df1("id"), df1("value"), df2("count"))

 groupedData.show()
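The left-outer-join semantics at work here can be illustrated without Spark, using plain Scala collections (a hypothetical sketch of the same data, where `Option` plays the role of a nullable column):

```scala
// Every row of the left table is kept; rows with no match on the right
// get None (Spark would show null) for the right-side column.
val df1 = Seq((1, 23), (2, 23), (3, 23), (2, 25), (5, 25))    // (id, value)
val df2 = Map(1 -> 33, 2 -> 23, 3 -> 34, 13 -> 34, 23 -> 34)  // idValue -> count

val joined: Seq[(Int, Int, Option[Int])] =
  df1.map { case (id, value) => (id, value, df2.get(id)) }

joined.foreach(println)
// (1,23,Some(33))
// (2,23,Some(23))
// (3,23,Some(34))
// (2,25,Some(23))
// (5,25,None)
```

Note how ids 13 and 23, which appear only on the right side, are dropped; a full `"outer"` join would keep them too, with nulls on the left side.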