
Disclaimer: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must follow the same license and attribute the original authors (not the translator). Original: http://stackoverflow.com/questions/40763682/


How to use CROSS JOIN and CROSS APPLY in Spark SQL

Tags: scala, apache-spark, apache-spark-sql

Asked by Miruthan

I am very new to Spark and Scala, and I am writing Spark SQL code. I am in a situation where I need to apply CROSS JOIN and CROSS APPLY in my logic. Here I will post the SQL query which I have to convert to Spark SQL.


select Table1.Column1,Table2.Column2,Table3.Column3
from Table1 CROSS JOIN Table2 CROSS APPLY Table3

I need to convert the above query to run through a SQLContext in Spark SQL. Kindly help me. Thanks in advance.


Answered by SanthoshPrasad

First, set the property below in the Spark conf:


spark.sql.crossJoin.enabled=true

then dataFrame1.join(dataFrame2) will do a cross/Cartesian join.

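Putting that together, here is a minimal, self-contained sketch (Spark 2.x, local mode; the DataFrames and column names are made up for illustration):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("cross-join-example")
  .master("local[*]")
  .config("spark.sql.crossJoin.enabled", "true") // allow joins without a condition
  .getOrCreate()

import spark.implicits._

// Illustrative data; substitute your own tables.
val dataFrame1 = Seq(("a", 1), ("b", 2)).toDF("Column1", "x")
val dataFrame2 = Seq(("c", 3), ("d", 4)).toDF("Column2", "y")

// With the flag enabled, a join with no condition is a Cartesian product:
// 2 rows x 2 rows = 4 rows.
val crossed = dataFrame1.join(dataFrame2)
crossed.show()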

We can also use the SQL query below to do the same:


sqlContext.sql("select * from table1 CROSS JOIN table2 CROSS JOIN table3...")
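Continuing the sketch above, the same cross join can be expressed through SQL by registering the DataFrames as temporary views first. (Note that Spark SQL has no CROSS APPLY keyword, which is why this answer rewrites it as another CROSS JOIN; LATERAL VIEW explode(...) is the usual substitute when APPLY-style row expansion is actually needed.) With an older Spark 1.x SQLContext, registerTempTable plays the same role as createOrReplaceTempView:

val dataFrame3 = Seq(("e", 5)).toDF("Column3", "z") // illustrative third table

dataFrame1.createOrReplaceTempView("table1")
dataFrame2.createOrReplaceTempView("table2")
dataFrame3.createOrReplaceTempView("table3")

val result = spark.sql(
  """select table1.Column1, table2.Column2, table3.Column3
     from table1 CROSS JOIN table2 CROSS JOIN table3""")
result.show()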

Answered by Swadeshi

Set the Spark configuration:


import org.apache.spark.SparkConf

val sparkConf = new SparkConf()
  .set("spark.sql.crossJoin.enabled", "true")

Explicit cross join in Spark 2.x using the crossJoin method:


crossJoin(right: Dataset[_]): DataFrame


val df_new = df1.crossJoin(df2)
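
(As far as I know, because crossJoin states the Cartesian-product intent explicitly, it does not require the spark.sql.crossJoin.enabled flag that the implicit dataFrame1.join(dataFrame2) form needs.)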

Note: cross joins are among the most expensive joins and should usually be avoided.
