scala - How to use CROSS JOIN and CROSS APPLY in Spark SQL
Disclaimer: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me): StackOverFlow.
Original question: http://stackoverflow.com/questions/40763682/
How to use CROSS JOIN and CROSS APPLY in Spark SQL
Asked by Miruthan
I am very new to Spark and Scala, and I am writing Spark SQL code. I am in a situation where I need to apply CROSS JOIN and CROSS APPLY in my logic. Here I will post the SQL query which I have to convert to Spark SQL.
select Table1.Column1,Table2.Column2,Table3.Column3
from Table1 CROSS JOIN Table2 CROSS APPLY Table3
I need to convert the above query to run through a SQLContext in Spark SQL. Kindly help me. Thanks in advance.
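(A note on the CROSS APPLY part, which the answers below do not address: Spark SQL has no CROSS APPLY keyword. An uncorrelated CROSS APPLY behaves like a plain CROSS JOIN, while the usual analogue of a correlated CROSS APPLY is LATERAL VIEW with a generator function such as explode. A minimal sketch, assuming a hypothetical view table1 with an array column items:

// Hypothetical schema: view table1 with columns id and items (array type)
sqlContext.sql(
  "select t.id, item " +
  "from table1 t " +
  "LATERAL VIEW explode(t.items) exploded AS item"
).show()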
Answered by SanthoshPrasad
First, set the following property in the Spark conf:
spark.sql.crossJoin.enabled=true
then dataFrame1.join(dataFrame2) will do a cross/Cartesian join.
We can also use the query below to do the same:
sqlContext.sql("select * from table1 CROSS JOIN table2 CROSS JOIN table3...")
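Putting the two pieces together, here is a minimal self-contained sketch of this answer (assuming Spark 2.x; the view names table1/table2/table3 and their columns are placeholders made up for illustration):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("CrossJoinExample")
  .master("local[*]")
  .config("spark.sql.crossJoin.enabled", "true")
  .getOrCreate()
import spark.implicits._

// Register three tiny example DataFrames as temporary views
Seq(1, 2).toDF("Column1").createOrReplaceTempView("table1")
Seq("a", "b").toDF("Column2").createOrReplaceTempView("table2")
Seq(true, false).toDF("Column3").createOrReplaceTempView("table3")

// Cartesian product of the three views: 2 * 2 * 2 = 8 rows
spark.sql(
  "select table1.Column1, table2.Column2, table3.Column3 " +
  "from table1 CROSS JOIN table2 CROSS JOIN table3"
).show()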
Answered by Swadeshi
Set the Spark configuration:
import org.apache.spark.SparkConf

val sparkConf = new SparkConf()
  .set("spark.sql.crossJoin.enabled", "true")
Explicit cross join in Spark 2.x using the crossJoin method:

crossJoin(right: Dataset[_]): DataFrame
val df_new = df1.crossJoin(df2)
Note: cross joins are among the most time-consuming joins and should usually be avoided.
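For completeness, a minimal runnable sketch of the crossJoin approach (assuming Spark 2.1+, where crossJoin was introduced; the DataFrame contents are made up for illustration):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("ExplicitCrossJoin")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

val df1 = Seq(1, 2, 3).toDF("id")
val df2 = Seq("x", "y").toDF("label")

// crossJoin states the intent explicitly, so Spark allows it even
// without setting spark.sql.crossJoin.enabled
val df_new = df1.crossJoin(df2)
df_new.show() // 3 * 2 = 6 rows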

