scala - SPARK: failure: ``union'' expected but `(' found
Disclaimer: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse it, you must follow the same license, cite the original source, and attribute it to the original authors (not me): StackOverflow
Original link: http://stackoverflow.com/questions/31786912/
SPARK : failure: ``union'' expected but `(' found
Asked by user1735076
I have a DataFrame called df with a column named employee_id. I am doing:
df.registerTempTable("d_f")
val query = """SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f"""
val result = Spark.getSqlContext().sql(query)
But I'm getting the following issue. Any help?
[1.29] failure: ``union'' expected but `(' found
SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f
                            ^
java.lang.RuntimeException: [1.29] failure: ``union'' expected but `(' found
SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f
Answered by zero323
Spark 2.0+
Spark 2.0 introduces a native implementation of window functions (SPARK-8641), so HiveContext should no longer be required. Nevertheless, similar errors that are not related to window functions can still be attributed to differences between the SQL parsers.
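For reference, a minimal sketch of the same query on Spark 2.0+ without any Hive dependency. It assumes a SparkSession named spark and the df DataFrame from the question; any name not taken from the question is illustrative only.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("row_number example").getOrCreate()
// The native SQL parser in Spark 2.0+ accepts the OVER (...) window clause directly
df.createOrReplaceTempView("d_f")
val result = spark.sql(
  "SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) AS row_number FROM d_f")
result.show()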
Spark <= 1.6
Window functions were introduced in Spark 1.4.0 and require a HiveContext to work. A plain SQLContext won't work here.
Be sure you use Spark >= 1.4.0 and create the HiveContext:
import org.apache.spark.sql.hive.HiveContext
val sqlContext = new HiveContext(sc)
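As a minimal sketch, the question's original query should then work through this HiveContext (assuming df is the DataFrame from the question):

df.registerTempTable("d_f")
// The Hive-backed parser understands the OVER (...) window clause that the plain SQLContext rejects
val result = sqlContext.sql(
  "SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f")
result.show()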
Answered by Pelab
Yes, it is true.
I am using Spark version 1.6.0, and there you need a HiveContext to use the 'dense_rank' method.
From Spark 2.0.0 onwards, a HiveContext is no longer needed for the 'dense_rank' method.
So for Spark 1.4 and 1.6 (< 2.0), you should apply it like this.
The table hive_employees has three fields: place: String, name: String, salary: Int
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val conf = new SparkConf().setAppName("denseRank test")//.setMaster("local")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
val hqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
val result = hqlContext.sql("select empid, empname, dense_rank() over(partition by empsalary order by empname) as rank from hive_employees")
result.show()
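The same ranking can also be expressed with the DataFrame API instead of raw SQL. A hedged sketch, assuming a DataFrame named employees with empname and empsalary columns (the name employees is illustrative, not from the original answer):

import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.dense_rank

// Mirrors the SQL: partition by empsalary, order by empname
val w = Window.partitionBy("empsalary").orderBy("empname")
val ranked = employees.withColumn("rank", dense_rank().over(w))
ranked.show()

On Spark < 2.0 this form still relies on Hive support under the hood; from 2.0 onwards it works with a plain SparkSession.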

