scala - How to escape column names with hyphen in Spark SQL

Warning: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/30889630/

Date: 2020-10-22 07:15:57  Source: igfitidea

How to escape column names with hyphen in Spark SQL

scala, apache-spark, apache-spark-sql

Asked by sfactor

I have imported a json file in Spark and converted it into a table as

myDF.registerTempTable("myDF")

I then want to run SQL queries on this resulting table:

val newTable = sqlContext.sql("select column-1 from myDF")

However, this gives me an error because of the hyphen in the column name column-1. How do I resolve this in Spark SQL?

Answered by PermaFrost

Backticks (`) appear to work, so

val newTable = sqlContext.sql("select `column-1` from myDF")

should do the trick, at least in Spark v1.3.x.
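The quoting rule can be sketched as a small helper (hypothetical, not part of Spark's API). It wraps a name in backticks and, assuming Spark's convention that a literal backtick inside a quoted identifier is escaped by doubling it, also handles names that themselves contain backticks:

```scala
// Hypothetical helper: backtick-quote a column name for use in a
// Spark SQL string. A literal backtick inside the name is doubled,
// mirroring Spark's identifier-quoting convention.
def quoteIdentifier(name: String): String =
  "`" + name.replace("`", "``") + "`"

val query = s"select ${quoteIdentifier("column-1")} from myDF"
println(query) // select `column-1` from myDF
```

The resulting string can then be passed to sqlContext.sql as in the answer above.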

Answered by GreenThumb

I was at it for a bit yesterday; it turns out there is a way to escape a colon (:) and a dot (.), like so:

Only the field containing the colon (:) needs to be escaped with backticks:

sqlc.sql("select `sn2:AnyAddRq`.AnyInfo.noInfo.someRef.myInfo.someData.Name AS sn2_AnyAddRq_AnyInfo_noInfo_someRef_myInfo_someData_Name from masterTable").show()
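The "escape only the offending segment" idea can be sketched as a hypothetical helper that backtick-quotes just the path segments that are not plain identifiers (e.g. ones containing a colon), leaving ordinary segments untouched:

```scala
// Hypothetical helper: given a dotted field path, backtick-quote only
// the segments containing characters outside [A-Za-z0-9_]; plain
// identifier segments pass through unchanged.
def quotePath(path: String): String =
  path.split('.').map { seg =>
    if (seg.matches("[A-Za-z_][A-Za-z0-9_]*")) seg
    else "`" + seg.replace("`", "``") + "`"
  }.mkString(".")

println(quotePath("sn2:AnyAddRq.AnyInfo.noInfo.someRef"))
// `sn2:AnyAddRq`.AnyInfo.noInfo.someRef
```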

Answered by GreenThumb

I cannot comment as I have less than 50 rep.

When you are referencing a json structure with struct.struct.field and there is a namespace present, like:

ns2:struct.struct.field, the backticks (`) do not work:

jsonDF = sqlc.read.load('jsonMsgs', format="json")
jsonDF.registerTempTable("masterTable")
sqlc.sql("select `sn2:AnyAddRq.AnyInfo.noInfo.someRef.myInfo.someData.Name` AS sn2_AnyAddRq_AnyInfo_noInfo_someRef_myInfo_someData_Name from masterTable").show()

pyspark.sql.utils.AnalysisException: u"cannot resolve 'sn2:AnyAddRq.AnyInfo.noInfo.someRef.myInfo.someData.Name'

If I remove the sn2: fields, the query executes.

I have also tried with single quotes ('), backslashes (\) and double quotes (").

The only way it works is if I register another temp table on the sn2: structure; then I am able to access the fields within it like so:

anotherDF = jsonDF.select("sn2:AnyAddRq.AnyInfo.noInfo.someRef.myInfo.someData")
anotherDF.registerTempTable("anotherDF")
sqlc.sql("select Name from anotherDF").show()