
Disclaimer: this page is a Chinese–English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/42698322/


Cannot resolve column (numeric column name) in Spark Dataframe

Tags: scala, apache-spark, spark-dataframe

Asked by Marsellus Wallace

This is my data:


scala> data.printSchema
root
 |-- 1.0: string (nullable = true)
 |-- 2.0: string (nullable = true)
 |-- 3.0: string (nullable = true)

This doesn't work :(


scala> data.select("2.0").show

Exception:


org.apache.spark.sql.AnalysisException: cannot resolve '`2.0`' given input columns: [1.0, 2.0, 3.0];;
'Project ['2.0]
+- Project [_1#5608 AS 1.0#5615, _2#5609 AS 2.0#5616, _3#5610 AS 3.0#5617]
   +- LocalRelation [_1#5608, _2#5609, _3#5610]
        ...

Try this at home (I'm running on the shell v_2.1.0.5)!


val data = spark.createDataFrame(Seq(
  ("Hello", ", ", "World!")
)).toDF("1.0", "2.0", "3.0")
data.select("2.0").show

Answered by Psidom

You can use backticks to escape the dot, which is otherwise reserved for accessing fields of struct-type columns:

data.select("`2.0`").show
+---+
|2.0|
+---+
| , |
+---+
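The same backtick escaping applies when you build expressions with the `col` function instead of a string selector (a small sketch, assuming `data` from the question and a running Spark session):

```scala
import org.apache.spark.sql.functions.col

// Backticks tell the analyzer to treat "2.0" as one literal column name,
// rather than field access "0" on a column named "2".
data.select(col("`2.0`")).show
```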

Answered by Tawkir

The problem is that you cannot use a dot in a column name when selecting from a DataFrame, because Spark parses the dot as struct-field access. You can have a look at this similar question.

// Define the helper first so it is in scope when selecting.
// It wraps a column name in backticks so Spark treats it literally.
def sanitize(input: String): String = s"`$input`"

val data = spark.createDataFrame(Seq(
  ("Hello", ", ", "World!")
)).toDF("1.0", "2.0", "3.0")
data.select(sanitize("2.0")).show
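The helper can also be mapped over every column name, which is handy when all of the headers contain dots (a sketch, assuming `data` from above and a running Spark session):

```scala
import org.apache.spark.sql.functions.col

// Escape each column name with backticks, then select them all at once.
val escaped = data.columns.map(name => col(s"`$name`"))
data.select(escaped: _*).show
```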