Note: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/49128580/

Getting an error like "need struct type but got string" in Spark Scala for a simple struct type

scala apache-spark spark-dataframe

Asked by Ramesh Maharjan

Here is my schema:

root
 |-- DataPartition: string (nullable = true)
 |-- TimeStamp: string (nullable = true)
 |-- PeriodId: long (nullable = true)
 |-- FinancialAsReportedLineItemName: struct (nullable = true)
 |    |-- _VALUE: string (nullable = true)
 |    |-- _languageId: long (nullable = true)
 |-- FinancialLineItemSource: long (nullable = true)
 |-- FinancialStatementLineItemSequence: long (nullable = true)
 |-- FinancialStatementLineItemValue: double (nullable = true)
 |-- FiscalYear: long (nullable = true)
 |-- IsAnnual: boolean (nullable = true)
 |-- IsAsReportedCurrencySetManually: boolean (nullable = true)
 |-- IsCombinedItem: boolean (nullable = true)
 |-- IsDerived: boolean (nullable = true)
 |-- IsExcludedFromStandardization: boolean (nullable = true)
 |-- IsFinal: boolean (nullable = true)
 |-- IsTotal: boolean (nullable = true)
 |-- ParentLineItemId: long (nullable = true)
 |-- PeriodPermId: struct (nullable = true)
 |    |-- _VALUE: long (nullable = true)
 |    |-- _objectTypeId: long (nullable = true)
 |-- ReportedCurrencyId: long (nullable = true)

Given the above schema, I am trying to do this:

val temp = tempNew1
      .withColumn("FinancialAsReportedLineItemName", $"FinancialAsReportedLineItemName._VALUE")
      .withColumn("FinancialAsReportedLineItemName_languageId", $"FinancialAsReportedLineItemName._languageId")
      .withColumn("PeriodPermId", $"PeriodPermId._VALUE")
      .withColumn("PeriodPermId_objectTypeId", $"PeriodPermId._objectTypeId").drop($"AsReportedItem").drop($"AsReportedItem")

I don't know what I am missing here. I get the error below:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Can't extract value from FinancialAsReportedLineItemName#2262: need struct type but got string;

Answered by Ramesh Maharjan

The issue is that you are trying to access FinancialAsReportedLineItemName._languageId after the FinancialAsReportedLineItemName column has already been replaced by FinancialAsReportedLineItemName._VALUE. At that point the column is a plain string rather than a struct, so there is no _languageId field left to extract.
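
To make the overwrite visible, here is a minimal sketch that builds a one-row DataFrame with a struct column shaped like the one in the question and prints the schema before and after the first withColumn call. The object name OverwriteDemo, the helper names, the sample values, and the local SparkSession are assumptions made for illustration; col("...") is used in place of $"..." only so the helpers do not need the implicits import.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit, struct}

object OverwriteDemo {
  // Hypothetical one-row DataFrame with a struct column like the one in the question.
  def sample(spark: SparkSession): DataFrame =
    spark.range(1).select(
      struct(lit("Revenue").as("_VALUE"), lit(1L).as("_languageId"))
        .as("FinancialAsReportedLineItemName"))

  // Reusing an existing column name in withColumn replaces that column.
  def overwritten(df: DataFrame): DataFrame =
    df.withColumn(
      "FinancialAsReportedLineItemName",
      col("FinancialAsReportedLineItemName._VALUE"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("overwrite-demo")
      .getOrCreate()

    val df = sample(spark)
    df.printSchema()              // the column is a struct here
    overwritten(df).printSchema() // the same name now refers to a string column

    spark.stop()
  }
}
```

Any later reference to FinancialAsReportedLineItemName._languageId on the second DataFrame is resolved against the string column, which is what the AnalysisException reports.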

You should change the following two lines

.withColumn("FinancialAsReportedLineItemName", $"FinancialAsReportedLineItemName._VALUE")
.withColumn("FinancialAsReportedLineItemName_languageId", $"FinancialAsReportedLineItemName._languageId")

to

.withColumn("FinancialAsReportedLineItemName_value", $"FinancialAsReportedLineItemName._VALUE")
.withColumn("FinancialAsReportedLineItemName_languageId", $"FinancialAsReportedLineItemName._languageId")

If the FinancialAsReportedLineItemName_value column is supposed to be named FinancialAsReportedLineItemName, then you should swap the order of the withColumn calls:

.withColumn("FinancialAsReportedLineItemName_languageId", $"FinancialAsReportedLineItemName._languageId")    
.withColumn("FinancialAsReportedLineItemName", $"FinancialAsReportedLineItemName._VALUE")
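
The reordering can be sketched end to end as follows. The PeriodPermId pair in the question follows the same overwrite-then-access pattern, so this sketch reorders it the same way. The object name FlattenStructsDemo, the helper names, the sample values, and the local SparkSession are assumptions for illustration, and the sample DataFrame stands in for tempNew1 with only the two struct columns.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit, struct}

object FlattenStructsDemo {
  // Hypothetical stand-in for tempNew1, reduced to the two struct columns.
  def sample(spark: SparkSession): DataFrame =
    spark.range(1).select(
      struct(lit("Revenue").as("_VALUE"), lit(1L).as("_languageId"))
        .as("FinancialAsReportedLineItemName"),
      struct(lit(100L).as("_VALUE"), lit(2L).as("_objectTypeId"))
        .as("PeriodPermId"))

  // Extract each sibling field first, then overwrite the struct with its _VALUE.
  def flatten(df: DataFrame): DataFrame =
    df
      .withColumn("FinancialAsReportedLineItemName_languageId",
        col("FinancialAsReportedLineItemName._languageId"))
      .withColumn("FinancialAsReportedLineItemName",
        col("FinancialAsReportedLineItemName._VALUE"))
      .withColumn("PeriodPermId_objectTypeId", col("PeriodPermId._objectTypeId"))
      .withColumn("PeriodPermId", col("PeriodPermId._VALUE"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("flatten-structs-demo")
      .getOrCreate()

    flatten(sample(spark)).printSchema() // no struct columns remain

    spark.stop()
  }
}
```

The rule of thumb is that every field needed from a struct has to be read before the struct's own name is reused as a withColumn target.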