Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/40445651/

Date: 2020-10-22 08:50:00  Source: igfitidea

spark dataframe trim column and convert

scala apache-spark

Asked by user1615666

In Scala / Spark, how can I convert a blank string, such as " ", to "NULL"? The value needs to be trimmed first and then converted to "NULL". Thanks.

dataframe.na.replace("cut", Map(" " -> "NULL")).show // wrong: only replaces values that are exactly one space

Accepted answer by zero323

You can create a simple function to do it. First a couple of imports:

import org.apache.spark.sql.functions.{trim, length, when}
import org.apache.spark.sql.Column

and the definition:

def emptyToNull(c: Column) = when(length(trim(c)) > 0, c)

Finally a quick test:

// toDF and $"..." need `import spark.implicits._` (already in scope in spark-shell)
val df = Seq(" ", "foo", "", "bar").toDF
df.withColumn("value", emptyToNull($"value")).show

which should yield the following result:

+-----+
|value|
+-----+
| null|
|  foo|
| null|
|  bar|
+-----+
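
Outside spark-shell, the same test needs a session and the implicits import. The following is a minimal sketch of that, assuming a local SparkSession; the object name `EmptyToNullDemo`, the app name, and the `local[1]` master are illustrative choices, not part of the answer.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{length, trim, when}

object EmptyToNullDemo {
  // Same definition as in the answer: keep c when something is left after
  // trimming; `when` without `otherwise` yields null for all other rows.
  def emptyToNull(c: Column): Column = when(length(trim(c)) > 0, c)

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, for illustration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("empty-to-null-demo")
      .getOrCreate()
    import spark.implicits._ // enables toDF and the $"..." column syntax

    val df = Seq(" ", "foo", "", "bar").toDF("value")
    df.withColumn("value", emptyToNull($"value")).show()

    spark.stop()
  }
}
```

Note that the function returns the original column value for non-blank rows, so a value such as " foo " would keep its surrounding spaces.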

If you want to replace an empty string with the string "NULL", you can add an otherwise clause:

def emptyToNullString(c: Column) = when(length(trim(c)) > 0, c).otherwise("NULL")
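
The question asks to trim first and then convert, whereas the functions in this answer return the original, untrimmed value for non-blank rows. If the trimmed value is wanted as well, a variant along these lines might work; the names `TrimHelpers` and `trimOrNullString` are hypothetical and not part of the answer.

```scala
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{length, lit, trim, when}

object TrimHelpers {
  // Hypothetical variant: return the trimmed value when anything is left
  // after trimming, otherwise the literal string "NULL".
  def trimOrNullString(c: Column): Column = {
    val trimmed = trim(c)
    when(length(trimmed) > 0, trimmed).otherwise(lit("NULL"))
  }
}
```

It would be used the same way, for example `df.withColumn("value", TrimHelpers.trimOrNullString($"value"))`.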

Answer by Kapil

Please use the import below to resolve the issue:

import org.apache.spark.sql.functions.trim
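
This answer gives only the import. One way it might be applied, offered here as an assumption rather than what the answerer necessarily intended, is to trim the column first and then reuse `na.replace` from the question, this time matching the empty string that trimming leaves behind. The column name "cut" comes from the question; the object and helper names are hypothetical.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, trim}

object TrimThenReplace {
  // Hypothetical helper: trim the named string column, then replace values
  // that are exactly "" with the string "NULL". Existing nulls are left as null.
  def blankToNullString(df: DataFrame, column: String): DataFrame =
    df.withColumn(column, trim(col(column)))
      .na.replace(column, Map("" -> "NULL"))
}
```

For the question's DataFrame this would be called as `TrimThenReplace.blankToNullString(dataframe, "cut")`.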