Note: this page reproduces a StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/40445651/
spark dataframe trim column and convert
Asked by user1615666
In Scala / Spark, how can I convert an empty string, like " ", to "NULL"? I need to trim it first and then convert it to "NULL". Thanks.
dataframe.na.replace("cut", Map(" " -> "NULL")).show //wrong
Accepted answer by zero323
You can create a simple function to do it. First, a couple of imports:
import org.apache.spark.sql.functions.{trim, length, when}
import org.apache.spark.sql.Column
and the definition:
def emptyToNull(c: Column) = when(length(trim(c)) > 0, c)
Finally, a quick test:
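// In spark-shell the implicits are already in scope; elsewhere, import spark.implicits._ first
val df = Seq(" ", "foo", "", "bar").toDF
df.withColumn("value", emptyToNull($"value")).show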
which should yield the following result:
+-----+
|value|
+-----+
| null|
|  foo|
| null|
|  bar|
+-----+
If you want to replace an empty string with the string "NULL", you can add an otherwise clause:
def emptyToNullString(c: Column) = when(length(trim(c)) > 0, c).otherwise("NULL")
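For use outside spark-shell, the pieces above can be put together roughly as follows. This is only a sketch under a few assumptions: the object name EmptyToNullExample, the local master setting, and the result column names as_null and as_string are illustrative, and the column name cut is borrowed from the question's attempt.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{length, trim, when}

object EmptyToNullExample {
  // Keep the value only if it is non-empty after trimming;
  // `when` without `otherwise` yields null for every other row.
  def emptyToNull(c: Column): Column = when(length(trim(c)) > 0, c)

  // Same check, but substitute the literal string "NULL" instead of a real null.
  def emptyToNullString(c: Column): Column =
    when(length(trim(c)) > 0, c).otherwise("NULL")

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("empty-to-null")
      .getOrCreate()
    import spark.implicits._ // provides toDF and the $"..." column syntax

    // Assumption: the column is named "cut", as in the question's attempt.
    val df = Seq(" ", "foo", "", "bar").toDF("cut")

    df.select(
      $"cut",
      emptyToNull($"cut").as("as_null"),
      emptyToNullString($"cut").as("as_string")
    ).show()

    spark.stop()
  }
}
```

Note that a value that is already null also fails the length check, so emptyToNullString turns existing nulls into the string "NULL" as well.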
Answered by Kapil
Please use the import below to resolve the issue:
import org.apache.spark.sql.functions.trim
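The import on its own does not change any data; trim still has to be applied to the column. One possible way to use it is sketched below. The helper names blankToNull and clean are hypothetical, and the sketch assumes a DataFrame with a string column named cut, as in the question's attempt.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.{col, lit, trim, when}

object TrimExample {
  // Hypothetical helper: null when the trimmed value is empty,
  // otherwise the original (untrimmed) value.
  def blankToNull(c: Column): Column =
    when(trim(c) === "", lit(null)).otherwise(c)

  // Assumption: `df` has a string column called "cut".
  def clean(df: DataFrame): DataFrame =
    df.withColumn("cut", blankToNull(col("cut")))
}
```

To get the string "NULL" instead of a real null, lit(null) can be swapped for lit("NULL").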

