scala - How to use LEFT and RIGHT keywords in Spark SQL
Disclaimer: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/40136922/
Asked by Miruthan
I am new to Spark SQL.
In MS SQL we have the LEFT keyword: LEFT(Columnname, 1) in ('D','A') then 1 else 0.
How can I implement the same in Spark SQL? Kindly guide me.
Answered by zero323
You can use the substring function with a positive pos to take from the left:
import org.apache.spark.sql.functions.substring
substring(column, 0, 1)
and a negative pos to take from the right:
substring(column, -1, 1)
So in Scala you can define:
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.substring
import spark.implicits._  // for toDF and the $"..." column syntax (spark is the SparkSession; available in spark-shell)

// Take the first n characters of the column (LEFT equivalent).
def left(col: Column, n: Int) = {
  assert(n >= 0)
  substring(col, 0, n)
}

// Take the last n characters of the column (RIGHT equivalent).
def right(col: Column, n: Int) = {
  assert(n >= 0)
  substring(col, -n, n)
}

val df = Seq("foobar").toDF("str")

df.select(
  Seq(left _, right _).flatMap(f => (1 to 3).map(i => f($"str", i))): _*
).show
+--------------------+--------------------+--------------------+---------------------+---------------------+---------------------+
|substring(str, 0, 1)|substring(str, 0, 2)|substring(str, 0, 3)|substring(str, -1, 1)|substring(str, -2, 2)|substring(str, -3, 3)|
+--------------------+--------------------+--------------------+---------------------+---------------------+---------------------+
| f| fo| foo| r| ar| bar|
+--------------------+--------------------+--------------------+---------------------+---------------------+---------------------+
Similarly in Python:
from pyspark.sql.functions import substring
from pyspark.sql.column import Column

def left(col, n):
    """Take the first n characters of the column (LEFT equivalent)."""
    assert isinstance(col, (Column, str))
    assert isinstance(n, int) and n >= 0
    return substring(col, 0, n)

def right(col, n):
    """Take the last n characters of the column (RIGHT equivalent)."""
    assert isinstance(col, (Column, str))
    assert isinstance(n, int) and n >= 0
    return substring(col, -n, n)
Answered by Ryan Widmaier
To build upon user6910411's answer, you can also use isin and then build a new column with the result of your character comparison.
The final full code would look something like this:
import org.apache.spark.sql.functions._
df.select(substring($"Columnname", 0, 1) as "ch")
.withColumn("result", when($"ch".isin("D", "A"), 1).otherwise(0))
Answered by Nagesh Singh Chauhan
import org.apache.spark.sql.functions._
Use substring(column, 0, 1) instead of the LEFT function.
where
- 0 : starting position in the string (Spark's substring is 1-based; a pos of 0 is treated the same as 1)
- 1 : number of characters to be selected
Example: consider a LEFT function:
LEFT(upper(SKU),2)
The corresponding Spark SQL statement would be:
substring(upper(SKU),1,2)
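For instance, a minimal sketch of this expression applied through the DataFrame API (the SKU values are made up; selectExpr accepts the same SQL expression string, and spark.implicits._ is assumed to be in scope):

val skus = Seq("ab-123", "cd-456").toDF("SKU")

// substring(upper(SKU), 1, 2) mirrors LEFT(upper(SKU), 2): the first two characters, upper-cased.
skus.selectExpr("substring(upper(SKU), 1, 2) AS sku_prefix").show()
// ab-123 -> AB, cd-456 -> CD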

