
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me). Original source: http://stackoverflow.com/questions/40136922/

Date: 2020-10-22 08:46:01  Source: igfitidea

How to use LEFT and RIGHT keyword in SPARK SQL

Tags: scala, apache-spark, apache-spark-sql

Asked by Miruthan

I am new to spark SQL,


In MS SQL, we have the LEFT keyword, e.g. LEFT(Columnname, 1) IN ('D', 'A') THEN 1 ELSE 0.

How can I implement the same in Spark SQL? Kindly guide me.

Answered by zero323

You can use the substring function with a positive pos to take from the left:

import org.apache.spark.sql.functions.substring

substring(column, 0, 1)

and a negative pos to take from the right:

substring(column, -1, 1)
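Both forms can also be used directly in Spark SQL syntax. A minimal sketch, assuming a registered view named my_table with a string column str (both names are assumptions for illustration):

```sql
SELECT substring(str, 1, 1)  AS first_char,  -- leftmost character
       substring(str, -1, 1) AS last_char    -- rightmost character
FROM my_table
```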

So in Scala you can define:

import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.substring

def left(col: Column, n: Int) = {
  assert(n >= 0)
  substring(col, 0, n)
}

def right(col: Column, n: Int) = {
  assert(n >= 0)
  substring(col, -n, n)
}

val df = Seq("foobar").toDF("str")

df.select(
  Seq(left _, right _).flatMap(f => (1 to 3).map(i => f($"str", i))): _*
).show
+--------------------+--------------------+--------------------+---------------------+---------------------+---------------------+
|substring(str, 0, 1)|substring(str, 0, 2)|substring(str, 0, 3)|substring(str, -1, 1)|substring(str, -2, 2)|substring(str, -3, 3)|
+--------------------+--------------------+--------------------+---------------------+---------------------+---------------------+
|                   f|                  fo|                 foo|                    r|                   ar|                  bar|
+--------------------+--------------------+--------------------+---------------------+---------------------+---------------------+

Similarly in Python:


from pyspark.sql.functions import substring
from pyspark.sql.column import Column

def left(col, n):
    assert isinstance(col, (Column, str))
    assert isinstance(n, int) and n >= 0
    return substring(col, 0, n)

def right(col, n):
    assert isinstance(col, (Column, str))
    assert isinstance(n, int) and n >= 0
    return substring(col, -n, n)
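As a quick sanity check of the intended semantics, without needing a Spark session, plain Python string slicing behaves the same way (this is an illustration of the expected results, not Spark code):

```python
def left(s: str, n: int) -> str:
    # Mirror the Spark helper: first n characters of the string.
    assert n >= 0
    return s[:n]

def right(s: str, n: int) -> str:
    # Mirror the Spark helper: last n characters of the string.
    # Note: s[-n:] would return the whole string for n == 0,
    # so that case is guarded explicitly.
    assert n >= 0
    return s[-n:] if n > 0 else ""

print(left("foobar", 3))   # foo
print(right("foobar", 3))  # bar
```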

Answered by Ryan Widmaier

To build on user6910411's answer, you can also use isin together with when/otherwise to build a new column holding the result of your character comparison.

The final full code would look something like this:

import org.apache.spark.sql.functions._

df.select(substring($"Columnname", 0, 1) as "ch")
    .withColumn("result", when($"ch".isin("D", "A"), 1).otherwise(0))
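If you prefer to express the whole thing as a single Spark SQL expression (for example via selectExpr or spark.sql), the equivalent CASE expression might look like this; my_table is an assumed view name:

```sql
SELECT CASE
         WHEN substring(Columnname, 1, 1) IN ('D', 'A') THEN 1
         ELSE 0
       END AS result
FROM my_table
```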

Answered by Nagesh Singh Chauhan

import org.apache.spark.sql.functions._  

Use substring(column, 0, 1) instead of the LEFT function,

where


  • 0 : starting position in the string (note that Spark's substring is 1-based; a pos of 0 is treated the same as 1)
  • 1 : number of characters to select

Example: consider a LEFT function:

LEFT(upper(SKU),2)

The corresponding Spark SQL statement would be:

substring(upper(SKU),1,2)
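As a side note, more recent Spark releases (2.3 and later, if I recall correctly) also ship built-in left and right SQL functions, so the original expression may work as-is in raw Spark SQL; my_table is an assumed view name:

```sql
SELECT left(upper(SKU), 2) AS sku_prefix
FROM my_table
```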