Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/25157451/

Time: 2020-09-01 02:23:35  Source: igfitidea

SPARK SQL - case when then

sql apache-spark

Asked by user3279189

I'm new to Spark SQL. Is there an equivalent to "CASE WHEN 'CONDITION' THEN 0 ELSE 1 END" in Spark SQL?

select case when 1=1 then 1 else 0 end from table

Thanks, Sridhar

Answered by Spiro Michaylov

Before Spark 1.2.0

The supported syntax (which I just tried out on Spark 1.0.2) seems to be

SELECT IF(1=1, 1, 0) FROM table

This recent thread (http://apache-spark-user-list.1001560.n3.nabble.com/Supported-SQL-syntax-in-Spark-SQL-td9538.html) links to the SQL parser source, which may or may not help depending on your comfort with Scala. At the very least, the list of keywords starting (at the time of writing) on line 70 should help.

Here's the direct link to the source for convenience: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala.

Update for Spark 1.2.0 and beyond

As of Spark 1.2.0, the more traditional syntax is supported, in response to SPARK-3813: search for "CASE WHEN" in the test source. For example:

SELECT CASE WHEN key = 1 THEN 1 ELSE 2 END FROM testData

Update: the most recent place to look up syntax in the SQL parser

The parser source can now be found here.

Update for more complex examples

In response to a question below, the modern syntax supports complex Boolean conditions.

SELECT
    CASE WHEN id = 1 OR id = 2 THEN "OneOrTwo" ELSE "NotOneOrTwo" END AS IdRedux
FROM customer

You can involve multiple columns in the condition.

SELECT
    CASE WHEN id = 1 OR state = 'MA' 
         THEN "OneOrMA" 
         ELSE "NotOneOrMA" END AS IdRedux
FROM customer

You can also nest CASE WHEN THEN expressions.

SELECT
    CASE WHEN id = 1 
         THEN "OneOrMA"
         ELSE
             CASE WHEN state = 'MA' THEN "OneOrMA" ELSE "NotOneOrMA" END
    END AS IdRedux
FROM customer
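
As a side note to the nested form above: because WHEN branches are tested in order, the same labels can be produced by a single CASE with two WHEN branches. The following is a minimal sketch, assuming Spark 2.x or later (for SparkSession and createOrReplaceTempView); the object name, the sample rows, and the extra id and state output columns are hypothetical and only there to make the example runnable.

```scala
import org.apache.spark.sql.SparkSession

object FlattenedCaseWhen {
  // One CASE with two WHEN branches. Branches are evaluated top to bottom,
  // so this yields the same label per row as the nested CASE.
  val flattenedQuery: String =
    """SELECT id, state,
      |       CASE WHEN id = 1 THEN 'OneOrMA'
      |            WHEN state = 'MA' THEN 'OneOrMA'
      |            ELSE 'NotOneOrMA' END AS IdRedux
      |FROM customer""".stripMargin

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("flattened-case-when")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical rows standing in for the customer table.
    Seq((1, "NY"), (2, "MA"), (3, "CA"))
      .toDF("id", "state")
      .createOrReplaceTempView("customer")

    // ids 1 and 2 are labelled 'OneOrMA'; id 3 is labelled 'NotOneOrMA'.
    spark.sql(flattenedQuery).show()

    spark.stop()
  }
}
```

Whether to nest or flatten is mostly a readability choice; nesting is only required when the inner CASE depends on a different structure than a simple ordered list of conditions.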

Answered by Ehud Lev

For Spark 2.+: the Spark when function

From the documentation:

Evaluates a list of conditions and returns one of multiple possible result expressions. If otherwise is not defined at the end, null is returned for unmatched conditions.

 // Example: encoding gender string column into integer.

   // Scala:
   people.select(when(people("gender") === "male", 0)
     .when(people("gender") === "female", 1)
     .otherwise(2))

   // Java:
   people.select(when(col("gender").equalTo("male"), 0)
     .when(col("gender").equalTo("female"), 1)
     .otherwise(2))
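
To illustrate the documentation's remark about a missing otherwise, here is a minimal sketch, assuming Spark 2.x or later run in local mode. The object name, the encodeGender helper, the gender_code alias, and the sample values are all hypothetical; only when, col, and the Column methods come from Spark.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

object WhenWithoutOtherwise {
  // No .otherwise(...) at the end: rows matching neither branch get null.
  def encodeGender(people: DataFrame): DataFrame =
    people.select(
      col("gender"),
      when(col("gender") === "male", 0)
        .when(col("gender") === "female", 1)
        .as("gender_code"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("when-without-otherwise")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data; "unknown" matches no branch.
    val people = Seq("male", "female", "unknown").toDF("gender")

    // The "unknown" row has a null gender_code.
    encodeGender(people).show()

    spark.stop()
  }
}
```

Adding .otherwise(2) before .as("gender_code") would replace that null with 2, as in the documentation example.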

Answered by swapnil shashank

Based on my current production code, this works:

   // when comes from the Spark SQL functions object
   import org.apache.spark.sql.functions.when

   // Score 100 if h_description contains any of the identifier columns, else 0
   val identifierDF = tempIdentifierDF.select(
     tempIdentifierDF("t_item_account_id"),
     when(tempIdentifierDF("h_description").contains(tempIdentifierDF("t_cusip")), 100)
       .when(tempIdentifierDF("h_description").contains(tempIdentifierDF("t_ticker")), 100)
       .when(tempIdentifierDF("h_description").contains(tempIdentifierDF("t_isin")), 100)
       .when(tempIdentifierDF("h_description").contains(tempIdentifierDF("t_sedol")), 100)
       .when(tempIdentifierDF("h_description").contains(tempIdentifierDF("t_valoren")), 100)
       .otherwise(0)
       .alias("identifier_in_description_score")
   )
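
Since every branch above returns the same value, the conditions could also be combined with || into a single when. This is only a sketch of that alternative, not the production code from the answer: it assumes Spark 2.x or later, and the object name, the score helper, and the sample row values are hypothetical. The column names are taken from the answer.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

object IdentifierScore {
  // Identifier columns checked against h_description (names from the answer above).
  val identifierColumns: Seq[String] =
    Seq("t_cusip", "t_ticker", "t_isin", "t_sedol", "t_valoren")

  def score(df: DataFrame): DataFrame = {
    // True when h_description contains at least one of the identifier columns.
    val anyIdentifierInDescription: Column =
      identifierColumns
        .map(name => col("h_description").contains(col(name)))
        .reduce(_ || _)

    df.select(
      col("t_item_account_id"),
      when(anyIdentifierInDescription, 100)
        .otherwise(0)
        .alias("identifier_in_description_score"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("identifier-score")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical rows: the first description contains the ticker value, the second contains nothing.
    val sample = Seq(
      (1, "holding TICK1 common stock", "C1", "TICK1", "I1", "S1", "V1"),
      (2, "unrelated description", "C2", "TICK2", "I2", "S2", "V2")
    ).toDF("t_item_account_id", "h_description",
      "t_cusip", "t_ticker", "t_isin", "t_sedol", "t_valoren")

    score(sample).show()

    spark.stop()
  }
}
```

Keeping the list of identifier columns in one Seq means adding another identifier is a one-line change rather than another chained when.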