Source: Stack Overflow, http://stackoverflow.com/questions/32551919/ (licensed under CC BY-SA 4.0; if you reuse it, you must keep the same license and attribute it to the original authors).

How to use Column.isin with list?

Tags: scala, apache-spark, apache-spark-sql

Asked by Nabegh

val items = List("a", "b", "c")

sqlContext.sql("select c1 from table")
          .filter($"c1".isin(items))
          .collect
          .foreach(println)

The code above throws the following exception.

Exception in thread "main" java.lang.RuntimeException: Unsupported literal type class scala.collection.immutable.$colon$colon List(a, b, c) 
at org.apache.spark.sql.catalyst.expressions.Literal$.apply(literals.scala:49)
at org.apache.spark.sql.functions$.lit(functions.scala:89)
at org.apache.spark.sql.Column$$anonfun$isin.apply(Column.scala:642)
at org.apache.spark.sql.Column$$anonfun$isin.apply(Column.scala:642)
at scala.collection.TraversableLike$$anonfun$map.apply(TraversableLike.scala:245)
at scala.collection.TraversableLike$$anonfun$map.apply(TraversableLike.scala:245)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at org.apache.spark.sql.Column.isin(Column.scala:642)

Below is my attempt to fix it. It compiles and runs but doesn't return any match. Not sure why.

val items = List("a", "b", "c").mkString("\"","\",\"","\"")

sqlContext.sql("select c1 from table")
          .filter($"c1".isin(items))
          .collect
          .foreach(println)

Answered by Niemand

According to the documentation, isin takes a vararg, not a list. The parameter name list is confusing here. You can expand your List into varargs like this:

val items = List("a", "b", "c")

sqlContext.sql("select c1 from table")
          .filter($"c1".isin(items:_*))
          .collect
          .foreach(println)

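Your variant with mkString compiles because a single String is also a valid vararg call (with exactly one argument), but it is probably not what you want: isin then compares c1 against the one literal string "a","b","c", quotes and commas included, so no row matches.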
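
The difference can be seen without a Spark session. Below is a minimal plain-Scala sketch; countArgs is a hypothetical stand-in that only mirrors the varargs shape of isin (Any*) and is not part of Spark.

```scala
object VarargsDemo {
  // Hypothetical stand-in with the same varargs shape as Column.isin(list: Any*).
  // It only reports how many arguments it received.
  def countArgs(values: Any*): Int = values.length

  val items: List[String] = List("a", "b", "c")

  // The mkString variant from the question builds one single String.
  val joined: String = items.mkString("\"", "\",\"", "\"")

  def main(args: Array[String]): Unit = {
    // Passing the List directly: the whole list is one argument.
    println(countArgs(items))

    // Expanding with `: _*`: each element becomes its own argument.
    println(countArgs(items: _*))

    // The joined String is again one argument.
    println(countArgs(joined))
    println(joined)
  }
}
```

In the first and third calls the function sees a single value, which is why Spark either rejects the List as an unsupported literal or silently compares against one long string.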

Answered by Anandkumar

With the Java API (Java 8), it worked like this:

.isin(sampleListName.stream().toArray(String[]::new))

Here sampleListName is a List of String values.

Answered by Francis Toth

As Tomalak mentioned:

isin(java.lang.Object... list)
A boolean expression that is evaluated to true if the value 
of this expression is contained by the evaluated values of the arguments.

Therefore, you could fix this by making the following change:

val items = List("a", "b", "c").map(c => s""""$c"""")

Answered by Lucas Lima

Since 2.4.0, Spark has a method called isInCollection, which takes a collection directly and is just what you are looking for, instead of isin.

(shouldn't they unify the methods?)
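
A minimal sketch of how this might look, assuming Spark 2.4.0 or later on the classpath. The local session, the sample rows, and the helper name keepListed are assumptions made for illustration and stand in for the question's table.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object IsInCollectionDemo {
  // Hypothetical helper: keep rows whose c1 value appears in `items`.
  // isInCollection accepts any Iterable, so no `: _*` expansion is needed.
  def keepListed(df: DataFrame, items: Iterable[String]): DataFrame =
    df.filter(col("c1").isInCollection(items))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, used only for this illustration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("isInCollection-demo")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data standing in for "select c1 from table".
    val df = Seq("a", "b", "x").toDF("c1")
    val items = List("a", "b", "c")

    keepListed(df, items).show()

    // The varargs form, which also works on versions before 2.4.0.
    df.filter(col("c1").isin(items: _*)).show()

    spark.stop()
  }
}
```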

Answered by Pedro H

Even easier:

sqlContext.sql("select c1 from table")
          .filter($"c1".isin("a", "b", "c"))
          .collect
          .foreach(println)

This works unless you have a lot of list values, which usually isn't the case.
