I am trying to apply an IN clause to a Spark DataFrame:
scala> val filteredDF = resultDF.select("role_id","role","full_name").filter(upper(resultDF("role")).isin(List("DIRECTOR","ACTOR")) )
When I run the command above, I get this error:
java.lang.RuntimeException: Unsupported literal type class scala.collection.immutable.$colon$colon List(DIRECTOR, ACTOR)
at org.apache.spark.sql.catalyst.expressions.Literal$.apply(literals.scala:49)
at org.apache.spark.sql.functions$.lit(functions.scala:89)
at org.apache.spark.sql.Column$$anonfun$isin$1.apply(Column.scala:642)
at org.apache.spark.sql.Column$$anonfun$isin$1.apply(Column.scala:642)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at org.apache.spark.sql.Column.isin(Column.scala:642)
Can someone explain why I am getting this error and how to fix it?
You need to pass the values as separate arguments to isin:
.isin("DIRECTOR", "ACTOR")
Or use the varargs syntax:
.isin(List("DIRECTOR", "ACTOR"): _*)
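As for why the error occurs: isin is declared with a varargs parameter (list: Any*), so a List passed directly arrives as one single value. Judging by the stack trace, Spark then calls lit on that value and fails because a Scala List is not a supported literal type. The sketch below does not use Spark; countArgs is a hypothetical stand-in for any varargs method, used only to show how Scala treats the three call forms.

```scala
// Hypothetical stand-in for a varargs method such as Column.isin(list: Any*).
// It only reports how many arguments it received.
object VarargsDemo {
  def countArgs(values: Any*): Int = values.length

  def main(args: Array[String]): Unit = {
    val roles = List("DIRECTOR", "ACTOR")

    // The whole List is passed as one argument of type List[String].
    println(countArgs(roles))               // prints 1

    // ": _*" expands the List into individual arguments.
    println(countArgs(roles: _*))           // prints 2

    // Equivalent to writing the values out by hand.
    println(countArgs("DIRECTOR", "ACTOR")) // prints 2
  }
}
```

The first call mirrors the failing isin(List(...)) in the question, and the other two mirror the fixes above.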