Scala 数据帧中的 rlike 给出了错误



我正在尝试将下面的Hive SQL语句转换为Spark数据帧,但出现错误。

case when (lower(message_txt) rlike '.*sampletext(\s?is\s?)newtext.*' ) then 'P' else 'Y'

示例数据:message_txt = "This is new sampletext, followed by newtext"

请帮助我提供等效的火花数据帧语句。

使用when(lower($"value").rlike(""".sampletext(siss?)newtext."""),lit('P')).otherwise("Y")

scala> df.withColumn("condition",when(lower($"value").rlike(""".sampletext(s?iss?)newtext."""),lit('P')).otherwise("Y")).show(false)
+-------------------------------------------+---------+
|value                                      |condition|
+-------------------------------------------+---------+
|This is new sampletext, followed by newtext|Y        |
+-------------------------------------------+---------+

在sql 中case statement末尾添加end


Example:

In spark Sql:

val df=Seq(("This is new sampletext, followed by newtext")).toDF("message_txt")
df.createOrReplaceTempView("tmp")
spark.sql("select case when (lower(message_txt) rlike '.sampletext(\s?is\s?)newtext.' ) then 'P' else 'Y' end from tmp").show()
//Result
//+--------------------------------------------------------------------------------+
//|CASE WHEN lower(message_txt) RLIKE .sampletext(s?iss?)newtext. THEN P ELSE Y END|
//+--------------------------------------------------------------------------------+
//|                                                                               Y|
//+--------------------------------------------------------------------------------+

In dataframe API:

df.withColumn("status", when(lower(col("message_txt")).rlike(".sampletext(\s?is\s?)newtext."),"P").otherwise("Y")).show()
//Result
//+--------------------+------+
//|         message_txt|status|
//+--------------------+------+
//|This is new sampl...|     Y|
//+--------------------+------+

UPDATE:

检查message_txt列中的字符串示例文本和新文本。

//using rlike
df.withColumn("status", when(lower(col("message_txt")).rlike("sampletext.*newtext"),"P").otherwise("Y")).show()
//using like
df.withColumn("status", when(lower(col("message_txt")).like("%sampletext%newtext%"),"P").otherwise("Y")).show()
//+--------------------+------+
//|         message_txt|status|
//+--------------------+------+
//|This is new sampl...|     P|
//+--------------------+------+

最新更新