I'm using
val df5 = spark.sql("select str_to_map('fruits=banana|sports=football','\|','=') as json_temp")
but the output is not what I expected:
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|json_temp |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|[f ->, r ->, u ->, i ->, t ->, s ->, -> , b ->, a ->, n ->, a ->, n ->, a ->, | ->, s ->, p ->, o ->, r ->, t ->, s ->, -> , f ->, o ->, o ->, t ->, b ->, a ->, l ->, l ->, ->]|
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
The pair delimiter and the key-value delimiter are not being applied.
I want the output to look like:
fruits -> banana, sports -> football
In Scala, escape the pipe with two backslashes in the string literal:
spark.sql("select str_to_map('fruits=banana|sports=football','\\|','=') as json_temp").show(false)
+--------------------------------------+
|json_temp |
+--------------------------------------+
|[fruits -> banana, sports -> football]|
+--------------------------------------+
Or use triple quotes with a single backslash:
spark.sql("""select str_to_map('fruits=banana|sports=football','\|','=') as json_temp""").show(false)
+--------------------------------------+
|json_temp |
+--------------------------------------+
|[fruits -> banana, sports -> football]|
+--------------------------------------+
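For context: str_to_map treats both delimiters as Java regular expressions, and an unescaped | is the regex alternation operator. It matches the empty string at every position, which is why the unescaped version splits between every single character. A minimal Java sketch of the same split behavior (java.lang.String.split takes a regex, just like the delimiters here; class and variable names are illustrative):

```java
public class PipeSplitDemo {
    public static void main(String[] args) {
        String s = "fruits=banana|sports=football";

        // Unescaped "|" is an alternation of two empty patterns: it matches
        // the empty string at every position, so the split breaks between
        // every single character -- the behavior seen in the question.
        String[] bad = s.split("|");
        System.out.println(bad.length); // one element per character

        // "\\|" escapes the pipe, so it is matched literally and the
        // string splits into the two key=value pairs.
        String[] good = s.split("\\|");
        System.out.println(good[0]);
        System.out.println(good[1]);
    }
}
```

The same escaping logic applies to the key-value delimiter: '=' happens to have no special meaning in a regex, so it needs no escaping.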
The pipe character needs to be escaped because the delimiter is treated as a regular expression. Alternatively, you can put it inside a character class []:
val df5 = spark.sql(
  "select str_to_map('fruits=banana|sports=football','[|]','=') as json_temp"
)
//+--------------------------------------+
//| json_temp |
//+--------------------------------------+
//|[fruits -> banana, sports -> football]|
//+--------------------------------------+
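The character-class trick can be checked outside Spark as well: inside [...] the pipe loses its special meaning and matches a literal |. A small Java sketch (names are illustrative):

```java
public class CharClassSplitDemo {
    public static void main(String[] args) {
        String s = "fruits=banana|sports=football";

        // Inside a character class the pipe is an ordinary character,
        // so no backslash escaping is needed.
        String[] parts = s.split("[|]");
        System.out.println(parts[0]); // fruits=banana
        System.out.println(parts[1]); // sports=football
    }
}
```

This form avoids the double-escaping question entirely, since [|] survives both the Scala string literal and the SQL string literal unchanged.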