Spark SQL: string-to-timestamp conversion: value changes to null

I have a problem with Spark SQL: when I cast a column from string to timestamp, the value becomes null. Here are the details:

val df2 = sql("""select FROM_UNIXTIME(UNIX_TIMESTAMP(to_date(LAST_DAY(ADD_MONTHS(CONCAT_WS('-','2018','10','01'),0))),'yyyy-MM-dd'),'yyyyMMdd HH:mm:ss')""")
df2: org.apache.spark.sql.DataFrame = [from_unixtime(unix_timestamp(to_date(last_day(add_months(CAST(concat_ws(-, 2018, 10, 01) AS DATE), 0))), yyyy-MM-dd), yyyyMMdd HH:mm:ss): string]

scala> df2.show
+----------------------------------------------------------------------------------------------------------------------------------------+
|from_unixtime(unix_timestamp(to_date(last_day(add_months(CAST(concat_ws(-, 2018, 10, 01) AS DATE), 0))), yyyy-MM-dd), yyyyMMdd HH:mm:ss)|
+----------------------------------------------------------------------------------------------------------------------------------------+
|                                                                                                                       20181001 00:00:00|
+----------------------------------------------------------------------------------------------------------------------------------------+

When I explicitly cast to timestamp, it does not give me the desired result:

val df2 = sql("""select cast(FROM_UNIXTIME(UNIX_TIMESTAMP(to_date(LAST_DAY(ADD_MONTHS(CONCAT_WS('-','2018','10','01'),0))),'yyyy-MM-dd'),'yyyyMMdd HH:mm:ss') as timestamp)""")
df2: org.apache.spark.sql.DataFrame = [CAST(from_unixtime(unix_timestamp(to_date(last_day(add_months(CAST(concat_ws(-, 2018, 10, 01) AS DATE), 0))), yyyy-MM-dd), yyyyMMdd HH:mm:ss) AS TIMESTAMP): timestamp]

scala> df2.show
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
|CAST(from_unixtime(unix_timestamp(to_date(last_day(add_months(CAST(concat_ws(-, 2018, 10, 01) AS DATE), 0))), yyyy-MM-dd), yyyyMMdd HH:mm:ss) AS TIMESTAMP)|
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                                                                                                       null|
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+

Any ideas on how to solve this?

Try the following:

val df2 = spark.sql(
      """select CAST(unix_timestamp(FROM_UNIXTIME(UNIX_TIMESTAMP(to_date(LAST_DAY(ADD_MONTHS(CONCAT_WS('-','2018','10','01'),0))),'yyyy-MM-dd'),'yyyyMMdd HH:mm:ss'),'yyyyMMdd HH:mm:ss') as timestamp) as destination""".stripMargin)
df2.show(false)
df2.printSchema()
+-------------------+
|destination        |
+-------------------+
|2018-10-31 00:00:00|
+-------------------+
root
 |-- destination: timestamp (nullable = true)
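
Why the cast in the question returns null: CAST(string AS TIMESTAMP) only understands Spark's default ISO-like layout (yyyy-MM-dd HH:mm:ss), so the reformatted string '20181001 00:00:00' cannot be parsed and the cast silently yields null. Re-parsing the string with unix_timestamp and an explicit 'yyyyMMdd HH:mm:ss' pattern produces epoch seconds, and a bigint of seconds casts cleanly to a timestamp. A minimal sketch in the spark-shell (the commented results are what I would expect, not copied from a run):

spark.sql("select cast('2018-10-31 00:00:00' as timestamp) as ok").show(false)
// 2018-10-31 00:00:00 -- the default layout parses

spark.sql("select cast('20181031 00:00:00' as timestamp) as ko").show(false)
// null -- the custom layout does not match the default pattern

spark.sql("select cast(unix_timestamp('20181031 00:00:00','yyyyMMdd HH:mm:ss') as timestamp) as ts").show(false)
// 2018-10-31 00:00:00 -- explicit pattern first, then cast from epoch seconds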

I also tried it this way, without using any Spark internals:

val df2 = sql("""select cast(FROM_UNIXTIME(UNIX_TIMESTAMP(cast(LAST_DAY(ADD_MONTHS(CONCAT_WS('-','2018','12','31'),0)) as timestamp))) as timestamp)""")
scala> df2.show
+--------------------+
|2018-12-31 00:00:...|
+--------------------+
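
For completeness, the round-trip through a formatted string is not needed at all if the goal is just a timestamp at midnight on the last day of the month: a date casts directly to a timestamp. A sketch with the DataFrame API (the column name destination is reused from the answer above purely for illustration):

import org.apache.spark.sql.functions._

// Build '2018-10-01' as a date, take the last day of its month,
// and cast the resulting date straight to a timestamp.
val df3 = spark.range(1).select(
  last_day(add_months(lit("2018-10-01").cast("date"), 0))
    .cast("timestamp")
    .as("destination"))
df3.show(false)
// 2018-10-31 00:00:00
df3.printSchema()
// destination: timestamp (nullable = true)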
