import sparkSession.sqlContext.implicits._
val df = Seq(("2014-10-06"), ("2014-10-07"), ("2014-10-08"), ("2014-10-09"), ("2014-10-10")).toDF("DATE")
df.printSchema()
import org.apache.spark.sql.functions.{col, to_date}
val df2 = df.withColumn("DATE", to_date(col("DATE"), "yyyy-MM-dd"))
df2.printSchema()
import org.apache.spark.sql.SaveMode
df2.write.mode(SaveMode.Overwrite).parquet("C:\\TEMP\\")
root
 |-- DATE: string (nullable = true)

root
 |-- DATE: date (nullable = true)
In the code above I am able to convert the DATE column from string to date type, but when I open the output Parquet file I get the following error:

Parquet exception: fatal error reading column 'DATE' System.ArgumentException: The UTC Offset of the local dateTime parameter does not match the offset argument.

Can anyone help me solve this?
I am not able to reproduce this - try writing and reading the same data back:
val df1 = Seq(("2014-10-06"), ("2014-10-07"), ("2014-10-08"), ("2014-10-09"), ("2014-10-10")).toDF("DATE")
df1.printSchema()
/**
* root
* |-- DATE: string (nullable = true)
*/
import org.apache.spark.sql.functions.{col, to_date}
val df2 = df1.withColumn("DATE", to_date(col("DATE"), "yyyy-MM-dd"))
df2.printSchema()
/**
* root
* |-- DATE: date (nullable = true)
*/
df2.write.mode(SaveMode.Overwrite).parquet("/Users/sokale/models/stack")
spark.read.parquet("/Users/sokale/models/stack").show(false)
/**
* +----------+
* |DATE |
* +----------+
* |2014-10-08|
* |2014-10-09|
* |2014-10-10|
* |2014-10-06|
* |2014-10-07|
* +----------+
*/
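Since the dates read back correctly in Spark, the file itself appears fine and the error likely comes from the external viewer. As a further sanity check (a sketch using the same path as above; `spark` is the active `SparkSession`), the column can be used directly in date comparisons and date arithmetic, which would fail if it were still a string:

```scala
import org.apache.spark.sql.functions.{col, date_add, lit}

// Read the Parquet file written above and confirm DATE behaves as a true date type.
val stack = spark.read.parquet("/Users/sokale/models/stack")
stack
  .filter(col("DATE") >= lit("2014-10-08"))          // date comparison against a literal
  .withColumn("NEXT_DAY", date_add(col("DATE"), 1))  // date arithmetic: add one day
  .show(false)
```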