What is the Apache spark.sql.types.DataTypes equivalent of java.time.LocalDate?



I wrote a Java POJO class that contains a java.time.LocalDate member variable named date:

import java.io.Serializable;
import java.time.LocalDate;

import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

import com.fasterxml.jackson.annotation.JsonFormat;
import com.fasterxml.jackson.databind.annotation.JsonDeserialize;
import com.fasterxml.jackson.datatype.jsr310.deser.LocalDateDeserializer;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

@Data
@AllArgsConstructor
@NoArgsConstructor
public class EntityMySQL implements Serializable {

    @JsonFormat(pattern = "yyyy-MM-dd")
    @JsonDeserialize(using = LocalDateDeserializer.class)
    private LocalDate date;

    private float value;

    private String id;

    private String title;

    // Spark schema for this entity
    private static StructType structType = DataTypes.createStructType(new StructField[] {
            DataTypes.createStructField("date", DataTypes.DateType, false),   // this line throws Exception
            DataTypes.createStructField("value", DataTypes.FloatType, false),
            DataTypes.createStructField("id", DataTypes.StringType, false),
            DataTypes.createStructField("title", DataTypes.StringType, false)
    });
}

As you can see, the date member variable is of type java.time.LocalDate, but in the static structType variable I set the type of the date field to DataTypes.DateType. When I bind the POJO class to a Spark DataFrame, it throws the following error:

Caused by: java.lang.RuntimeException: java.time.LocalDate is not a valid external type for schema of date
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.StaticInvoke_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.writeFields_0_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$Serializer.apply(ExpressionEncoder.scala:210)

When I change the date member variable to java.util.Date, DataTypes.DateType is the correct configuration for Spark and there is no error. But with java.time.LocalDate the code does not work and throws the exception above. If I have to create a custom date type, please tell me how. Any ideas?
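
(For context, the binding code itself is not shown in the question. The following is only a sketch of how such a binding typically looks and where the exception surfaces; the SparkSession, the record list, and the helper method name are assumptions, not part of the original code.)

import java.util.List;
import java.util.stream.Collectors;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.StructType;

public class BindExample {
    // Hypothetical reproduction -- the actual binding code is not shown in the question.
    static Dataset<Row> bind(SparkSession spark, List<EntityMySQL> records, StructType schema) {
        List<Row> rows = records.stream()
                // The LocalDate is passed through as-is; Spark 2.x rejects it for DateType.
                .map(e -> RowFactory.create(e.getDate(), e.getValue(), e.getId(), e.getTitle()))
                .collect(Collectors.toList());
        // Throws: java.time.LocalDate is not a valid external type for schema of date
        return spark.createDataFrame(rows, schema);
    }
}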

java.time.LocalDate is not supported as an external type by Spark up until Spark 3.0; even if you try to write an encoder for it as a Java date type, it will not work.

I suggest you convert java.time.LocalDate to another supported type, such as java.sql.Timestamp, java.sql.Date, an epoch value, or a date-time string.
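
A minimal sketch of that conversion, assuming the Rows are built by hand against the schema as above: java.sql.Date.valueOf converts a LocalDate directly, and the class, method, and variable names here are illustrative only.

import java.sql.Date;
import java.util.List;
import java.util.stream.Collectors;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.StructType;

public class BindWithConversion {
    // Sketch of the suggested workaround: keep java.time.LocalDate in the POJO,
    // but hand Spark a java.sql.Date, which is a valid external type for DateType.
    static Dataset<Row> bind(SparkSession spark, List<EntityMySQL> records, StructType schema) {
        List<Row> rows = records.stream()
                .map(e -> RowFactory.create(
                        Date.valueOf(e.getDate()),   // LocalDate -> java.sql.Date
                        e.getValue(), e.getId(), e.getTitle()))
                .collect(Collectors.toList());
        return spark.createDataFrame(rows, schema);
    }
}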
