I developed a Java POJO class that contains a java.time.LocalDate member variable named date.
import java.io.Serializable;
import java.time.LocalDate;

import com.fasterxml.jackson.annotation.JsonFormat;
import com.fasterxml.jackson.databind.annotation.JsonDeserialize;
import com.fasterxml.jackson.datatype.jsr310.deser.LocalDateDeserializer;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

@Data
@AllArgsConstructor
@NoArgsConstructor
public class EntityMySQL implements Serializable {
    @JsonFormat(pattern = "yyyy-MM-dd")
    @JsonDeserialize(using = LocalDateDeserializer.class)
    private LocalDate date;
    private float value;
    private String id;
    private String title;

    private static StructType structType = DataTypes.createStructType(new StructField[] {
        DataTypes.createStructField("date", DataTypes.DateType, false), // this line throws Exception
        DataTypes.createStructField("value", DataTypes.FloatType, false),
        DataTypes.createStructField("id", DataTypes.StringType, false),
        DataTypes.createStructField("title", DataTypes.StringType, false)
    });
}
As you can see, the date member variable's type is java.time.LocalDate. But in the static structType variable, I set the type of date to DataTypes.DateType. When I bind the POJO class to a Spark DataFrame, it throws the following error:
Caused by: java.lang.RuntimeException: java.time.LocalDate is not a valid external type for schema of date
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.StaticInvoke_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.writeFields_0_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$Serializer.apply(ExpressionEncoder.scala:210)
When I set the date member variable to java.util.Date, Spark's DataTypes.DateType is the correct configuration and there is no error. But with java.time.LocalDate, the code does not work and throws the exception above. If I have to create a custom date type, please tell me how. Any ideas?
Spark does not support java.time.LocalDate here; even if you try to write an encoder for the Java date type, it will not work.
I suggest you convert the java.time.LocalDate to another supported type, such as java.sql.Timestamp, java.sql.Date, an epoch value, or a date-time string.
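A minimal sketch of that conversion, assuming you adapt the POJO (or a mapping step) to expose java.sql.Date, which lines up with Spark's DataTypes.DateType; the class and method names below are illustrative, not from the question:

```java
import java.sql.Date;
import java.time.LocalDate;

public class LocalDateConversion {
    // java.sql.Date.valueOf(LocalDate) produces the java.sql.Date
    // that Spark accepts as the external type for DateType columns.
    static Date toSqlDate(LocalDate localDate) {
        return Date.valueOf(localDate);
    }

    public static void main(String[] args) {
        LocalDate localDate = LocalDate.of(2020, 1, 15);
        Date sqlDate = toSqlDate(localDate);
        System.out.println(sqlDate); // prints 2020-01-15
    }
}
```

Applying this in a setter or in the mapping code that builds rows keeps the rest of the schema definition unchanged.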