"Grows beyond 64 KB" error when using when/otherwise

When I run this Spark code in Scala:

df.withColumn(x, when(col(x).isin(values:_*),col(x)).otherwise(lit(null).cast(StringType)))

I get this error:

java.lang.RuntimeException: Compiling "GeneratedClass": Code of method
"apply(Lorg/apache/spark/sql/catalyst/InternalRow;)Lorg/apache/spark/sql/catalyst/expressions/UnsafeRow;"
of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection"
grows beyond 64 KB
at org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:361)
at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:234)

df: a Spark Dataset

x: a StringType column; each row looks like ";USA,Washington,Seattle;"

values: Array[String]

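This is a known issue: the Java code that Spark generates for the query grows until a single method exceeds the JVM's 64 KB bytecode limit. A common workaround is to add a checkpoint, that is, save the DataFrame and read it back, so that later transformations start from a short plan.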
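The save-and-reload workaround could be sketched roughly as follows. This is only a minimal illustration, not a confirmed fix for this exact case: the helper name saveAndReload, the path /tmp/df_checkpoint, the Parquet format, and the sample rows are all assumptions made for the example.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit, when}
import org.apache.spark.sql.types.StringType

object CheckpointSketch {
  // Hypothetical helper: write the DataFrame to Parquet and read it back.
  // The reloaded DataFrame's plan is just a file scan, so the lineage of
  // earlier transformations no longer feeds into code generation.
  def saveAndReload(spark: SparkSession, df: DataFrame, path: String): DataFrame = {
    df.write.mode("overwrite").parquet(path)
    spark.read.parquet(path)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("checkpoint-sketch")
      .getOrCreate()
    import spark.implicits._

    // Sample data standing in for the real df, x and values (assumed).
    val x = "x"
    val df = Seq(";USA,Washington,Seattle;", ";something else;").toDF(x)
    val values = Array(";USA,Washington,Seattle;")

    // Cut the lineage first, then apply the expression from the question.
    val reloaded = saveAndReload(spark, df, "/tmp/df_checkpoint") // assumed path
    val result = reloaded.withColumn(
      x,
      when(col(x).isin(values: _*), col(x)).otherwise(lit(null).cast(StringType))
    )

    result.show(false)
    spark.stop()
  }
}
```

Dataset.checkpoint() (after calling spark.sparkContext.setCheckpointDir) or Dataset.localCheckpoint() truncate the plan in a similar way without managing an output path by hand. Whether either variant is enough depends on where the oversized method comes from; it helps most when the plan has grown through a long chain of earlier transformations.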

For more details, see: Apache Spark Codegen Stage grows beyond 64 KB
