Why is there no "OutOfMemoryError: Java heap space" stack trace in the worker logs?



I have a Spark job that fails with a GC / heap space error. When I check the terminal I can see the stack trace:

Caused by: org.spark_project.guava.util.concurrent.ExecutionError: java.lang.OutOfMemoryError: Java heap space
    at org.spark_project.guava.cache.LocalCache$Segment.get(LocalCache.java:2261)
    at org.spark_project.guava.cache.LocalCache.get(LocalCache.java:4000)
    at org.spark_project.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
    at org.spark_project.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
    at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:890)
    at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:357)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
    at org.apache.spark.sql.execution.exchange.ShuffleExchange.prepareShuffleDependency(ShuffleExchange.scala:85)
    at org.apache.spark.sql.execution.exchange.ShuffleExchange$$anonfun$doExecute$1.apply(ShuffleExchange.scala:121)
    at org.apache.spark.sql.execution.exchange.ShuffleExchange$$anonfun$doExecute$1.apply(ShuffleExchange.scala:112)
    at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:52)
    ... 77 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at java.util.HashMap.resize(HashMap.java:703)
    at java.util.HashMap.putVal(HashMap.java:628)
    at java.util.HashMap.putMapEntries(HashMap.java:514)
    at java.util.HashMap.putAll(HashMap.java:784)
    at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:3073)
    at org.codehaus.janino.UnitCompiler.access$4900(UnitCompiler.java:206)
    at org.codehaus.janino.UnitCompiler$8.visitLocalVariableDeclarationStatement(UnitCompiler.java:2958)
    at org.codehaus.janino.UnitCompiler$8.visitLocalVariableDeclarationStatement(UnitCompiler.java:2926)
    at org.codehaus.janino.Java$LocalVariableDeclarationStatement.accept(Java.java:2974)
    at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:2925)
    at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:3033)
    at org.codehaus.janino.UnitCompiler.access$4400(UnitCompiler.java:206)
    at org.codehaus.janino.UnitCompiler$8.visitSwitchStatement(UnitCompiler.java:2950)
    at org.codehaus.janino.UnitCompiler$8.visitSwitchStatement(UnitCompiler.java:2926)
    at org.codehaus.janino.Java$SwitchStatement.accept(Java.java:2866)
    at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:2925)
    at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:2982)
    at org.codehaus.janino.UnitCompiler.access$3800(UnitCompiler.java:206)
    at org.codehaus.janino.UnitCompiler$8.visitBlock(UnitCompiler.java:2944)
    at org.codehaus.janino.UnitCompiler$8.visitBlock(UnitCompiler.java:2926)
    at org.codehaus.janino.Java$Block.accept(Java.java:2471)
    at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:2925)
    at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:2999)
    at org.codehaus.janino.UnitCompiler.access$4000(UnitCompiler.java:206)
    at org.codehaus.janino.UnitCompiler$8.visitForStatement(UnitCompiler.java:2946)
    at org.codehaus.janino.UnitCompiler$8.visitForStatement(UnitCompiler.java:2926)
    at org.codehaus.janino.Java$ForStatement.accept(Java.java:2660)
    at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:2925)
    at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:2982)
    at org.codehaus.janino.UnitCompiler.access$3800(UnitCompiler.java:206)
    at org.codehaus.janino.UnitCompiler$8.visitBlock(UnitCompiler.java:2944)
    at org.codehaus.janino.UnitCompiler$8.visitBlock(UnitCompiler.java:2926)

The problem is that the stack trace does not appear in any of the worker logs (stdout and stderr) that I check through the web UI or by inspecting the files on disk directly.

I do have one failed executor on the application, and it only shows (stdout):

17:12:17,008 ERROR [TransportResponseHandler] Still have 1 requests outstanding when connection from /<IP1>:35482 is closed
17:12:17,010 ERROR [CoarseGrainedExecutorBackend] Executor self-exiting due to : Driver <IP1>:35482 disassociated! Shutting down.

The stderr file is empty.

This is a big problem for me, because I don't always see the whole log/stack trace in the console, and I am looking for something reliable/persistent.

org.codehaus.janino is used for whole-stage Java code generation (see the line with org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute in the stack trace), which happens on the driver as part of query optimization (before the RDDs are ready for execution).
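To see where this compilation actually runs, you can ask Spark to print the Java source that whole-stage codegen produces for a query; both the code generation and the Janino compilation (the CodeGenerator.compile call in the stack trace) happen inside the driver JVM, so an OutOfMemoryError raised there surfaces in the driver's output. A minimal sketch, assuming Spark 2.x and a local master (the query itself is only a placeholder):

import org.apache.spark.sql.SparkSession
// debugCodegen() comes from the debug package object in Spark 2.x
import org.apache.spark.sql.execution.debug._

object CodegenOnDriver {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("codegen-on-driver")
      .master("local[*]")   // placeholder; use your real master in a cluster
      .getOrCreate()

    // Replace this trivial query with the one that triggers the OOM.
    val df = spark.range(0, 1000).toDF("id").selectExpr("id * 2 AS doubled")

    // Prints the generated Java source for every WholeStageCodegen subtree.
    // Generation and compilation run in this JVM -- the driver -- before any
    // task is shipped to an executor.
    df.debugCodegen()

    spark.stop()
  }
}

Even if the job dies before anything like this can run, the same conclusion follows from the stack trace itself: CodeGenerator.compile sits under WholeStageCodegenExec.doExecute, which the driver invokes while preparing the shuffle dependency.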

"The problem is that the stack trace does not appear in any of the worker logs (stdout and stderr) that I check through the web UI or by inspecting the files on disk directly."

There should be no stack trace in any of the workers' logs, because nothing had been submitted for execution on the executors (and hence on the workers) yet. The job failed before any executor could have run it.
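Consequently the stack trace lives in the driver's log (the launching terminal in client mode), and the fixes are driver-side as well. A hedged sketch of the usual mitigations, assuming Spark 2.x (the memory size is a placeholder, not a recommendation):

import org.apache.spark.sql.SparkSession

object DriverSideMitigations {
  def main(args: Array[String]): Unit = {
    // The driver heap must be sized before the driver JVM starts, e.g.
    //   spark-submit --driver-memory 4g ...
    // or spark.driver.memory in spark-defaults.conf; setting it from code
    // in client mode is too late to take effect.
    val spark = SparkSession.builder()
      .appName("driver-side-mitigations")
      .getOrCreate()

    // If the generated code itself is what exhausts the heap, whole-stage
    // codegen can be disabled for the session as a workaround; Spark then
    // falls back to the non-codegen (iterator-based) execution path.
    spark.conf.set("spark.sql.codegen.wholeStage", "false")

    // ... run the failing query here ...

    spark.stop()
  }
}

As for keeping the trace around: in client mode the driver's output is just the terminal, so redirecting spark-submit's output to a file (or adding a file appender for the driver in conf/log4j.properties) is the simplest persistent option; in cluster mode the trace ends up in the driver's own log on the cluster rather than on any worker running executors.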
