Apache Pig - ERROR 2229: Couldn't find matching uid -1 for project



I get the following exception when running a Pig script.

ERROR 2229: Couldn't find matching uid -1 for project (Name: Project Type: bytearray Uid: -1 Input: 0 Column: 12)

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2000: Error processing rule ColumnMapKeyPrune. Try -t ColumnMapKeyPrune
    at org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:122)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:274)
    at org.apache.pig.PigServer.compilePp(PigServer.java:1324)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1249)
    at org.apache.pig.PigServer.execute(PigServer.java:1241)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:335)
    at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:137)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
    at org.apache.pig.Main.run(Main.java:604)
    at org.apache.pig.Main.main(Main.java:157)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2229: Couldn't find matching uid -1 for project (Name: Project Type: bytearray Uid: -1 Input: 0 Column: 12)
    at org.apache.pig.newplan.logical.optimizer.ProjectionPatcher$ProjectionRewriter.visit(ProjectionPatcher.java:91)
    at org.apache.pig.newplan.logical.expression.ProjectExpression.accept(ProjectExpression.java:207)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
    at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
    at org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visitAll(AllExpressionVisitor.java:72)
    at org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:95)
    at org.apache.pig.newplan.logical.relational.LOJoin.accept(LOJoin.java:174)
    at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
    at org.apache.pig.newplan.logical.optimizer.ProjectionPatcher.transformed(ProjectionPatcher.java:48)
    at org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:113)
    ... 16 more

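What could be causing this? I have looked at the expanded and substituted version of the script, and I don't see anything wrong from a syntax point of view.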

This is a bug in the Pig optimizer in version 0.11.1 (CDH 4.3). It appears to be related to an attempt to optimize a script of the following simplified form:

LOAD A  -- Primary Driver Table
LOAD B
LOAD C
J1 = JOIN A LEFT, B
J2 = JOIN J1 LEFT, C
LOAD D
J3 = JOIN J2, D -- Inner Join

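Ideally, if A were joined with D earlier, less data would flow through joins J1 and J2, which would speed things up.

My guess is that this optimization attempt fails.

One way to get rid of the error is to work out how to "hoist" join J3 (the inner join) so that it happens earlier in the script.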
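
As a rough illustration of that reordering, here is a minimal Pig Latin sketch. The original script is not shown, so the file names, the PigStorage delimiter, and the column names (id, a_val, and so on) are all hypothetical, and it assumes that the join with D uses a key that comes from A. If the key for D comes from B or C instead, this rewrite would not return the same rows.

```pig
-- Hypothetical paths and schemas; the original script is not shown.
-- Assumes the key used to join D is a column of A.
A = LOAD 'a.csv' USING PigStorage(',') AS (id:chararray, a_val:chararray);  -- primary driver table
B = LOAD 'b.csv' USING PigStorage(',') AS (id:chararray, b_val:chararray);
C = LOAD 'c.csv' USING PigStorage(',') AS (id:chararray, c_val:chararray);
D = LOAD 'd.csv' USING PigStorage(',') AS (id:chararray, d_val:chararray);

-- Inner join first (the former J3), so only rows of A with a match in D continue.
AD = JOIN A BY id, D BY id;

-- $0 is A's id: a join lists the left relation's fields first.
J1 = JOIN AD BY $0 LEFT OUTER, B BY id;
J2 = JOIN J1 BY $0 LEFT OUTER, C BY id;

DUMP J2;
```

The output columns come out in a different order than in the original script (D's fields now precede those of B and C), so a final FOREACH ... GENERATE may be needed to restore the expected layout. Separately, the exception message itself suggests rerunning with -t ColumnMapKeyPrune, which turns that optimizer rule off. That may be worth trying as well, though I have not confirmed that it avoids the error in this case.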
