Spark-Hive exception when running a query with HiveContext: org.apache.spark.sql.AnalysisException



Using Hive with Spark:

I execute two queries in sequence using HiveContext. The first:

hiveContext.sql("use userdb");

Running the USE query succeeds, with the log below:

2016-09-08 15:46:13 main [INFO ] ParseDriver - Parsing command: use userdb
2016-09-08 15:46:14 main [INFO ] ParseDriver - Parse Completed
2016-09-08 15:46:21 main [INFO ] PerfLogger - <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:21 main [INFO ] PerfLogger - <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:21 main [INFO ] PerfLogger - <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:22 main [INFO ] PerfLogger - <PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:22 main [INFO ] ParseDriver - Parsing command: use userdb
2016-09-08 15:46:23 main [INFO ] ParseDriver - Parse Completed
2016-09-08 15:46:23 main [INFO ] PerfLogger - </PERFLOG method=parse start=1473329782037 end=1473329783188 duration=1151 from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] PerfLogger - <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] Driver - Semantic Analysis Completed
2016-09-08 15:46:23 main [INFO ] PerfLogger - </PERFLOG method=semanticAnalyze start=1473329783202 end=1473329783396 duration=194 from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] Driver - Returning Hive schema: Schema(fieldSchemas:null, properties:null)
2016-09-08 15:46:23 main [INFO ] PerfLogger - </PERFLOG method=compile start=1473329781862 end=1473329783434 duration=1572 from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] Driver - Concurrency mode is disabled, not creating a lock manager
2016-09-08 15:46:23 main [INFO ] PerfLogger - <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] Driver - Starting command(queryId=abc_20160908154622_aac49c43-565e-4fde-be6d-2d5c22c1a699): use userdb
2016-09-08 15:46:23 main [INFO ] PerfLogger - </PERFLOG method=TimeToSubmit start=1473329781861 end=1473329783682 duration=1821 from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] PerfLogger - <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] PerfLogger - <PERFLOG method=task.DDL.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] Driver - Starting task [Stage-0:DDL] in serial mode
2016-09-08 15:46:23 main [INFO ] PerfLogger - </PERFLOG method=runTasks start=1473329783682 end=1473329783729 duration=47 from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] PerfLogger - </PERFLOG method=Driver.execute start=1473329783435 end=1473329783730 duration=295 from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] Driver - OK
2016-09-08 15:46:23 main [INFO ] PerfLogger - <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] PerfLogger - </PERFLOG method=releaseLocks start=1473329783734 end=1473329783734 duration=0 from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] PerfLogger - </PERFLOG method=Driver.run start=1473329781861 end=1473329783735 duration=1874 from=org.apache.hadoop.hive.ql.Driver>
**But when trying to execute the query below, I get the error shown below:**
    hiveContext.sql("select * from user_detail")
   **Error:**
 2016-09-08 15:47:50 main [INFO ] ParseDriver - Parsing command: select * from userdb.user_detail
2016-09-08 15:47:50 main [INFO ] ParseDriver - Parse Completed
org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.parse.ASTNode cannot be cast to org.antlr.runtime.tree.CommonTree;
    at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:324)
    at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:41)
    at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:40)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
    at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
    at scala.util.parsing.combinator.PackratParsers$$anon$1.apply(PackratParsers.scala:110)
    at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:34)
    at org.apache.spark.sql.hive.HiveQl$.parseSql(HiveQl.scala:295)
    at org.apache.spark.sql.hive.HiveQLDialect$$anonfun$parse$1.apply(HiveContext.scala:66)
    at org.apache.spark.sql.hive.HiveQLDialect$$anonfun$parse$1.apply(HiveContext.scala:66)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$withHiveState$1.apply(ClientWrapper.scala:290)
    at org.apache.spark.sql.hive.client.ClientWrapper.liftedTree1$1(ClientWrapper.scala:237)
    at org.apache.spark.sql.hive.client.ClientWrapper.retryLocked(ClientWrapper.scala:236)
    at org.apache.spark.sql.hive.client.ClientWrapper.withHiveState(ClientWrapper.scala:279)
    at org.apache.spark.sql.hive.HiveQLDialect.parse(HiveContext.scala:65)
    at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:211)
    at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:211)
    at org.apache.spark.sql.execution.SparkSQLParser$$anonfun$org$apache$spark$sql$execution$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:114)
    at org.apache.spark.sql.execution.SparkSQLParser$$anonfun$org$apache$spark$sql$execution$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:113)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
    at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
    at scala.util.parsing.combinator.PackratParsers$$anon$1.apply(PackratParsers.scala:110)
    at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:34)
    at org.apache.spark.sql.SQLContext$$anonfun$1.apply(SQLContext.scala:208)
    at org.apache.spark.sql.SQLContext$$anonfun$1.apply(SQLContext.scala:208)
    at org.apache.spark.sql.execution.datasources.DDLParser.parse(DDLParser.scala:43)
    at org.apache.spark.sql.SQLContext.parseSql(SQLContext.scala:231)
    at org.apache.spark.sql.hive.HiveContext.parseSql(HiveContext.scala:331)
    at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:817)
    at 

I am using spark-hive_2.10: 1.6.1, which internally resolves some dependencies:

  1. hive-exec: 1.2.1
  2. hive-metastore: 1.2.1

Using the same API, I was initially able to execute every type of query (USE, INSERT, DESCRIBE, etc.) except SELECT; SELECT queries threw the exception above. After resolving the issue, I can now run all kinds of queries without any problem.

When I traced through the dependency hierarchy, I found that two different versions of hive-exec were included in the project. I removed the external one, and that solved it! See the sketch below. Hope this post helps someone else.
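
For anyone hitting the same problem: inspect the dependency tree (for Maven, `mvn dependency:tree`) and look for a second copy of hive-exec. Below is a minimal sbt sketch of the exclusion, where `com.example % legacy-lib` is a hypothetical stand-in for whichever dependency pulls in the conflicting hive-exec:

    // build.sbt -- hypothetical sketch; adjust group/artifact names for your project.
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-hive" % "1.6.1",
      // Exclude the transitive hive-exec so only the version resolved
      // by spark-hive (1.2.1 here) remains on the classpath.
      ("com.example" % "legacy-lib" % "1.0.0")
        .exclude("org.apache.hive", "hive-exec")
    )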

Thanks.
