JavaSparkListener not found when running Hive on the Spark engine



Hive version: 2.0.0
Spark version: 2.3.0
YARN as the resource manager

The two are not compatible out of the box, so I had to set the following configuration to make them work together:

spark.sql.hive.metastore.version 2.0.0
spark.sql.hive.metastore.jars /usr/local/apache-hive-2.0.0-bin/lib/*

With this I am able to run Hive queries on the Spark cluster successfully via spark-sql. However, when I run a query through the Hive CLI, it fails with the following error (as shown in the Hive logs):

2021-10-17T03:06:53,727 INFO  [1ff8e619-80bb-46ea-9fd0-824d57ea3799 1ff8e619-80bb-46ea-9fd0-824d57ea3799 main]: client.SparkClientImpl (SparkClientImpl.java:startDriver(428)) - Running client driver with argv: /usr/local/spark/bin
/spark-submit --properties-file /tmp/spark-submit.255205804744246105.properties --class org.apache.hive.spark.client.RemoteDriver /usr/local/apache-hive-2.0.0-bin/lib/hive-exec-2.0.0.jar --remote-host <masked_hostname> --remote-port 34537 --conf hive.spark.client.connect.timeout=1000 --conf hive.spark.client.server.connect.timeout=90000 --conf hive.spark.client.channel.log.level=null --conf hive.spark.client.rpc.max.size=52428800 --conf hive.spark.client.rpc.threads=8 --conf hive.spark.client.secret.bits=256
2021-10-17T03:06:54,488 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - Warning: Ignoring non-spark config property: hive.spark.client.server.connect.timeout=90000
2021-10-17T03:06:54,489 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - Warning: Ignoring non-spark config property: hive.spark.client.rpc.threads=8
2021-10-17T03:06:54,489 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - Warning: Ignoring non-spark config property: hive.spark.client.connect.timeout=1000
2021-10-17T03:06:54,489 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - Warning: Ignoring non-spark config property: hive.spark.client.secret.bits=256
2021-10-17T03:06:54,489 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - Warning: Ignoring non-spark config property: hive.spark.client.rpc.max.size=52428800
2021-10-17T03:06:55,000 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - java.lang.NoClassDefFoundError: org/apache/spark/JavaSparkListener
2021-10-17T03:06:55,000 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.lang.ClassLoader.defineClass1(Native Method)
2021-10-17T03:06:55,000 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
2021-10-17T03:06:55,000 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
2021-10-17T03:06:55,000 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
2021-10-17T03:06:55,000 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
2021-10-17T03:06:55,001 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
2021-10-17T03:06:55,001 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
2021-10-17T03:06:55,001 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.security.AccessController.doPrivileged(Native Method)
2021-10-17T03:06:55,001 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
2021-10-17T03:06:55,001 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
2021-10-17T03:06:55,001 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
2021-10-17T03:06:55,001 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.lang.Class.forName0(Native Method)
2021-10-17T03:06:55,001 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.lang.Class.forName(Class.java:348)
2021-10-17T03:06:55,001 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at org.apache.spark.util.Utils$.classForName(Utils.scala:235)
2021-10-17T03:06:55,002 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:836)
2021-10-17T03:06:55,002 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
2021-10-17T03:06:55,002 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
2021-10-17T03:06:55,002 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
2021-10-17T03:06:55,002 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2021-10-17T03:06:55,003 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - Caused by: java.lang.ClassNotFoundException: org.apache.spark.JavaSparkListener
2021-10-17T03:06:55,003 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
2021-10-17T03:06:55,003 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
2021-10-17T03:06:55,003 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
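For context, here is roughly how the two invocations differ (the table name is just a placeholder; the spark-sql path works, the Hive CLI path fails):

```shell
# Works: the query runs through spark-sql, which carries its own Hive support
/usr/local/spark/bin/spark-sql -e "SELECT count(*) FROM some_table"

# Fails: the query runs through the Hive CLI with Spark as the execution engine
hive -e "SET hive.execution.engine=spark; SELECT count(*) FROM some_table"
```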

I have also added the Spark libraries to the Hive classpath in order to use Spark as Hive's execution engine.
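A sketch of one common way to do that linking (the exact jar list and paths are assumptions based on my layout and may differ per installation):

```shell
# Make the Spark runtime jars visible to Hive by symlinking them into Hive's lib dir
export SPARK_HOME=/usr/local/spark
export HIVE_HOME=/usr/local/apache-hive-2.0.0-bin

ln -s "$SPARK_HOME"/jars/scala-library-*.jar        "$HIVE_HOME"/lib/
ln -s "$SPARK_HOME"/jars/spark-core_*.jar           "$HIVE_HOME"/lib/
ln -s "$SPARK_HOME"/jars/spark-network-common_*.jar "$HIVE_HOME"/lib/
```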

Any suggestions on how to fix this error?

I suggest inspecting the dependency graph with Gradle/Maven, and excluding any conflicting dependencies if you have them. It looks like either a dependency was not added correctly, or one dependency's jar version is being shadowed by another at execution time.
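For example, you can print the resolved dependency tree to spot version conflicts (run from your project's root; the Gradle configuration name is an assumption that fits most JVM projects):

```shell
# Maven: print the resolved dependency tree, including conflict details
mvn dependency:tree -Dverbose

# Gradle: same idea for the runtime classpath
gradle dependencies --configuration runtimeClasspath
```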

2021-10-17T03:06:55,000 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - java.lang.NoClassDefFoundError: org/apache/spark/JavaSparkListener

indicates a compatibility error: a class that was present at compile time is missing at runtime. Please check your Java version; I suggest using only Java 1.8. Also check your dependency versions. See https://issues.apache.org/jira/browse/HIVE-14029
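One quick way to confirm the mismatch is to check whether the missing class actually exists in the Spark jars you run against (paths taken from the question; adjust for your layout):

```shell
java -version   # should report 1.8.x

# org.apache.spark.JavaSparkListener was removed in Spark 2.x, so on a
# Spark 2.3.0 install this grep should come back empty:
unzip -l /usr/local/spark/jars/spark-core_*.jar | grep JavaSparkListener
```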
