Spark throws a NullPointerException during HBase InputSplit processing



I am using Spark 1.2.1, HBase 0.98.10, and Hadoop 2.6.0. While retrieving data from HBase I hit a NullPointerException. The stack trace is below.

[sparkDriver-akka.actor.default-dispatcher-2] DEBUG NewHadoopRDD - Failed to use InputSplit#getLocationInfo.
java.lang.NullPointerException: null
    at scala.collection.mutable.ArrayOps$ofRef$.length$extension(ArrayOps.scala:114) ~[scala-library-2.10.4.jar:na]
    at scala.collection.mutable.ArrayOps$ofRef.length(ArrayOps.scala:114) ~[scala-library-2.10.4.jar:na]
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:32) ~[scala-library-2.10.4.jar:na]
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108) ~[scala-library-2.10.4.jar:na]
    at org.apache.spark.rdd.HadoopRDD$.convertSplitLocationInfo(HadoopRDD.scala:401) ~[spark-core_2.10-1.2.1.jar:1.2.1]
    at org.apache.spark.rdd.NewHadoopRDD.getPreferredLocations(NewHadoopRDD.scala:215) ~[spark-core_2.10-1.2.1.jar:1.2.1]
    at org.apache.spark.rdd.RDD$$anonfun$preferredLocations$2.apply(RDD.scala:234) [spark-core_2.10-1.2.1.jar:1.2.1]
    at org.apache.spark.rdd.RDD$$anonfun$preferredLocations$2.apply(RDD.scala:234) [spark-core_2.10-1.2.1.jar:1.2.1]
    at scala.Option.getOrElse(Option.scala:120) [scala-library-2.10.4.jar:na]
    at org.apache.spark.rdd.RDD.preferredLocations(RDD.scala:233) [spark-core_2.10-1.2.1.jar:1.2.1]
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal(DAGScheduler.scala:1326) [spark-core_2.10-1.2.1.jar:1.2.1]
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2$$anonfun$apply$2.apply$mcVI$sp(DAGScheduler.scala:1336) [spark-core_2.10-1.2.1.jar:1.2.1]
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2$$anonfun$apply$2.apply(DAGScheduler.scala:1335) [spark-core_2.10-1.2.1.jar:1.2.1]
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2$$anonfun$apply$2.apply(DAGScheduler.scala:1335) [spark-core_2.10-1.2.1.jar:1.2.1]
    at scala.collection.immutable.List.foreach(List.scala:318) [scala-library-2.10.4.jar:na]
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2.apply(DAGScheduler.scala:1335) [spark-core_2.10-1.2.1.jar:1.2.1]
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2.apply(DAGScheduler.scala:1333) [spark-core_2.10-1.2.1.jar:1.2.1]
    at scala.collection.immutable.List.foreach(List.scala:318) [scala-library-2.10.4.jar:na]
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal(DAGScheduler.scala:1333) [spark-core_2.10-1.2.1.jar:1.2.1]
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2$$anonfun$apply$2.apply$mcVI$sp(DAGScheduler.scala:1336) [spark-core_2.10-1.2.1.jar:1.2.1]
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2$$anonfun$apply$2.apply(DAGScheduler.scala:1335) [spark-core_2.10-1.2.1.jar:1.2.1]
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2$$anonfun$apply$2.apply(DAGScheduler.scala:1335) [spark-core_2.10-1.2.1.jar:1.2.1]
    at scala.collection.immutable.List.foreach(List.scala:318) [scala-library-2.10.4.jar:na]
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2.apply(DAGScheduler.scala:1335) [spark-core_2.10-1.2.1.jar:1.2.1]
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2.apply(DAGScheduler.scala:1333) [spark-core_2.10-1.2.1.jar:1.2.1]
    at scala.collection.immutable.List.foreach(List.scala:318) [scala-library-2.10.4.jar:na]
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal(DAGScheduler.scala:1333) [spark-core_2.10-1.2.1.jar:1.2.1]
    at org.apache.spark.scheduler.DAGScheduler.getPreferredLocs(DAGScheduler.scala:1304) [spark-core_2.10-1.2.1.jar:1.2.1]
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$17.apply(DAGScheduler.scala:862) [spark-core_2.10-1.2.1.jar:1.2.1]
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$17.apply(DAGScheduler.scala:859) [spark-core_2.10-1.2.1.jar:1.2.1]
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) [scala-library-2.10.4.jar:na]
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) [scala-library-2.10.4.jar:na]
    at scala.collection.Iterator$class.foreach(Iterator.scala:727) [scala-library-2.10.4.jar:na]
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) [scala-library-2.10.4.jar:na]
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) [scala-library-2.10.4.jar:na]
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54) [scala-library-2.10.4.jar:na]
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244) [scala-library-2.10.4.jar:na]
    at scala.collection.AbstractTraversable.map(Traversable.scala:105) [scala-library-2.10.4.jar:na]
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:859) [spark-core_2.10-1.2.1.jar:1.2.1]
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:778) [spark-core_2.10-1.2.1.jar:1.2.1]
    at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:762) [spark-core_2.10-1.2.1.jar:1.2.1]
    at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1389) [spark-core_2.10-1.2.1.jar:1.2.1]
    at akka.actor.Actor$class.aroundReceive(Actor.scala:465) [akka-actor_2.10-2.3.4-spark.jar:na]
    at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.aroundReceive(DAGScheduler.scala:1375) [spark-core_2.10-1.2.1.jar:1.2.1]
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) [akka-actor_2.10-2.3.4-spark.jar:na]
    at akka.actor.ActorCell.invoke(ActorCell.scala:487) [akka-actor_2.10-2.3.4-spark.jar:na]
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238) [akka-actor_2.10-2.3.4-spark.jar:na]
    at akka.dispatch.Mailbox.run(Mailbox.scala:220) [akka-actor_2.10-2.3.4-spark.jar:na]
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393) [akka-actor_2.10-2.3.4-spark.jar:na]
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [scala-library-2.10.4.jar:na]
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [scala-library-2.10.4.jar:na]
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [scala-library-2.10.4.jar:na]
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [scala-library-2.10.4.jar:na]

Please suggest a solution to this problem.
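The question does not include the job code, so as context for answers, here is a minimal sketch of the kind of Spark-on-HBase read that exercises this code path. The table name and app name are hypothetical, and the use of newAPIHadoopRDD with TableInputFormat is an assumption about the setup, not the asker's actual code:

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Result
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat
    import org.apache.spark.{SparkConf, SparkContext}

    object HBaseRead {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("HBaseRead"))

        // Standard HBase client configuration; TableInputFormat reads the
        // table name from TableInputFormat.INPUT_TABLE.
        val hbaseConf = HBaseConfiguration.create()
        hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_table") // hypothetical table

        // newAPIHadoopRDD asks TableInputFormat for InputSplits; the
        // getPreferredLocations call in the stack trace runs over those splits.
        val rdd = sc.newAPIHadoopRDD(
          hbaseConf,
          classOf[TableInputFormat],
          classOf[ImmutableBytesWritable],
          classOf[Result])

        println(s"rows: ${rdd.count()}")
        sc.stop()
      }
    }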

The exception is thrown during the getPreferredLocations stage, so without more information about your HBase configuration, I would suggest checking hbase.table.name and hbase.master (I am not sure about the last one, i.e. whether it is what correctly defines the HMaster).
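As a hedged sketch of that suggestion (not from the original answer): the settings named above would typically be applied to the job's HBaseConfiguration before the RDD is created. The host names, port values, and table name below are all placeholder assumptions; hbase.zookeeper.quorum is included because HBase clients normally locate the cluster through ZooKeeper:

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat

    val hbaseConf = HBaseConfiguration.create()
    // Clients resolve the cluster through ZooKeeper, so the quorum is usually
    // the setting that matters; "zk-host" is a placeholder for your cluster.
    hbaseConf.set("hbase.zookeeper.quorum", "zk-host")
    hbaseConf.set("hbase.zookeeper.property.clientPort", "2181")
    // The keys named in the answer; both values here are hypothetical, and
    // hbase.master is often not needed by ZooKeeper-based clients.
    hbaseConf.set("hbase.master", "hmaster-host:60000")
    hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_table")

It is also worth confirming that the same hbase-site.xml (or equivalent settings) is visible to the driver, since getPreferredLocations runs there.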
