我有2个与Spark Worker共处的服务器节点。我正在使用共享的IGNITE RDD来保存我的数据框架。当我仅使用一个服务器节点凝视时,我的代码正常工作,如果我启动两个服务器节点代码都会失败,
网格处于无效状态,可以执行此操作。它要么尚未启动,要么已经停止了[gridName = null,state = spot]
DiscoverySpi的配置如下
<property name="discoverySpi">
<bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
<property name="ipFinder">
<!--
Ignite provides several options for automatic discovery that can be used
instead os static IP based discovery. For information on all options refer
to our documentation: http://apacheignite.readme.io/docs/cluster-config
-->
<!-- Uncomment static IP finder to enable static-based discovery of initial nodes. -->
<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
<!--<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">-->
<property name="shared" value="true"/>
<property name="addresses">
<list>
<!-- In distributed environment, replace with actual host IP address. -->
<value>v-in-spark-01:47500..47509</value>
<value>v-in-spark-02:47500..47509</value>
</list>
</property>
</bean>
</property>
</bean>
</property>
我知道这个例外通常意味着点燃instanace既没有启动或停止并尝试进行操作,但是我认为没有单个服务器节点的原因,它可以正常工作,而且我也没有明确关闭Ignite我的程序中的实例。
在我的代码流中,我确实在交易中执行操作,所以就像
- 创建cache1:工作正常
- 创建cache2:工作正常
- 将值放在cache1中;正常工作
- cache2上的igniterdd.savevalues:此步骤失败,上述例外。
使用此链接以完成错误跟踪由零件引起的下面也粘贴了
Caused by: java.lang.IllegalStateException: Grid is in invalid state to perform this operation. It either not started yet or has already being or have stopped [gridName=null, state=STOPPING]
at org.apache.ignite.internal.GridKernalGatewayImpl.illegalState(GridKernalGatewayImpl.java:190)
at org.apache.ignite.internal.GridKernalGatewayImpl.readLock(GridKernalGatewayImpl.java:90)
at org.apache.ignite.internal.IgniteKernal.guard(IgniteKernal.java:3151)
at org.apache.ignite.internal.IgniteKernal.getOrCreateCache(IgniteKernal.java:2739)
at org.apache.ignite.spark.impl.IgniteAbstractRDD.ensureCache(IgniteAbstractRDD.scala:39)
at org.apache.ignite.spark.IgniteRDD$$anonfun$saveValues$1.apply(IgniteRDD.scala:164)
at org.apache.ignite.spark.IgniteRDD$$anonfun$saveValues$1.apply(IgniteRDD.scala:161)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:883)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:883)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1897)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1897)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
at org.apache.spark.scheduler.Task.run(Task.scala:85)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
... 3 more</pre>
看起来,在您仍在尝试运行作业时,由于某些原因,嵌入在执行程序过程中的节点被停止。据我所知,发生这种情况的唯一方法是停止执行者流程。会这样吗?日志中是否有任何内容?