Corda node crashes after an Artemis messaging client failure: "ArtemisMessagingClient failed. Shutting down."



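The following error occurred while running two nodes and one notary on Corda OSS 4.3. (Amazon EFS is used for the Artemis service of each node and of the notary.)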

・nodeA

[INFO ] 2021-03-24T01:53:33,526Z [nioEventLoopGroup-2-1] engine.ConnectionStateMachine. - Transport Error TransportImpl [_connectionEndpoint=org.apache.qpid.proton.engine.impl.ConnectionImpl@d8755f, org.apache.qpid.proton.engine.impl.TransportImpl@720cb721] {localLegalName=O=nodeA, L=Local, C=JP, remoteLegalName=O=nodeB, L=Local, C=JP, serverMode=false}
[INFO ] 2021-03-24T01:53:33,526Z [nioEventLoopGroup-2-1] engine.ConnectionStateMachine. - Error: connection aborted {localLegalName=O=nodeA, L=Local, C=JP, remoteLegalName=O=nodeB, L=Local, C=JP, serverMode=false}
[INFO ] 2021-03-24T01:53:33,527Z [nioEventLoopGroup-2-1] netty.AMQPClient. - Disconnected from [NLBendpoint]:10005
[INFO ] 2021-03-24T01:53:33,527Z [nioEventLoopGroup-2-1] netty.AMQPChannelHandler. - Closed client connection 828af8c0 from [NLBendpoint]:10005 to /xx.xx.x.xx:40438 {allowedRemoteLegalNames=O=nodeB, L=Local, C=JP, localCert=O=nodeA, L=Local, C=JP, remoteAddress=[NLBendpoint]:10005, remoteCert=O=nodeB, L=Local, C=JP, serverMode=false}
[INFO ] 2021-03-24T01:53:33,527Z [nioEventLoopGroup-2-1] bridging.AMQPBridgeManager$AMQPBridge. - Bridge Disconnected {legalNames=O=nodeB, L=Local, C=JP, maxMessageSize=10485760, queueName=internal.peers.DLB29JcZp4kCP2aGGZKGkhw2X5RenndTjEK4xy48iT9643, targets=[NLBendpoint]:10005}
[WARN ] 2021-03-24T01:55:59,747Z [Thread-17936 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$5@2936f48a)] core.client. - AMQ212037: Connection failure has been detected: AMQ119014: Did not receive data from /xxx.0.0.1:53166 within the 60,000ms connection TTL. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
[WARN ] 2021-03-24T01:55:59,748Z [Thread-949 (ActiveMQ-client-global-threads)] core.client. - AMQ212037: Connection failure has been detected: AMQ119011: Did not receive data from server for org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnection@6eb6efd1[ID=e834052e, local= /127.0.0.1:53170, remote=localhost/127.0.0.1:10008] [code=CONNECTION_TIMEDOUT]
[WARN ] 2021-03-24T01:55:59,751Z [Thread-948 (ActiveMQ-client-global-threads)] core.client. - AMQ212037: Connection failure has been detected: AMQ119011: Did not receive data from server for org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnection@505dd5b8[ID=f1885302, local= /127.0.0.1:53166, remote=localhost/127.0.0.1:10008] [code=CONNECTION_TIMEDOUT]
[WARN ] 2021-03-24T01:55:59,751Z [Thread-950 (ActiveMQ-client-global-threads)] core.client. - AMQ212037: Connection failure has been detected: AMQ119011: Did not receive data from server for org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnection@57579387[ID=718e48b8, local= /127.0.0.1:53168, remote=localhost/127.0.0.1:10008] [code=CONNECTION_TIMEDOUT]
[WARN ] 2021-03-24T01:55:59,774Z [nioEventLoopGroup-2-1] netty.AMQPChannelHandler. - Closing channel due to nonrecoverable exception AMQ119014: Timed out after waiting 30,000 ms for response when sending packet 68 {allowedRemoteLegalNames=O=nodeB, L=Local, C=JP, localCert=O=nodeA, L=Local, C=JP, remoteAddress=[NLBendpoint]:10005, remoteCert=O=nodeB, L=Local, C=JP, serverMode=false}
[INFO ] 2021-03-24T01:55:59,775Z [nioEventLoopGroup-2-1] netty.AMQPClient. - Retry connect to [NLBendpoint]:10005
[ERROR] 2021-03-24T01:55:59,779Z [Thread-612] errorAndTerminate. - ArtemisMessagingClient failed. Shutting down.

・Notary

[INFO ] 2021-03-24T01:53:34,850Z [nioEventLoopGroup-2-4] engine.ConnectionStateMachine. - Transport Error TransportImpl [_connectionEndpoint=org.apache.qpid.proton.engine.impl.ConnectionImpl@1a1be565, org.apache.qpid.proton.engine.impl.TransportImpl@1e6940e2] {localLegalName=O=Notary1, L=Local, C=JP, remoteLegalName=O=nodeA, L=Local, C=JP, serverMode=false}
[INFO ] 2021-03-24T01:53:34,850Z [nioEventLoopGroup-2-4] engine.ConnectionStateMachine. - Error: connection aborted {localLegalName=O=Notary1, L=Local, C=JP, remoteLegalName=O=nodeA, L=Local, C=JP, serverMode=false}
[INFO ] 2021-03-24T01:53:34,851Z [nioEventLoopGroup-2-4] netty.AMQPClient. - Disconnected from [NLBendpoint]:10008
[INFO ] 2021-03-24T01:53:34,851Z [nioEventLoopGroup-2-4] netty.AMQPChannelHandler. - Closed client connection 9da3b393 from [NLBendpoint]:10008 to /xx.xx.x.xx:33438 {allowedRemoteLegalNames=O=nodeA, L=Local, C=JP, localCert=O=Notary1, L=Local, C=JP, remoteAddress=[NLBendpoint]:10008, remoteCert=O=nodeA, L=Local, C=JP, serverMode=false}
[INFO ] 2021-03-24T01:53:34,851Z [nioEventLoopGroup-2-4] bridging.AMQPBridgeManager$AMQPBridge. - Bridge Disconnected {legalNames=O=nodeA, L=Local, C=JP, maxMessageSize=10485760, queueName=internal.peers.DLHVntq87Ai3vLSuQzG8BoKcc2napU6aU3NPVFwiF73322, targets=[NLBendpoint]:10008}
[INFO ] 2021-03-24T01:54:03,123Z [nioEventLoopGroup-2-3] netty.AMQPClient. - Retry connect to [NLBendpoint]:10005
[WARN ] 2021-03-24T01:54:17,939Z [nioEventLoopGroup-2-2] netty.AMQPChannelHandler. - SSL Handshake timed out {allowedRemoteLegalNames=O=nodeA, L=Local, C=JP, localCert=null, remoteAddress=[NLBendpoint]:10008, remoteCert=null, serverMode=false}
[ERROR] 2021-03-24T01:54:17,939Z [nioEventLoopGroup-2-2] netty.AMQPChannelHandler. - Handshake failure handshake timed out {allowedRemoteLegalNames=O=nodeA, L=Local, C=JP, localCert=null, remoteAddress=[NLBendpoint]:10008, remoteCert=null, serverMode=false}
[INFO ] 2021-03-24T01:56:11,385Z [nioEventLoopGroup-2-2] netty.AMQPClient. - Retry connect to [NLBendpoint]:10005
[INFO ] 2021-03-24T01:56:11,392Z [nioEventLoopGroup-2-3] netty.AMQPClient. - Failed to connect to [NLBendpoint]:10005
[INFO ] 2021-03-24T01:56:13,393Z [nioEventLoopGroup-2-4] netty.AMQPClient. - Retry connect to [NLBendpoint]:10005
[INFO ] 2021-03-24T01:56:13,398Z [nioEventLoopGroup-2-1] netty.AMQPClient. - Failed to connect to [NLBendpoint]:10005

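After these logs were written, the nodeA process stopped. (The notary kept running.) What could be the cause of this problem? I suspect that the connection to the Artemis service was lost because of a problem connecting to Amazon EFS, since the following was written to the OS log at about the same time: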

Mar 24 10:55:51 [serverName] stunnel: LOG5[4]: Connection reset: 1105153036 byte(s) sent to TLS, 839120060 byte(s) sent to socket
Mar 24 10:55:54 [serverName] stunnel: LOG5[5]: Service [efs] accepted connection from xxx.x.x.x:38710
Mar 24 10:55:54 [serverName] stunnel: LOG5[5]: s_connect: connected xx.xx.x.xx:2049
Mar 24 10:55:54 [serverName] stunnel: LOG5[5]: Service [efs] connected remote server from xx.xx.x.xx:51468
Mar 24 10:55:55 [serverName] stunnel: LOG5[5]: Certificate accepted at depth=0: CN=*.efs.ap-northeast-1.amazonaws.com
Mar 24 10:55:55 [serverName] stunnel: LOG3[5]: transfer: s_poll_wait: TIMEOUTclose exceeded: closing
Mar 24 10:55:55 [serverName] stunnel: LOG5[5]: Connection closed: 0 byte(s) sent to TLS, 0 byte(s) sent to socket
Mar 24 10:55:55 [serverName] stunnel: LOG5[6]: Service [efs] accepted connection from xxx.x.x.x:38716
Mar 24 10:55:55 [serverName] stunnel: LOG5[6]: s_connect: connected xx.xx.x.xx:2049
Mar 24 10:55:55 [serverName] stunnel: LOG5[6]: Service [efs] connected remote server from xx.xx.x.xx:51474

I believe we discussed this on Slack, but yes: if you start a Corda node and it cannot bind to its p2p port or p2pAddress, that can cause the Artemis errors you describe.
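One way to test the bind theory is to check, while the node is stopped, whether its p2p port can be bound on the host. The following is a minimal sketch, not a Corda tool: `can_bind` is a hypothetical helper name, and the port numbers are simply the ones that appear in the logs above (10008 for nodeA, 10005 for nodeB), so substitute whatever your own p2pAddress uses.

```python
import socket


def can_bind(host: str, port: int) -> bool:
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permission problem.
            return False
        return True


if __name__ == "__main__":
    # Assumption: these are the p2p ports seen in the logs; adjust as needed.
    for port in (10005, 10008):
        state = "bindable" if can_bind("0.0.0.0", port) else "NOT bindable"
        print(f"port {port}: {state}")
```

Run it only while the node is down; a running node legitimately holds the port, so "NOT bindable" is expected in that case. A port that was closed very recently can also be reported as not bindable for a short time.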

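Something odd may also be going on with your network security groups. Make sure you can get this setup working on your local machine first, and that the nodes can all ping/telnet each other on the ports you expect.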
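If telnet is not installed on the hosts, the same reachability check can be scripted. This is a rough sketch under stated assumptions: `can_connect` is a hypothetical helper, and `nlb-endpoint.invalid` is a placeholder for the redacted `[NLBendpoint]` in the logs, so it will never resolve until you replace it with your real endpoint.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder host: replace with the real NLB endpoint of each peer.
    targets = [("nlb-endpoint.invalid", 10005), ("nlb-endpoint.invalid", 10008)]
    for host, port in targets:
        state = "reachable" if can_connect(host, port) else "unreachable"
        print(f"{host}:{port} is {state}")
```

Run it from each node's host against every other node's endpoint. Note that a successful TCP connect only shows the port is open through the security groups and the load balancer; it does not prove that the TLS handshake between the nodes would succeed.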
