Artemis:服务器停止给出错误,因为在健康的网络中网络不健康



我们在Red Hat Enterprise Linux Server 7.7上使用Apache Artemis 2.14.0版本(Linux 3.10.0-1062.el7.x86_64 x86_64(。但服务器启动失败给出:

AMQ201001: Network is unhealthy, stopping service ActiveMQServerImpl

我们已经配置了ping命令,并且ping返回成功,没有任何明显的延迟。

Ping输出

ping 10.30.12.21
PING 10.30.12.21 (10.30.12.21) 56(84) bytes of data.
64 bytes from 10.30.12.21: icmp_seq=1 ttl=63 time=0.337 ms
64 bytes from 10.30.12.21: icmp_seq=2 ttl=63 time=0.380 ms

Ping配置

<network-check-period>5000</network-check-period>
<network-check-timeout>1000</network-check-timeout>
<network-check-list>10.30.12.21,10.30.12.22</network-check-list>
<network-check-ping-command>ping -c 1 -t %d %s</network-check-ping-command>

两个ping ip都可以毫无问题地访问。如果我评论ping配置服务器良好。

调试日志

2020-08-12 09:15:17,439 INFO  [org.apache.activemq.artemis.logs] AMQ201001: Network is unhealthy, stopping service ActiveMQServerImpl::serverUUID=8108eab2-dbff-11ea-83e1-eeeeeeeeeeee
2020-08-12 09:15:17,440 DEBUG [org.apache.activemq.artemis.core.cluster.DiscoveryGroup] Stopping discovery. There's an exception just as a trace where it happened: java.lang.Exception: trace
at org.apache.activemq.artemis.core.cluster.DiscoveryGroup.stop(DiscoveryGroup.java:154) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.doClose(ServerLocatorImpl.java:1346) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.close(ServerLocatorImpl.java:1320) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.server.cluster.ClusterController.stop(ClusterController.java:138) [artemis-server-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.server.cluster.ClusterManager.stop(ClusterManager.java:307) [artemis-server-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stopComponent(ActiveMQServerImpl.java:1362) [artemis-server-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1097) [artemis-server-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1033) [artemis-server-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:876) [artemis-server-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$1.stop(ActiveMQServerImpl.java:360) [artemis-server-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.server.NetworkHealthCheck.run(NetworkHealthCheck.java:296) [artemis-commons-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.server.ActiveMQScheduledComponent$2.run(ActiveMQScheduledComponent.java:306) [artemis-commons-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.server.ActiveMQScheduledComponent$3.run(ActiveMQScheduledComponent.java:316) [artemis-commons-2.14.0.jar:2.14.0]
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [java.base:]
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) [java.base:]
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) [java.base:]
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [java.base:]
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [java.base:]
at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118) [artemis-commons-2.14.0.jar:2.14.0]
2020-08-12 09:15:17,443 DEBUG [org.apache.activemq.artemis.core.cluster.DiscoveryGroup] endpoing closed
2020-08-12 09:15:17,443 DEBUG [org.apache.activemq.artemis.core.cluster.DiscoveryGroup] receiving 65535
2020-08-12 09:15:17,443 DEBUG [org.apache.activemq.artemis.core.cluster.DiscoveryGroup] Received nodeID 393759d8-dc4e-11ea-a68e-eeeeeeeeeeee with originatingID =
2020-08-12 09:15:17,443 DEBUG [org.apache.activemq.artemis.core.cluster.DiscoveryGroup] Received 0 discovery entry elements
2020-08-12 09:15:17,444 DEBUG [org.apache.activemq.artemis.core.cluster.DiscoveryGroup] changed = false
2020-08-12 09:15:17,444 DEBUG [org.apache.activemq.artemis.core.cluster.DiscoveryGroup] Calling notifyAll
2020-08-12 09:15:17,450 DEBUG [org.apache.activemq.artemis.core.cluster.DiscoveryGroup] receiving 65535
2020-08-12 09:15:17,450 DEBUG [org.apache.activemq.artemis.core.cluster.DiscoveryGroup] ignoring original NodeID8108eab2-dbff-11ea-83e1-eeeeeeeeeeee receivedID = 8108eab2-dbff-11ea-83e1-eeeeeeeeeeee
2020-08-12 09:15:17,450 DEBUG [org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl] RemotingServiceImpl::removing connection ID b29aee05
2020-08-12 09:15:17,501 DEBUG [org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl] Couldn't find any bindings for address=activemq.notifications on message=CoreMessage[messageID=38,durable=true,userID=null,priority=0, timestamp=Wed Aug 12 09:15:17 IST 2020,expiration=0, durable=true, address=activemq.notifications,size=277,properties=TypedProperties[_AMQ_NotifType=BROADCAST_GROUP_STOPPED,name=my-broadcast-group,_AMQ_NotifTimestamp=1597203917490]]@1627279258
2020-08-12 09:15:17,501 DEBUG [org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl] Message CoreMessage[messageID=38,durable=true,userID=null,priority=0, timestamp=Wed Aug 12 09:15:17 IST 2020,expiration=0, durable=true, address=activemq.notifications,size=277,properties=TypedProperties[_AMQ_NotifType=BROADCAST_GROUP_STOPPED,name=my-broadcast-group,_AMQ_NotifTimestamp=1597203917490]]@1627279258 is not going anywhere as it didn't have a binding on address:activemq.notifications
2020-08-12 09:15:17,506 DEBUG [org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl] Couldn't find any bindings for address=activemq.notifications on message=CoreMessage[messageID=40,durable=true,userID=null,priority=0, timestamp=Wed Aug 12 09:15:17 IST 2020,expiration=0, durable=true, address=activemq.notifications,size=267,properties=TypedProperties[_AMQ_NotifType=CLUSTER_CONNECTION_STOPPED,name=my-cluster,_AMQ_NotifTimestamp=1597203917506]]@1325505535
2020-08-12 09:15:17,506 DEBUG [org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl] Message CoreMessage[messageID=40,durable=true,userID=null,priority=0, timestamp=Wed Aug 12 09:15:17 IST 2020,expiration=0, durable=true, address=activemq.notifications,size=267,properties=TypedProperties[_AMQ_NotifType=CLUSTER_CONNECTION_STOPPED,name=my-cluster,_AMQ_NotifTimestamp=1597203917506]]@1325505535 is not going anywhere as it didn't have a binding on address:activemq.notifications
2020-08-12 09:15:17,507 DEBUG [org.apache.activemq.artemis.core.cluster.DiscoveryGroup] Stopping discovery. There's an exception just as a trace where it happened: java.lang.Exception: trace
at org.apache.activemq.artemis.core.cluster.DiscoveryGroup.stop(DiscoveryGroup.java:154) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.doClose(ServerLocatorImpl.java:1346) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.close(ServerLocatorImpl.java:1320) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.server.cluster.impl.ClusterConnectionImpl.closeLocator(ClusterConnectionImpl.java:467) [artemis-server-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.server.cluster.impl.ClusterConnectionImpl.access$300(ClusterConnectionImpl.java:77) [artemis-server-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.server.cluster.impl.ClusterConnectionImpl$1.run(ClusterConnectionImpl.java:452) [artemis-server-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42) [artemis-commons-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31) [artemis-commons-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:65) [artemis-commons-2.14.0.jar:2.14.0]
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [java.base:]
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [java.base:]
at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118) [artemis-commons-2.14.0.jar:2.14.0]
2020-08-12 09:15:17,511 DEBUG [org.apache.activemq.artemis.core.cluster.DiscoveryGroup] endpoing closed
2020-08-12 09:15:17,523 DEBUG [org.apache.activemq.artemis.core.cluster.DiscoveryGroup] receiving 65535
2020-08-12 09:15:17,523 DEBUG [org.apache.activemq.artemis.core.cluster.DiscoveryGroup] Received nodeID 8108eab2-dbff-11ea-83e1-eeeeeeeeeeee with originatingID =
2020-08-12 09:15:17,523 DEBUG [org.apache.activemq.artemis.core.cluster.DiscoveryGroup] Received 0 discovery entry elements
2020-08-12 09:15:17,523 DEBUG [org.apache.activemq.artemis.core.cluster.DiscoveryGroup] changed = false

更新

网络不正常消息的日志

2020-08-12 09:15:17,439 INFO  [org.apache.activemq.artemis.logs] AMQ201001: Network is unhealthy, stopping service ActiveMQServerImpl::serverUUID=8108eab2-dbff-11ea-83e1-eeeeeeeeeeee
2020-08-12 09:15:17,440 DEBUG [org.apache.activemq.artemis.core.cluster.DiscoveryGroup] Stopping discovery. There's an exception just as a trace where it happened: java.lang.Exception: trace
at org.apache.activemq.artemis.core.cluster.DiscoveryGroup.stop(DiscoveryGroup.java:154) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.doClose(ServerLocatorImpl.java:1346) [artemis-core-client-2.14.0.jar:2.14.0]

更新2

org.apache.activemq.artemis.core.server.NetworkHealthCheckTRACE日志

2020-08-13 22:37:31,453 INFO  [org.apache.activemq.artemis] AMQ241001: HTTP Server started at http://10.30.122.241:8161
2020-08-13 22:37:31,454 INFO  [org.apache.activemq.artemis] AMQ241002: Artemis Jolokia REST API available at http://10.30.122.241:8161/console/jolokia
2020-08-13 22:37:31,454 INFO  [org.apache.activemq.artemis] AMQ241004: Artemis Console available at http://10.30.122.241:8161/console
2020-08-13 22:37:33,806 TRACE [org.apache.activemq.artemis.core.server.NetworkHealthCheck] purePing on canonical address 10.30.12.20
2020-08-13 22:37:33,807 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] executing ping:: ping -c 1 -t 1 10.30.12.20
2020-08-13 22:37:33,828 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] PING 10.30.12.20 (10.30.12.20) 56(84) bytes of data.
2020-08-13 22:37:33,828 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] From 10.30.122.253 icmp_seq=1 Time to live exceeded
2020-08-13 22:37:33,828 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck]
2020-08-13 22:37:33,828 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] --- 10.30.12.20 ping statistics ---
2020-08-13 22:37:33,828 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] 1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
2020-08-13 22:37:33,829 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck]
2020-08-13 22:37:33,829 TRACE [org.apache.activemq.artemis.core.server.NetworkHealthCheck] purePing on canonical address 10.30.12.21
2020-08-13 22:37:33,830 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] executing ping:: ping -c 1 -t 1 10.30.12.21
2020-08-13 22:37:33,835 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] PING 10.30.12.21 (10.30.12.21) 56(84) bytes of data.
2020-08-13 22:37:33,836 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] From 10.30.122.253 icmp_seq=1 Time to live exceeded
2020-08-13 22:37:33,836 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck]
2020-08-13 22:37:33,836 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] --- 10.30.12.21 ping statistics ---
2020-08-13 22:37:33,836 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] 1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
2020-08-13 22:37:33,836 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck]
2020-08-13 22:37:33,838 INFO  [org.apache.activemq.artemis.logs] AMQ201001: Network is unhealthy, stopping service ActiveMQServerImpl::serverUUID=45ae05a9-dd86-11ea-8ee5-eeeeeeeeeeee
2020-08-13 22:37:33,875 INFO  [org.apache.activemq.artemis.core.server] AMQ221029: stopped bridge $.artemis.internal.sf.my-cluster.869c8d37-dc5c-11ea-aa39-eeeeeeeeeeee
2020-08-13 22:37:34,060 INFO  [org.apache.activemq.artemis.core.server] AMQ221002: Apache ActiveMQ Artemis Message Broker version 2.14.0 [45ae05a9-dd86-11ea-8ee5-eeeeeeeeeeee] stopped, uptime 11.724 seconds
2020-08-13 22:37:39,061 TRACE [org.apache.activemq.artemis.core.server.NetworkHealthCheck] purePing on canonical address 10.30.12.20
2020-08-13 22:37:39,061 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] executing ping:: ping -c 1 -t 1 10.30.12.20
2020-08-13 22:37:39,065 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] PING 10.30.12.20 (10.30.12.20) 56(84) bytes of data.
2020-08-13 22:37:39,066 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] From 10.30.122.253 icmp_seq=1 Time to live exceeded
2020-08-13 22:37:39,066 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck]
2020-08-13 22:37:39,066 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] --- 10.30.12.20 ping statistics ---
2020-08-13 22:37:39,068 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] 1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
2020-08-13 22:37:39,068 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck]
2020-08-13 22:37:39,068 TRACE [org.apache.activemq.artemis.core.server.NetworkHealthCheck] purePing on canonical address 10.30.12.21
2020-08-13 22:37:39,069 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] executing ping:: ping -c 1 -t 1 10.30.12.21
2020-08-13 22:37:39,075 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] PING 10.30.12.21 (10.30.12.21) 56(84) bytes of data.
2020-08-13 22:37:39,075 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] From 10.30.122.253 icmp_seq=1 Time to live exceeded
2020-08-13 22:37:39,075 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck]
2020-08-13 22:37:39,075 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] --- 10.30.12.21 ping statistics ---
2020-08-13 22:37:39,075 DEBUG [org.apache.activemq.artemis.core.server.NetworkHealthCheck] 1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms

ping地址有1个跃点。

traceroute to 10.30.12.21 (10.30.12.21), 30 hops max, 60 byte packets
1  10.30.122.253 (10.30.122.253)  1.472 ms  1.515 ms  1.527 ms
2  10.30.12.21 (10.30.12.21)  0.282 ms  0.262 ms  0.231 ms

我怀疑您使用的ping命令(即ping -c 1 -t %d %s(实际上对您的操作系统或用例无效。要确认这一点,您需要为org.apache.activemq.artemis.core.server.NetworkHealthCheck激活TRACE日志记录,重现问题,然后检查日志。有关如何为特定记录器启用TRACE日志记录的详细信息,请参阅此问题。

一旦你有了日志,就很容易看出实际的问题是什么。之后,你可以根据用例的需要调整ping命令。

相关内容

  • 没有找到相关文章

最新更新