RabbitMQ cluster hostname mismatch error



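I have deployed openstack-ansible with a 3-node RabbitMQ cluster, where RabbitMQ runs inside LXC containers. When I run rabbitmqctl status I get a very strange error: as you can see, it tries to contact the wrong node. ostack-controller-01 is the host machine, not the actual RabbitMQ node.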

[root@ostack-controller-01-rabbit-mq-container-1bf6ede2 ~]# rabbitmqctl status
Status of node 'rabbit@ostack-controller-01' ...
Error: unable to connect to node 'rabbit@ostack-controller-01': nodedown
DIAGNOSTICS
===========
attempted to contact: ['rabbit@ostack-controller-01']
rabbit@ostack-controller-01:
* unable to connect to epmd (port 4369) on ostack-controller-01: address (cannot connect to host/port)
current node details:
- node name: 'rabbitmq-cli-06@ostack-controller-01-rabbit-mq-container-1bf6ede2'
- home dir: /var/lib/rabbitmq
- cookie hash: SssFdXBI7wTevePuCt5d9w==

How can I fix this behavior and tell RabbitMQ to talk to the correct host, ostack-controller-01-rabbit-mq-container-1bf6ede2?

I tried forget_cluster_node, but no luck; I still get the same error.

[root@ostack-controller-01-rabbit-mq-container-1bf6ede2 ~]# rabbitmqctl forget_cluster_node rabbit@ostack-controller-01
Removing node 'rabbit@ostack-controller-01' from cluster ...
Error: unable to connect to node 'rabbit@ostack-controller-01': nodedown
DIAGNOSTICS
===========
attempted to contact: ['rabbit@ostack-controller-01']
rabbit@ostack-controller-01:
* unable to connect to epmd (port 4369) on ostack-controller-01: address (cannot connect to host/port)

current node details:
- node name: 'rabbitmq-cli-39@ostack-controller-01-rabbit-mq-container-1bf6ede2'
- home dir: /var/lib/rabbitmq
- cookie hash: SssFdXBI7wTevePuCt5d9w==

Update 1

[root@ostack-controller-01-rabbit-mq-container-1bf6ede2 rabbitmq]# rabbitmqctl -n rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2 status
Status of node 'rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2' ...
[{pid,8720},
{running_applications,
[{rabbitmq_management,"RabbitMQ Management Console","3.6.9"},
{amqp_client,"RabbitMQ AMQP Client","3.6.9"},
{rabbitmq_management_agent,"RabbitMQ Management Agent","3.6.9"},
{rabbitmq_web_dispatch,"RabbitMQ Web Dispatcher","3.6.9"},
{rabbit,"RabbitMQ","3.6.9"},
{rabbit_common,
"Modules shared by rabbitmq-server and rabbitmq-erlang-client",
"3.6.9"},
{xmerl,"XML parser","1.3.14"},
{os_mon,"CPO  CXC 138 46","2.4.2"},
{cowboy,"Small, fast, modular HTTP server.","1.0.4"},
{ranch,"Socket acceptor pool for TCP protocols.","1.3.0"},
{ssl,"Erlang/OTP SSL application","8.1.3.1.1"},
{public_key,"Public key infrastructure","1.4"},
{cowlib,"Support library for manipulating Web protocols.","1.0.2"},
{crypto,"CRYPTO","3.7.4"},
{inets,"INETS  CXC 138 49","6.3.9"},
{compiler,"ERTS  CXC 138 10","7.0.4.1"},
{asn1,"The Erlang ASN1 compiler version 4.0.4","4.0.4"},
{syntax_tools,"Syntax tools","2.1.1"},
{mnesia,"MNESIA  CXC 138 12","4.14.3.1"},
{sasl,"SASL  CXC 138 11","3.0.3"},
{stdlib,"ERTS  CXC 138 10","3.3"},
{kernel,"ERTS  CXC 138 10","5.2.0.1"}]},
{os,{unix,linux}},
{erlang_version,
"Erlang/OTP 19 [erts-8.3.5.4] [source] [64-bit] [smp:6:6] [async-threads:128] [hipe] [kernel-poll:true]\n"},
{memory,
[{total,64189296},
{connection_readers,179280},
{connection_writers,26568},
{connection_channels,124504},
{connection_other,127440},
{queue_procs,2832},
{queue_slave_procs,0},
{plugins,406280},
{other_proc,21056136},
{mnesia,500680},
{metrics,205984},
{mgmt_db,127256},
{msg_index,47416},
{other_ets,2692192},
{binary,1591656},
{code,24765630},
{atom,1033401},
{other_system,11505193}]},
{alarms,[]},
{listeners,
[{clustering,25672,"::"},
{amqp,5672,"::"},
{'amqp/ssl',5671,"::"},
{http,15672,"::"}]},
{vm_memory_high_watermark,0.4},
{vm_memory_limit,6662953369},
{disk_free_limit,50000000},
{disk_free,82822516736},
{file_descriptors,
[{total_limit,65436},
{total_used,5},
{sockets_limit,58890},
{sockets_used,3}]},
{processes,[{limit,1048576},{used,376}]},
{run_queue,0},
{uptime,14},
{kernel,{net_ticktime,60}}]

Update 2

This is interesting... why does the following command work when plain rabbitmqctl cluster_status does not?

[root@ostack-controller-01-rabbit-mq-container-1bf6ede2 rabbitmq]# rabbitmqctl -n rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2 cluster_status
Cluster status of node 'rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2' ...
[{nodes,
[{disc,
['rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2',
'rabbit@ostack-controller-02-rabbit-mq-container-d510bdfc',
'rabbit@ostack-controller-03-rabbit-mq-container-c482ee13']}]},
{running_nodes,
['rabbit@ostack-controller-02-rabbit-mq-container-d510bdfc',
'rabbit@ostack-controller-03-rabbit-mq-container-c482ee13',
'rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2']},
{cluster_name,<<"openstack">>},
{partitions,
[{'rabbit@ostack-controller-02-rabbit-mq-container-d510bdfc',
['rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2',
'rabbit@ostack-controller-03-rabbit-mq-container-c482ee13']},
{'rabbit@ostack-controller-03-rabbit-mq-container-c482ee13',
['rabbit@ostack-controller-02-rabbit-mq-container-d510bdfc']}]},
{alarms,
[{'rabbit@ostack-controller-02-rabbit-mq-container-d510bdfc',[]},
{'rabbit@ostack-controller-03-rabbit-mq-container-c482ee13',[]},
{'rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2',[]}]}]

First of all, RabbitMQ 3.6.9 is an old version; you should upgrade to the latest release.

That said, the version is not the problem here. The output of echo $HOSTNAME is:

ostack-controller-01.foo.example.com

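So, when rabbitmqctl status runs without -n, it has to work out for itself which node name to connect to. Because the HOSTNAME variable is set, its value is used to derive the node name (only the short host name before the first dot is kept), so rabbitmqctl tries rabbit@ostack-controller-01, which fails because no RabbitMQ node with that name exists.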
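To make the derivation concrete, here is a simplified sketch of how a default node name can be built from a host name. It is not RabbitMQ's actual startup script; derive_nodename is a hypothetical helper written only for illustration, and it assumes the short-name behavior described above.

```shell
#!/bin/sh
# Simplified illustration only; derive_nodename is a hypothetical helper,
# not a function shipped with RabbitMQ.
derive_nodename() {
    host="$1"
    # Fall back to the system host name when no value is supplied.
    [ -n "$host" ] || host="$(hostname)"
    # Keep only the part before the first dot (the short host name).
    printf 'rabbit@%s\n' "${host%%.*}"
}

# Value of HOSTNAME seen inside the container (inherited from the host):
derive_nodename "ostack-controller-01.foo.example.com"

# Name the node is actually running under:
derive_nodename "ostack-controller-01-rabbit-mq-container-1bf6ede2"
```

The first call yields the name rabbitmqctl was trying to reach, the second the name of the real node, which is why the two do not match.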

You can keep passing -n rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2 to rabbitmqctl to work around this. Alternatively, create the file /etc/rabbitmq/rabbitmq-env.conf with the following content:

NODENAME=rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2

After that, rabbitmqctl status and the other rabbitmqctl commands should work. Repeat this on every node, using that node's own name in /etc/rabbitmq/rabbitmq-env.conf.
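As a rough sketch of the per-node step, the line for each container could be generated rather than typed by hand. render_env_conf is a hypothetical helper, and the commented-out redirect assumes that hostname -s inside each container returns the container name (as the shell prompts in the question suggest); check that on your own nodes before relying on it.

```shell
#!/bin/sh
# render_env_conf is a hypothetical helper for illustration only.
# It prints the NODENAME line for the given container host name.
render_env_conf() {
    printf 'NODENAME=rabbit@%s\n' "$1"
}

# Preview the line for each of the three nodes from cluster_status:
render_env_conf "ostack-controller-01-rabbit-mq-container-1bf6ede2"
render_env_conf "ostack-controller-02-rabbit-mq-container-d510bdfc"
render_env_conf "ostack-controller-03-rabbit-mq-container-c482ee13"

# On a real node (assumption: hostname -s prints the container name):
# render_env_conf "$(hostname -s)" > /etc/rabbitmq/rabbitmq-env.conf
```

Printing the line first and only then redirecting it into the file makes it easy to confirm the name matches what cluster_status reports for that node.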


NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.
