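For testing, I want to scale my 3-node cluster down to 2 nodes, so that I can later do the same for my 5-node cluster.
However, after following the best practice for scaling down a cluster:
1. Back up all tables.
2. For all tables: alter table xyz set (number_of_replicas=2) if it was less than 2 before.
3. SET GLOBAL PERSISTENT discovery.zen.minimum_master_nodes = <half of the cluster + 1>;
3a. If the data check should always be green, set min_availability to "full": https://crate.io/docs/reference/configuration.html#graceful-stop
4. Initiate a graceful stop on one node.
5. Wait for the data check to turn green.
6. Repeat from 3.
7. When done, persist the node configuration in crate.yml:
gateway.recover_after_nodes: n
discovery.zen.minimum_master_nodes: (n/2) + 1
gateway.expected_nodes: n
My cluster never went back to "green", and I also have a critical node check failing.
What went wrong here?
crate.yml: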
...
################################## Discovery ##################################
# Discovery infrastructure ensures nodes can be found within a cluster
# and master node is elected. Multicast discovery is the default.
# Set to ensure a node sees M other master eligible nodes to be considered
# operational within the cluster. Its recommended to set it to a higher value
# than 1 when running more than 2 nodes in the cluster.
#
# We highly recommend to set the minimum master nodes as follows:
# minimum_master_nodes: (N / 2) + 1 where N is the cluster size
# That will ensure a full recovery of the cluster state.
#
discovery.zen.minimum_master_nodes: 2
# Set the time to wait for ping responses from other nodes when discovering.
# Set this option to a higher value on a slow or congested network
# to minimize discovery failures:
#
# discovery.zen.ping.timeout: 3s
#
# Time a node is waiting for responses from other nodes to a published
# cluster state.
#
# discovery.zen.publish_timeout: 30s
# Unicast discovery allows to explicitly control which nodes will be used
# to discover the cluster. It can be used when multicast is not present,
# or to restrict the cluster communication-wise.
# For example, Amazon Web Services doesn't support multicast discovery.
# Therefore, you need to specify the instances you want to connect to a
# cluster as described in the following steps:
#
# 1. Disable multicast discovery (enabled by default):
#
discovery.zen.ping.multicast.enabled: false
#
# 2. Configure an initial list of master nodes in the cluster
# to perform discovery when new nodes (master or data) are started:
#
# If you want to debug the discovery process, you can set a logger in
# 'config/logging.yml' to help you doing so.
#
################################### Gateway ###################################
# The gateway persists cluster meta data on disk every time the meta data
# changes. This data is stored persistently across full cluster restarts
# and recovered after nodes are started again.
# Defines the number of nodes that need to be started before any cluster
# state recovery will start.
#
gateway.recover_after_nodes: 3
# Defines the time to wait before starting the recovery once the number
# of nodes defined in gateway.recover_after_nodes are started.
#
#gateway.recover_after_time: 5m
# Defines how many nodes should be waited for until the cluster state is
# recovered immediately. The value should be equal to the number of nodes
# in the cluster.
#
gateway.expected_nodes: 3
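So there are two things that are important:
- The number of replicas is essentially the number of nodes you can lose in a typical setup (2 is recommended, so that you can scale down and lose a node in the process and still be OK).
- The procedure is recommended for clusters > 2 nodes ;)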
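CrateDB automatically distributes the shards across the cluster so that no replica shares a node with its primary. If that is not possible (which is the case if you have 2 nodes and 1 primary with 2 replicas), the data check will never return to "green". So in your case, set the number of replicas to 1 to get the cluster back to green (alter table mytable set (number_of_replicas = 1)).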
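To make the replica arithmetic concrete, here is a minimal sketch in Python. It assumes the rule described above (every copy of a shard, primary or replica, needs a node of its own); the function names are made up for illustration and are not part of CrateDB.

```python
def copies_per_shard(number_of_replicas: int) -> int:
    """One primary plus its replicas."""
    return 1 + number_of_replicas


def can_turn_green(node_count: int, number_of_replicas: int) -> bool:
    """True if every copy of a shard can be placed on a different node.

    Assumption: no two copies of the same shard may share a node.
    """
    return copies_per_shard(number_of_replicas) <= node_count


def max_replicas(node_count: int) -> int:
    """Largest number_of_replicas that can still be fully allocated."""
    return max(node_count - 1, 0)


if __name__ == "__main__":
    print(can_turn_green(3, 2))  # 3 nodes, 2 replicas: True
    print(can_turn_green(2, 2))  # 2 nodes, 2 replicas: False
    print(max_replicas(2))       # 1
```

With 2 nodes the largest value that can be fully allocated is 1, which is why number_of_replicas = 1 is the value to use here.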
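The critical node check fails because the cluster has not received an updated crate.yml yet: the file shown above still contains the settings for a 3-node cluster (gateway.recover_after_nodes: 3, gateway.expected_nodes: 3). Since CrateDB only loads gateway.expected_nodes at startup (it is not a runtime setting), a restart of the whole cluster is required to conclude the scale-down. This can be done with a rolling restart, but be sure to set SET GLOBAL PERSISTENT discovery.zen.minimum_master_nodes = <half of the cluster + 1>; properly, otherwise the consensus will not work...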
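As a rough illustration of which values belong in crate.yml after the scale-down, the sketch below applies the (N / 2) + 1 formula from the config comments and the "n" values from step 7 of the question. The helper names are hypothetical; the setting keys are the ones already shown in the crate.yml above.

```python
def minimum_master_nodes(node_count: int) -> int:
    """Quorum size: (N / 2) + 1 with integer division."""
    return node_count // 2 + 1


def crate_yml_settings(node_count: int) -> dict:
    """Node-count-dependent settings for a cluster of the given size.

    Assumption: recover_after_nodes and expected_nodes are both set to
    the full cluster size, as in the procedure quoted in the question.
    """
    return {
        "gateway.recover_after_nodes": node_count,
        "discovery.zen.minimum_master_nodes": minimum_master_nodes(node_count),
        "gateway.expected_nodes": node_count,
    }


if __name__ == "__main__":
    # Print the lines in crate.yml syntax for a 2-node cluster.
    for key, value in crate_yml_settings(2).items():
        print(f"{key}: {value}")
```

For a 2-node cluster this yields 2 for all three settings, so both nodes must be up for a master to be elected.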
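Also, it is recommended to scale down one node at a time, to avoid overloading the cluster with rebalancing and accidentally losing data.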