Rancher rke up fails health check on etcd hosts: remote error: tls: bad certificate

rke --debug up --config cluster.yml

The health check on the etcd hosts fails with this error:

DEBU[0281] [etcd] failed to check health for etcd host [x.x.x.x]: failed to get /health for host [x.x.x.x]: Get "https://x.x.x.x:2379/health": remote error: tls: bad certificate

Checking etcd health:

for endpoint in $(docker exec etcd /bin/sh -c "etcdctl member list | cut -d, -f5"); do
echo "Validating connection to ${endpoint}/health";
curl -w "\n" --cacert $(docker exec etcd printenv ETCDCTL_CACERT) --cert $(docker exec etcd printenv ETCDCTL_CERT) --key $(docker exec etcd printenv ETCDCTL_KEY) "${endpoint}/health";
done
Output when run on that master node:
Validating connection to https://x.x.x.x:2379/health
{"health":"true"}
Validating connection to https://x.x.x.x:2379/health
{"health":"true"}
Validating connection to https://x.x.x.x:2379/health
{"health":"true"}
Validating connection to https://x.x.x.x:2379/health
{"health":"true"}

You can also run it manually and see whether it responds correctly:
curl -w "\n" --cacert /etc/kubernetes/ssl/kube-ca.pem --cert /etc/kubernetes/ssl/kube-etcd-x-x-x-x.pem --key /etc/kubernetes/ssl/kube-etcd-x-x-x-x-key.pem https://x.x.x.x:2379/health
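Since "remote error: tls: bad certificate" means the remote side rejected the certificate it was presented, it may also help to confirm that the etcd certificate on disk really chains to kube-ca.pem. Below is a minimal sketch, assuming openssl is installed on the node; the helper name check_cert is made up for illustration and is not part of rke.

```shell
# Verify that a certificate chains to the given CA and print its identity fields.
# Usage: check_cert <ca.pem> <cert.pem>
# check_cert is a hypothetical helper name, not an rke or etcd command.
check_cert() {
    ca="$1"
    cert="$2"
    # Print subject, issuer and expiry date for a quick visual check.
    openssl x509 -in "$cert" -noout -subject -issuer -enddate || return 1
    # Exit status is non-zero if the certificate does not chain to the CA.
    openssl verify -CAfile "$ca" "$cert"
}
```

For example: check_cert /etc/kubernetes/ssl/kube-ca.pem /etc/kubernetes/ssl/kube-etcd-x-x-x-x.pem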

Checking the hashes of my self-signed certificates:

# md5sum /etc/kubernetes/ssl/kube-ca.pem
f5b358e771f8ae8495c703d09578eb3b  /etc/kubernetes/ssl/kube-ca.pem
# for key in $(cat /home/kube/cluster.rkestate | jq -r '.desiredState.certificatesBundle | keys[]'); do echo $(cat /home/kube/cluster.rkestate | jq -r --arg key $key '.desiredState.certificatesBundle[$key].certificatePEM' | sed '$ d' | md5sum) $key; done | grep kube-ca
f5b358e771f8ae8495c703d09578eb3b - kube-ca

Versions on my master node:
Debian GNU/Linux 10
rke version v1.3.1
docker version Version: 20.10.8
kubectl v1.21.5
v1.21.5-rancher1-1

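I think my cluster.rkestate has gone bad. Are there any other places where the rke tool checks certificates? Right now I cannot perform any operation on this production cluster, and I want to avoid downtime. I have tested the cluster in different scenarios, and as a last resort I could recreate it from scratch with rke remove && rke up, but maybe I can still fix it.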

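rke util get-state-file helped me rebuild the broken cluster.rkestate file. After that I was able to bring the cluster up successfully and add a new master node, which resolved the whole situation.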
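For anyone trying the same thing, here is a rough sketch of that step. It assumes the state file sits next to the config file and shares its base name (cluster.yml and cluster.rkestate, as in the question), and that get-state-file accepts the same --config flag as rke up. The function name rebuild_rkestate is only illustrative.

```shell
# Keep a copy of the suspect state file, then ask RKE to regenerate it.
# Usage: rebuild_rkestate [path/to/cluster.yml]
# rebuild_rkestate is a hypothetical wrapper name, not an rke subcommand.
rebuild_rkestate() {
    config="${1:-cluster.yml}"
    # Assumption: the state file is the config path with its extension swapped.
    state="${config%.*}.rkestate"
    if [ -f "$state" ]; then
        # Timestamped copy so repeated runs never overwrite an earlier backup.
        cp -p "$state" "$state.$(date +%s).orig" || return 1
    fi
    rke util get-state-file --config "$config"
}
```

Comparing the regenerated file with the saved copy (for instance with the md5sum loop from the question) shows which certificates differed.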

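The problem can be solved with the following steps:

Delete the kube_config_cluster.yml file from the directory where you run the rke up command (because some data is missing from the K8s nodes).
Delete the cluster.rkestate file.
Re-run the rke up command.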
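On a production cluster it is probably safer to move those two files aside than to delete them outright. The steps above could be sketched roughly like this, assuming the config file is named cluster.yml so that the generated files are kube_config_cluster.yml and cluster.rkestate. The function name and the backup directory layout are made up for illustration.

```shell
# Move the RKE kubeconfig and state file into a timestamped backup directory
# instead of deleting them, and print the backup location.
# Usage: backup_rke_files <directory containing cluster.yml>
# backup_rke_files is a hypothetical helper name, not part of rke.
backup_rke_files() {
    dir="$1"
    backup="$dir/rke-backup-$(date +%Y%m%d%H%M%S)"
    mkdir -p "$backup" || return 1
    for f in kube_config_cluster.yml cluster.rkestate; do
        # Files that are already absent are skipped silently.
        if [ -f "$dir/$f" ]; then
            mv "$dir/$f" "$backup/$f" || return 1
        fi
    done
    echo "$backup"
}
```

For example: backup_rke_files /home/kube && rke up --config /home/kube/cluster.yml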
