我正试图按照AWS的官方文档和EKS研讨会,在运行centos 7
的EC2实例上运行argocd
,但它处于挂起状态,kube-system
命名空间中的所有pod都运行良好。
以下是kubectl get pods --all-namespaces
的输出
NAMESPACE NAME READY STATUS RESTARTS AGE
argocd argocd-application-controller-5785f6b79-nvg7n 0/1 Pending 0 29s
argocd argocd-dex-server-7f5d7d6645-gprpd 0/1 Pending 0 19h
argocd argocd-redis-cccbb8f7-vb44n 0/1 Pending 0 19h
argocd argocd-repo-server-67ddb49495-pnw5k 0/1 Pending 0 19h
argocd argocd-server-6bcbf7997d-jqqrw 0/1 Pending 0 19h
kube-system calico-kube-controllers-56b44cd6d5-tzgdm 1/1 Running 0 19h
kube-system calico-node-4z9tx 1/1 Running 0 19h
kube-system coredns-f9fd979d6-8d6hm 1/1 Running 0 19h
kube-system coredns-f9fd979d6-p9dq6 1/1 Running 0 19h
kube-system etcd-ip-10-1-3-94.us-east-2.compute.internal 1/1 Running 0 19h
kube-system kube-apiserver-ip-10-1-3-94.us-east-2.compute.internal 1/1 Running 0 19h
kube-system kube-controller-manager-ip-10-1-3-94.us-east-2.compute.internal 1/1 Running 0 19h
kube-system kube-proxy-tkp7k 1/1 Running 0 19h
kube-system kube-scheduler-ip-10-1-3-94.us-east-2.compute.internal 1/1 Running 0 19h
虽然相同的配置在我的本地mac上运行良好,但我已经确保docker
、kubernetes
服务已经启动并运行。尝试删除pod,重新配置argocd,但每次结果都保持不变。
作为ArgoCD
的新手,我无法找出同样的原因。请告诉我哪里出了问题。谢谢
我通过运行发现了问题所在
kubectl describe pods <name> -n argocd
它给出了以FailedScheduling结尾的输出:
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 3m (x5 over 7m2s) default-scheduler 0/1 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
从此以后,通过引用这个GitHub问题,我发现要运行:
kubectl taint nodes --all node-role.kubernetes.io/master-
在该命令之后,pod开始工作,并从Pending
状态转换为Running
,kubectl describe pods
的输出显示为:
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 3m (x5 over 7m2s) default-scheduler 0/1 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
Normal Scheduled 106s default-scheduler Successfully assigned argocd/argocd-server-7d44dfbcc4-qfj6m to ip-XX-XX-XX-XX.<region>.compute.internal
Normal Pulling 105s kubelet Pulling image "argoproj/argocd:v1.7.6"
Normal Pulled 81s kubelet Successfully pulled image "argoproj/argocd:v1.7.6" in 23.779457251s
Normal Created 72s kubelet Created container argocd-server
Normal Started 72s kubelet Started container argocd-server
从这个错误和解决方案中,我学会了始终使用kubectl describe pods
来解决错误。