Unable to get ArgoCD working on an EC2 instance running CentOS 7



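I am trying to run ArgoCD on an EC2 instance running CentOS 7, following the official AWS documentation and the EKS workshop. The ArgoCD pods are stuck in the Pending state, while all pods in the kube-system namespace are running fine.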

Below is the output of kubectl get pods --all-namespaces:

NAMESPACE     NAME                                                              READY   STATUS    RESTARTS   AGE
argocd        argocd-application-controller-5785f6b79-nvg7n                     0/1     Pending   0          29s
argocd        argocd-dex-server-7f5d7d6645-gprpd                                0/1     Pending   0          19h
argocd        argocd-redis-cccbb8f7-vb44n                                       0/1     Pending   0          19h
argocd        argocd-repo-server-67ddb49495-pnw5k                               0/1     Pending   0          19h
argocd        argocd-server-6bcbf7997d-jqqrw                                    0/1     Pending   0          19h
kube-system   calico-kube-controllers-56b44cd6d5-tzgdm                          1/1     Running   0          19h
kube-system   calico-node-4z9tx                                                 1/1     Running   0          19h
kube-system   coredns-f9fd979d6-8d6hm                                           1/1     Running   0          19h
kube-system   coredns-f9fd979d6-p9dq6                                           1/1     Running   0          19h
kube-system   etcd-ip-10-1-3-94.us-east-2.compute.internal                      1/1     Running   0          19h
kube-system   kube-apiserver-ip-10-1-3-94.us-east-2.compute.internal            1/1     Running   0          19h
kube-system   kube-controller-manager-ip-10-1-3-94.us-east-2.compute.internal   1/1     Running   0          19h
kube-system   kube-proxy-tkp7k                                                  1/1     Running   0          19h
kube-system   kube-scheduler-ip-10-1-3-94.us-east-2.compute.internal            1/1     Running   0          19h

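The same configuration works fine on my local Mac. I have made sure that the Docker and Kubernetes services are up and running. I tried deleting the pods and reconfiguring ArgoCD, but the result is the same every time.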

Being new to ArgoCD, I cannot figure out the cause. Please let me know where I am going wrong. Thanks.

I found the problem by running:

kubectl describe pods <name> -n argocd

This produced output ending in a FailedScheduling event:

...
Events:
Type     Reason            Age                From               Message
----     ------            ----               ----               -------
Warning  FailedScheduling  3m (x5 over 7m2s)  default-scheduler  0/1 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
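
The message says that the only node in the cluster carries the node-role.kubernetes.io/master taint and the ArgoCD pods do not tolerate it, so the scheduler has nowhere to place them. A minimal sketch for checking which nodes carry which taints is shown here; the function name list_node_taints is hypothetical, and the command only reads cluster state:

```shell
# Print every node alongside the keys of its taints.
# Read-only: nothing in the cluster is modified.
list_node_taints() {
  kubectl get nodes -o custom-columns='NODE:.metadata.name,TAINTS:.spec.taints[*].key'
}
```

Calling list_node_taints on a single-node cluster like this one would be expected to show the master taint next to the node name; the exact column output depends on the cluster.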

From there, by referring to this GitHub issue, I found that I needed to run:

kubectl taint nodes --all node-role.kubernetes.io/master-

After that command, the pods started working and moved from Pending to Running. The output of kubectl describe pods then showed:

...
Events:
Type     Reason            Age                From               Message
----     ------            ----               ----               -------
Warning  FailedScheduling  3m (x5 over 7m2s)  default-scheduler  0/1 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
Normal   Scheduled         106s               default-scheduler  Successfully assigned argocd/argocd-server-7d44dfbcc4-qfj6m to ip-XX-XX-XX-XX.<region>.compute.internal
Normal   Pulling           105s               kubelet            Pulling image "argoproj/argocd:v1.7.6"
Normal   Pulled            81s                kubelet            Successfully pulled image "argoproj/argocd:v1.7.6" in 23.779457251s
Normal   Created           72s                kubelet            Created container argocd-server
Normal   Started           72s                kubelet            Started container argocd-server

From this error and its resolution, I learned to always use kubectl describe pods when troubleshooting.
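
As a complement to describing pods one at a time, the same information can be gathered for a whole namespace. This is a rough sketch, not part of the original fix: the function name pending_report and its default namespace are assumptions made for illustration, and both commands only read cluster state.

```shell
# List Pending pods and FailedScheduling events in one namespace.
# $1: namespace to inspect (defaults to argocd, an assumption for this example).
pending_report() {
  pending_report_ns="${1:-argocd}"
  # Pods the scheduler has not yet placed on a node
  kubectl get pods -n "$pending_report_ns" --field-selector status.phase=Pending
  # Scheduler events explaining why placement failed
  kubectl get events -n "$pending_report_ns" --field-selector reason=FailedScheduling
}
```

Running pending_report argocd before removing the taint would be expected to list the five ArgoCD pods and the taint-related scheduling warnings.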
