I have a Kubernetes cluster running on an on-premises server, and I also have a server on Naver Cloud; let's call it server A. I want to join server A to my Kubernetes cluster. The server joins successfully, but the kube-proxy and kube-flannel pods spawned by their DaemonSets stay in CrashLoopBackOff.
Here is the kube-proxy log:
I0405 03:13:48.566285 1 node.go:163] Successfully retrieved node IP: 10.1.0.2
I0405 03:13:48.566382 1 server_others.go:109] "Detected node IP" address="10.1.0.2"
I0405 03:13:48.566420 1 server_others.go:535] "Using iptables proxy"
I0405 03:13:48.616989 1 server_others.go:176] "Using iptables Proxier"
I0405 03:13:48.617021 1 server_others.go:183] "kube-proxy running in dual-stack mode" ipFamily=IPv4
I0405 03:13:48.617040 1 server_others.go:184] "Creating dualStackProxier for iptables"
I0405 03:13:48.617063 1 server_others.go:465] "Detect-local-mode set to ClusterCIDR, but no IPv6 cluster CIDR defined, , defaulting to no-op detect-local for IPv6"
I0405 03:13:48.617093 1 proxier.go:242] "Setting route_localnet=1 to allow node-ports on localhost; to change this either disable iptables.localhostNodePorts (--iptables-localhost-nodeports) or set nodePortAddresses (--nodeport-addresses) to filter loopback addresses"
I0405 03:13:48.617420 1 server.go:655] "Version info" version="v1.26.0"
I0405 03:13:48.617435 1 server.go:657] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
I0405 03:13:48.618790 1 conntrack.go:52] "Setting nf_conntrack_max" nf_conntrack_max=131072
There are no kube-flannel logs; the kube-flannel pod fails on its init container named install-cni-plugin, and when I try kubectl -n kube-flannel logs kube-flannel-ds-d2l4q -c install-cni-plugin it returns
unable to retrieve container logs for docker://47e4c8c580474b384b128c8e4d74297a0e891b5f227c6313146908b06ee7b376
I can't think of any other clues. If I need to attach more information, please let me know.
Please help me, I've been stuck on this for far too long.
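In case it helps, the runtime on server A is Docker (the container IDs above are docker://), so I can also inspect the failing init container directly on the node, for example:

docker ps -a --filter id=47e4c8c580474b384b128c8e4d74297a0e891b5f227c6313146908b06ee7b376 --format '{{.ID}} {{.Status}} {{.Names}}'
docker logs 47e4c8c580474b384b128c8e4d74297a0e891b5f227c6313146908b06ee7b376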
More information:
kubectl get nodes
NAME                      STATUS     ROLES           AGE   VERSION
accio-randi-ed05937533    Ready      <none>          8d    v1.26.3
accio-test-1-b3fb4331ee   NotReady   <none>          89m   v1.26.3
master                    Ready      control-plane   48d   v1.26.1
kubectl -n kube-system get pods
NAME                             READY   STATUS             RESTARTS         AGE
coredns-787d4945fb-rms6t         1/1     Running            0                30d
coredns-787d4945fb-t6g8s         1/1     Running            0                33d
etcd-master                      1/1     Running            168 (36d ago)    48d
kube-apiserver-master            1/1     Running            158 (36d ago)    48d
kube-controller-manager-master   1/1     Running            27 (6d17h ago)   48d
kube-proxy-2r8tn                 1/1     Running            6 (36d ago)      48d
kube-proxy-f997t                 0/1     CrashLoopBackOff   39 (90s ago)     87m
kube-proxy-wc9x5                 1/1     Running            0                8d
kube-scheduler-master            1/1     Running            27 (6d17h ago)   48d
kubectl -n kube-system get events
LAST SEEN TYPE REASON OBJECT MESSAGE
42s Warning DNSConfigForming pod/coredns-787d4945fb-rms6t Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.8.0.1 192.168.18.1 fe80::1%3
54s Warning DNSConfigForming pod/coredns-787d4945fb-t6g8s Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.8.0.1 192.168.18.1 fe80::1%3
3m10s Warning DNSConfigForming pod/etcd-master Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.8.0.1 192.168.18.1 fe80::1%3
2m48s Warning DNSConfigForming pod/kube-apiserver-master Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.8.0.1 192.168.18.1 fe80::1%3
3m33s Warning DNSConfigForming pod/kube-controller-manager-master Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.8.0.1 192.168.18.1 fe80::1%3
3m7s Warning DNSConfigForming pod/kube-proxy-2r8tn Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.8.0.1 192.168.18.1 fe80::1%3
15s Normal SandboxChanged pod/kube-proxy-f997t Pod sandbox changed, it will be killed and re-created.
5m15s Warning BackOff pod/kube-proxy-f997t Back-off restarting failed container kube-proxy in pod kube-proxy-f997t_kube-system(7652a1c4-9517-4a8a-a736-1f746f36c7ab)
3m30s Warning DNSConfigForming pod/kube-scheduler-master Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.8.0.1 192.168.18.1 fe80::1%3
kubectl -n kube-flannel get pods
NAME                    READY   STATUS                  RESTARTS      AGE
kube-flannel-ds-2xgbw   1/1     Running                 0             8d
kube-flannel-ds-htgts   0/1     Init:CrashLoopBackOff   0 (2s ago)    88m
kube-flannel-ds-sznbq   1/1     Running                 6 (36d ago)   48d
kubectl -n kube-flannel get events
LAST SEEN TYPE REASON OBJECT MESSAGE
100s Normal SandboxChanged pod/kube-flannel-ds-htgts Pod sandbox changed, it will be killed and re-created.
26m Normal Pulled pod/kube-flannel-ds-htgts Container image "docker.io/flannel/flannel-cni-plugin:v1.1.2" already present on machine
46m Warning BackOff pod/kube-flannel-ds-htgts Back-off restarting failed container install-cni-plugin in pod kube-flannel-ds-htgts_kube-flannel(4f602997-5502-4dcf-8fca-23eba01325dd)
5m Warning DNSConfigForming pod/kube-flannel-ds-sznbq Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.8.0.1 192.168.18.1 fe80::1%3
kubectl -n kube-flannel describe pod kube-flannel-ds-htgts
Name: kube-flannel-ds-htgts
Namespace: kube-flannel
Priority: 2000001000
Priority Class Name: system-node-critical
Service Account: flannel
Node: accio-test-1-b3fb4331ee/10.1.0.2
Start Time: Thu, 06 Apr 2023 09:25:12 +0900
Labels: app=flannel
controller-revision-hash=6b7b59d784
k8s-app=flannel
pod-template-generation=1
tier=node
Annotations: <none>
Status: Pending
IP: 10.1.0.2
IPs:
IP: 10.1.0.2
Controlled By: DaemonSet/kube-flannel-ds
Init Containers:
install-cni-plugin:
Container ID: docker://0fed30cc41f305203bf5d6fb7668f92f449a65f722faf1360e61231e9107ef66
Image: docker.io/flannel/flannel-cni-plugin:v1.1.2
Image ID: docker-pullable://flannel/flannel-cni-plugin@sha256:bf4b62b131666d040f35a327d906ee5a3418280b68a88d9b9c7e828057210443
Port: <none>
Host Port: <none>
Command:
cp
Args:
-f
/flannel
/opt/cni/bin/flannel
State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 06 Apr 2023 15:11:34 +0900
Finished: Thu, 06 Apr 2023 15:11:34 +0900
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/opt/cni/bin from cni-plugin (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-gbk6z (ro)
install-cni:
Container ID:
Image: docker.io/flannel/flannel:v0.21.0
Image ID:
Port: <none>
Host Port: <none>
Command:
cp
Args:
-f
/etc/kube-flannel/cni-conf.json
/etc/cni/net.d/10-flannel.conflist
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/etc/cni/net.d from cni (rw)
/etc/kube-flannel/ from flannel-cfg (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-gbk6z (ro)
Containers:
kube-flannel:
Container ID:
Image: docker.io/flannel/flannel:v0.21.0
Image ID:
Port: <none>
Host Port: <none>
Command:
/opt/bin/flanneld
Args:
--ip-masq
--kube-subnet-mgr
--iface=accio-k8s-net
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Requests:
cpu: 100m
memory: 50Mi
Environment:
POD_NAME: kube-flannel-ds-htgts (v1:metadata.name)
POD_NAMESPACE: kube-flannel (v1:metadata.namespace)
KUBERNETES_SERVICE_HOST: 10.1.0.1
KUBERNETES_SERVICE_PORT: 6443
EVENT_QUEUE_DEPTH: 5000
Mounts:
/etc/kube-flannel/ from flannel-cfg (rw)
/run/flannel from run (rw)
/run/xtables.lock from xtables-lock (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-gbk6z (ro)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
run:
Type: HostPath (bare host directory volume)
Path: /run/flannel
HostPathType:
cni-plugin:
Type: HostPath (bare host directory volume)
Path: /opt/cni/bin
HostPathType:
cni:
Type: HostPath (bare host directory volume)
Path: /etc/cni/net.d
HostPathType:
flannel-cfg:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: kube-flannel-cfg
Optional: false
xtables-lock:
Type: HostPath (bare host directory volume)
Path: /run/xtables.lock
HostPathType: FileOrCreate
kube-api-access-gbk6z:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: :NoSchedule op=Exists
node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/network-unavailable:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning BackOff 31m (x8482 over 5h46m) kubelet Back-off restarting failed container install-cni-plugin in pod kube-flannel-ds-htgts_kube-flannel(4f602997-5502-4dcf-8fca-23eba01325dd)
Normal Created 21m (x8783 over 5h46m) kubelet Created container install-cni-plugin
Normal Pulled 11m (x9051 over 5h46m) kubelet Container image "docker.io/flannel/flannel-cni-plugin:v1.1.2" already present on machine
Normal SandboxChanged 81s (x18656 over 5h46m) kubelet Pod sandbox changed, it will be killed and re-created.
I ran into a similar problem on one of my nodes because of a misconfigured container runtime. Please check the containerd configuration located at /etc/containerd/config.toml, which specifies the daemon-level options. A default configuration can be generated by running

containerd config default > /etc/containerd/config.toml

To use the systemd cgroup driver with runc in /etc/containerd/config.toml, set

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  ...
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true

If the cgroup driver is wrong, it can leave the pods on that node permanently in CrashLoopBackOff.
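For reference, here is a minimal sketch of how to verify the driver on a containerd node (your describe output shows docker:// container IDs, so server A may be running a different runtime; the kubelet config path below assumes a kubeadm setup):

# confirm containerd is configured for the systemd cgroup driver
grep -n 'SystemdCgroup' /etc/containerd/config.toml        # expect: SystemdCgroup = true
# flip it if needed, then restart the runtime
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd
# the kubelet must use the same driver (kubeadm defaults to systemd since v1.22)
grep -n 'cgroupDriver' /var/lib/kubelet/config.yaml        # expect: cgroupDriver: systemd
sudo systemctl restart kubelet

After the restart, deleting the crashing pods lets their DaemonSets recreate them against the corrected runtime.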
Depending on where the crash happens, which can be very early during startup or application execution, you may not always see logs.
If no logs show up, the pod may be missing a resource it requested, such as a Secret or a volume.
You can get full details about the resources and related events by running the commands below, which helps you quickly understand and fix the issue (an example follows this list):
a. kubectl get events
b. kubectl describe pod <pod_name>
c. kubectl get pods
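For instance, using the names from your own output (--sort-by and --field-selector are standard kubectl flags):

kubectl -n kube-flannel get events --sort-by=.metadata.creationTimestamp
kubectl -n kube-flannel describe pod kube-flannel-ds-htgts
kubectl -n kube-system get events --field-selector involvedObject.name=kube-proxy-f997t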
Check the following possible causes of CrashLoopBackOff:
- The kernel does not allow some conntrack fields to be set from a non-init network namespace. By default kube-proxy tries to set them, fails, and the pod ends up in CrashLoopBackOff. With kubeadm you can work around this by setting the value on the host so that kube-proxy does not have to change it (see the sketch after this list):
  a. Delete the local cluster first
  b. Set
     sudo sysctl net/netfilter/nf_conntrack_max=131072
  c. Start a new local cluster
- Pods cannot always reach the ClusterIP. The kube-proxy iptables option masqueradeAll defaults to false, and it may have been incorrectly set to true. See GitHub issue #2849 for details.
- Check whether the cluster's subnet differs from the subnet in the flannel manifest YAML; if it does, change the subnet in the flannel config YAML to match the one the cluster was initialized with (also sketched after this list). You can also refer to Edgar Huynh's answer on the post "CrashLoopBackOff status in Kube flannel", which may help resolve your issue.
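A minimal sketch of the conntrack workaround on server A, using the value 131072 that kube-proxy reports in your log (the file name under /etc/sysctl.d is only an example):

# check the current value on the node
sysctl net.netfilter.nf_conntrack_max
# raise it so kube-proxy no longer needs to change it, and persist it across reboots
sudo sysctl -w net.netfilter.nf_conntrack_max=131072
echo 'net.netfilter.nf_conntrack_max = 131072' | sudo tee /etc/sysctl.d/90-conntrack.conf

If I remember the kube-proxy configuration correctly, setting conntrack.maxPerCore: 0 in the kube-proxy ConfigMap (kubectl -n kube-system edit cm kube-proxy) tells it to leave the limit as-is, which is another way to avoid the crash; delete the failing pod afterwards so it restarts with the new configuration.

For the subnet check, a sketch that compares the pod CIDR the cluster assigned to each node with the Network value in the flannel ConfigMap (the backslash escapes the dot in the ConfigMap key name):

# pod CIDR per node, as assigned by the cluster
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'
# Network configured for flannel; the two ranges must match
kubectl -n kube-flannel get cm kube-flannel-cfg -o jsonpath='{.data.net-conf\.json}'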
EDIT:
Based on the Events you provided:
Check whether /etc/resolv.conf on the node contains more nameservers than the cluster DNS configuration allows. To resolve the DNS configuration issue, see the known issues section of the official Kubernetes DNS documentation and the "Nameserver limits were exceeded" discussions around the kernel and flannel, and also the Kubernetes community forum thread "Why does etcd fail with Debian/bullseye kernel?", which may help resolve your issue.
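A quick way to check this on the node (a sketch; which resolv.conf the kubelet hands to pods depends on its resolvConf setting, commonly /run/systemd/resolve/resolv.conf on hosts running systemd-resolved):

# count nameserver entries on the host; only the first three are honored
grep -c '^nameserver' /etc/resolv.conf
# see which resolv.conf the kubelet passes to pods, if set explicitly
grep 'resolvConf' /var/lib/kubelet/config.yaml

Trimming the list to three entries or fewer (for example by dropping the IPv6 link-local fe80::1%3 shown in your events) should clear the DNSConfigForming warnings.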