Kubelet stopped posting node status, and kubelet reports node "k8sslave1" not found in Kubernetes



My local-machine Kubernetes cluster was running fine yesterday, until I installed some components. slave1 and slave2 each have only 4 GB of RAM, and I found that each had only 100 MB or so of free memory, so I stopped the VMs, increased the KVM virtual machine memory to 8 GB, and re-checked to make sure each node had more than 2 GB of free memory. Now the slave1 and slave2 nodes are not working properly. Here is the node status:

[root@k8smaster ~]# kubectl get nodes -o wide
NAME        STATUS     ROLES    AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
k8smaster   Ready      master   12d   v1.18.5   192.168.31.29   <none>        CentOS Linux 8 (Core)   4.18.0-193.6.3.el8_2.x86_64   docker://19.3.12
k8sslave1   NotReady   <none>   12d   v1.18.5   192.168.31.30   <none>        CentOS Linux 8 (Core)   4.18.0-193.6.3.el8_2.x86_64   docker://19.3.12
k8sslave2   NotReady   <none>   12d   v1.18.5   192.168.31.31   <none>        CentOS Linux 8 (Core)   4.18.0-193.6.3.el8_2.x86_64   docker://19.3.12
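
Before digging into Kubernetes itself, the resize can be sanity-checked on each slave; the "available" column of free should now show well over 2 GB:

    # run on k8sslave1 and k8sslave2 to confirm the KVM memory resize took effect
    free -h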

Then I checked the status of one of the failing nodes; it looks like this:

[root@k8smaster ~]# kubectl describe node k8sslave1
Name:               k8sslave1
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=k8sslave1
                    kubernetes.io/os=linux
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    projectcalico.org/IPv4Address: 192.168.31.30/24
                    projectcalico.org/IPv4IPIPTunnelAddr: 10.11.157.64
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Mon, 13 Jul 2020 11:50:48 -0400
Taints:             node.kubernetes.io/unreachable:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  k8sslave1
  AcquireTime:     <unset>
  RenewTime:       Sat, 25 Jul 2020 09:47:24 -0400
Conditions:
  Type                 Status    LastHeartbeatTime                 LastTransitionTime                Reason              Message
  ----                 ------    -----------------                 ------------------                ------              -------
  NetworkUnavailable   False     Sat, 25 Jul 2020 00:48:55 -0400   Sat, 25 Jul 2020 00:48:55 -0400   CalicoIsUp          Calico is running on this node
  MemoryPressure       Unknown   Sat, 25 Jul 2020 09:43:45 -0400   Sat, 25 Jul 2020 09:48:07 -0400   NodeStatusUnknown   Kubelet stopped posting node status.
  DiskPressure         Unknown   Sat, 25 Jul 2020 09:43:45 -0400   Sat, 25 Jul 2020 09:48:07 -0400   NodeStatusUnknown   Kubelet stopped posting node status.
  PIDPressure          Unknown   Sat, 25 Jul 2020 09:43:45 -0400   Sat, 25 Jul 2020 09:48:07 -0400   NodeStatusUnknown   Kubelet stopped posting node status.
  Ready                Unknown   Sat, 25 Jul 2020 09:43:45 -0400   Sat, 25 Jul 2020 09:48:07 -0400   NodeStatusUnknown   Kubelet stopped posting node status.
Addresses:
  InternalIP:  192.168.31.30
  Hostname:    k8sslave1
Capacity:
  cpu:                2
  ephemeral-storage:  36702712Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             4311228Ki
  pods:               110
Allocatable:
  cpu:                2
  ephemeral-storage:  33825219324
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             4208828Ki
  pods:               110
System Info:
  Machine ID:                 0c9c1291618645498e63ddfe3895658a
  System UUID:                b25d27cf-4dea-44fe-96d8-a75e0c138187
  Boot ID:                    3290a714-0e18-47dd-a811-2dd16c8a17c9
  Kernel Version:             4.18.0-193.6.3.el8_2.x86_64
  OS Image:                   CentOS Linux 8 (Core)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://19.3.12
  Kubelet Version:            v1.18.5
  Kube-Proxy Version:         v1.18.5
PodCIDR:                      10.11.1.0/24
PodCIDRs:                     10.11.1.0/24
Non-terminated Pods:          (16 in total)
  Namespace                   Name                                                              CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                   ----                                                              ------------  ----------  ---------------  -------------  ---
  default                     apm-server-filebeat-dn48j                                         100m (5%)     1 (50%)     100Mi (2%)       200Mi (4%)     15h
  default                     traefik-88f7c94bf-tdz2m                                           0 (0%)        0 (0%)      0 (0%)           0 (0%)         8d
  infrastructure              elasticsearch-elasticsearch-coordinating-only-7744945d6d-zwz7z    25m (1%)      0 (0%)      256Mi (6%)       0 (0%)         15h
  infrastructure              harbor-harbor-chartmuseum-575cdf84f6-2m5t5                        0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d14h
  infrastructure              harbor-harbor-clair-6464c85c99-zb997                              0 (0%)        0 (0%)      0 (0%)           0 (0%)         35h
  infrastructure              harbor-harbor-notary-signer-5d9b779f54-fwzl8                      0 (0%)        0 (0%)      0 (0%)           0 (0%)         36h
  infrastructure              harbor-harbor-portal-59c779dd74-lj5zl                             0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d14h
  infrastructure              harbor-harbor-registry-6ffb84b667-cvxwq                           0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d14h
  infrastructure              harbor-harbor-trivy-0                                             200m (10%)    1 (50%)     512Mi (12%)      1Gi (24%)      36h
  infrastructure              jenkins-845bd5bcd4-4mkqn                                          50m (2%)      2 (100%)    256Mi (6%)       4Gi (99%)      4d14h
  kube-system                 calico-kube-controllers-75d555c48-wd84b                           0 (0%)        0 (0%)      0 (0%)           0 (0%)         12d
  kube-system                 calico-node-2sj6v                                                 250m (12%)    0 (0%)      0 (0%)           0 (0%)         12d
  kube-system                 coredns-676d976fcb-bxzcs                                          100m (5%)     0 (0%)      70Mi (1%)        170Mi (4%)     9d
  kube-system                 kube-proxy-f4lg4                                                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         12d
  monitoring                  prometheus-1595085197-node-exporter-ztgd4                         0 (0%)        0 (0%)      0 (0%)           0 (0%)         7d13h
  monitoring                  prometheus-1595085197-server-57967bb676-ksl2k                     0 (0%)        0 (0%)      0 (0%)           0 (0%)         7d13h
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                725m (36%)    4 (200%)
  memory             1194Mi (29%)  5490Mi (133%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-1Gi      0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)
Events:              <none>
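
The stale RenewTime under Lease is the kubelet heartbeat: once it stops being renewed for longer than the node controller's grace period (40s by default), the node conditions flip to Unknown. It can be cross-checked directly; node leases live in the kube-node-lease namespace by default:

    # the node lease carries the same name as the node; renewTime should match the describe output
    kubectl get lease k8sslave1 -n kube-node-lease -o yaml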

It tells me the kubelet stopped posting node status, so I checked the kubelet service status on the slave1 node:

[root@k8sslave1 ~]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Sun 2020-07-26 00:30:45 EDT; 8min ago
     Docs: https://kubernetes.io/docs/
 Main PID: 7192 (kubelet)
    Tasks: 17 (limit: 49628)
   Memory: 41.5M
   CGroup: /system.slice/kubelet.service
           └─7192 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=systemd --network-plugin=cni --pod-infra-container-image>
Jul 26 00:39:31 k8sslave1 kubelet[7192]: E0726 00:39:31.625224    7192 kubelet.go:2267] node "k8sslave1" not found
Jul 26 00:39:31 k8sslave1 kubelet[7192]: E0726 00:39:31.725575    7192 kubelet.go:2267] node "k8sslave1" not found
Jul 26 00:39:31 k8sslave1 kubelet[7192]: E0726 00:39:31.825956    7192 kubelet.go:2267] node "k8sslave1" not found
Jul 26 00:39:31 k8sslave1 kubelet[7192]: E0726 00:39:31.929822    7192 kubelet.go:2267] node "k8sslave1" not found
Jul 26 00:39:32 k8sslave1 kubelet[7192]: E0726 00:39:32.030028    7192 kubelet.go:2267] node "k8sslave1" not found
Jul 26 00:39:32 k8sslave1 kubelet[7192]: E0726 00:39:32.130344    7192 kubelet.go:2267] node "k8sslave1" not found
Jul 26 00:39:32 k8sslave1 kubelet[7192]: E0726 00:39:32.230562    7192 kubelet.go:2267] node "k8sslave1" not found
Jul 26 00:39:32 k8sslave1 kubelet[7192]: E0726 00:39:32.330896    7192 kubelet.go:2267] node "k8sslave1" not found
Jul 26 00:39:32 k8sslave1 kubelet[7192]: E0726 00:39:32.431111    7192 kubelet.go:2267] node "k8sslave1" not found
Jul 26 00:39:32 k8sslave1 kubelet[7192]: E0726 00:39:32.531472    7192 kubelet.go:2267] node "k8sslave1" not found
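
A quick way to reproduce what the kubelet sees, run from the slave with the kubeconfig path taken from the unit file above (the kubelet's own credentials are allowed to read its Node object):

    # queries the API server with the kubelet's credentials; fails the same way if the Node object is gone
    kubectl --kubeconfig /etc/kubernetes/kubelet.conf get node k8sslave1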

The process is running fine, but it keeps logging: node "k8sslave1" not found. Why does it log this, and how do I fix it?

Are you using kubeadm? If you are, you can follow these steps:

  1. Delete the slave node; on the master node, run:

    kubectl delete node k8sslave1

  2. On the slave node, run:

    kubeadm reset

  3. Then you need to join the slave node to the cluster again; on the master node, run:

    token=$(kubeadm token generate)

    kubeadm token create $token --ttl 2h --print-join-command

  4. Paste the command printed in step 3 into the slave node and run it (see the example after this list):

    kubeadm join ...

  5. Check that the node has joined the cluster and that its new status is Ready:

    ubuntu@kube-master:~$ kubectl get nodes

    NAME            STATUS   ROLES    AGE   VERSION
    kube-master     Ready    master   20d   v1.18.1
    kube-worker-1   Ready    <none>   20d   v1.18.1
    kube-worker-2   Ready    <none>   12m   v1.18.1
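
For reference, the join command printed in step 3 has this shape (the token and hash below are placeholders, not values from this cluster; 6443 is the default kubeadm API server port):

    kubeadm join 192.168.31.29:6443 --token <token> \
        --discovery-token-ca-cert-hash sha256:<hash>

The --discovery-token-ca-cert-hash pins the cluster CA, so the rejoining slave only trusts the API server it was pointed at.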
    

I hope this works for you. :)
