coredns is running but not ready after conjure-up of k8s CDK



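I deployed Kubernetes v1.18.2 (CDK) with conjure-up on bionic. Update: I destroyed that environment completely and redeployed manually using the CDK bundle at https://jaas.ai/canonical-kubernetes, with the same K8s version and the same OS version (Ubuntu 18.04); it made no difference.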

coredns resolves upstream names through /etc/resolv.conf; see the configmap below:

Name:         coredns
Namespace:    kube-system
Labels:       cdk-addons=true
Annotations:  
Data
====
Corefile:
----
.:53 {
    errors
    health {
      lameduck 5s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
      fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}
Events:  <none>

There is a known issue, described at https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/#known- , about /etc/resolv.conf being used instead of /run/systemd/resolve/resolv.conf.

I edited the coredns configmap to point it at /run/systemd/resolve/resolv.conf, but the setting gets reverted.
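
For context on that known issue: on hosts running systemd-resolved, /etc/resolv.conf usually lists only the local stub resolver 127.0.0.53, while /run/systemd/resolve/resolv.conf lists the real upstream servers. Below is a minimal Python sketch for checking which kind of file a node has. The helper names and the sample contents are illustrative assumptions, not part of CDK or CoreDNS.

```python
import ipaddress


def nameservers(resolv_conf_text):
    """Return the addresses on 'nameserver' lines of a resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Ignore anything after '#' and split the rest into fields.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def loopback_nameservers(resolv_conf_text):
    """Return the nameservers that are loopback addresses (e.g. 127.0.0.53)."""
    loopback = []
    for server in nameservers(resolv_conf_text):
        try:
            # Drop an IPv6 zone suffix such as "%eth0" before parsing.
            address = ipaddress.ip_address(server.split("%", 1)[0])
        except ValueError:
            continue  # not an IP address; skip it
        if address.is_loopback:
            loopback.append(server)
    return loopback


# Hypothetical sample contents, not taken from the cluster in this question.
STUB_SAMPLE = "# stub resolver\nnameserver 127.0.0.53\noptions edns0\n"
UPSTREAM_SAMPLE = "nameserver 192.0.2.53\nsearch example.internal\n"

if __name__ == "__main__":
    for path in ("/etc/resolv.conf", "/run/systemd/resolve/resolv.conf"):
        try:
            with open(path) as handle:
                print(path, loopback_nameservers(handle.read()))
        except OSError as error:
            print(path, "not readable:", error)
```

A non-empty list for the file that CoreDNS forwards to would suggest the stub-resolver problem; an empty list suggests the cause lies elsewhere.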

I also tried setting kubelet-extra-config to {resolvConf: /run/systemd/resolve/resolv.conf} and restarted the servers, with no change:

kubelet-extra-config:
  default: '{}'
  description: |
    Extra configuration to be passed to kubelet. Any values specified in this
    config will be merged into a KubeletConfiguration file that is passed to
    the kubelet service via the --config flag. This can be used to override
    values provided by the charm.
    Requires Kubernetes 1.10+.
    The value for this config must be a YAML mapping that can be safely
    merged with a KubeletConfiguration file. For example:
    {evictionHard: {memory.available: 200Mi}}
    For more information about KubeletConfiguration, see upstream docs:
    https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/
  source: user
  type: string
  value: '{resolvConf: /run/systemd/resolve/resolv.conf}'

However, when I inspect the configuration as described at https://kubernetes.io/docs/tasks/administer-cluster/reconfigure-kubelet/, I can see the change in the kubelet config:

...
"resolvConf": "/run/systemd/resolve/resolv.conf",
...

This is the error I get in the coredns pod:

E0429 09:16:42.172959       1 reflector.go:153] pkg/mod/k8s.io/client-go@v0.17.2/tools/cache/reflector.go:105: Failed to list *v1.Endpoints: Get https://10.152.183.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.152.183.1:443: i/o timeout
[INFO] plugin/ready: Still waiting on: "kubernetes"
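
The `i/o timeout` above is a TCP connect timeout to the kubernetes service IP, which suggests a pod-to-service networking problem rather than a DNS configuration problem. Here is a rough Python sketch of the same check, meant to be run from inside a pod. `can_connect` is a hypothetical helper; the address and port are copied from the log line.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False


if __name__ == "__main__":
    # Service IP and port taken from the CoreDNS error message.
    print("10.152.183.1:443 reachable:", can_connect("10.152.183.1", 443))
```

If this fails from a pod while the API server endpoints answer directly from the host, the pod network or the service routing is the likely place to look.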

See the kubernetes service:

default                           kubernetes                               ClusterIP   10.152.183.1     <none>        443/TCP                  4h42m   <none>

Here is the coredns deployment:

Name:                   coredns
Namespace:              kube-system
CreationTimestamp:      Wed, 29 Apr 2020 09:15:07 +0000
Labels:                 cdk-addons=true
                        cdk-restart-on-ca-change=true
                        k8s-app=kube-dns
                        kubernetes.io/name=CoreDNS
Annotations:            deployment.kubernetes.io/revision: 1
Selector:               k8s-app=kube-dns
Replicas:               1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  1 max unavailable, 25% max surge
Pod Template:
  Labels:           k8s-app=kube-dns
  Service Account:  coredns
  Containers:
   coredns:
    Image:       rocks.canonical.com:443/cdk/coredns/coredns-amd64:1.6.7
    Ports:       53/UDP, 53/TCP, 9153/TCP
    Host Ports:  0/UDP, 0/TCP, 0/TCP
    Args:
      -conf
      /etc/coredns/Corefile
    Limits:
      memory:  170Mi
    Requests:
      cpu:        100m
      memory:     70Mi
    Liveness:     http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
    Readiness:    http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /etc/coredns from config-volume (ro)
  Volumes:
   config-volume:
    Type:               ConfigMap (a volume populated by a ConfigMap)
    Name:               coredns
    Optional:           false
  Priority Class Name:  system-cluster-critical
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    False   ProgressDeadlineExceeded
OldReplicaSets:  <none>
NewReplicaSet:   coredns-6b59b8bd9f (1/1 replicas created)
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  11m   deployment-controller  Scaled up replica set coredns-6b59b8bd9f to 1

Can anyone help?

More info: the k8s svc is configured correctly:

Name:              kubernetes
Namespace:         default
Labels:            component=apiserver
                   provider=kubernetes
Annotations:       <none>
Selector:          <none>
Type:              ClusterIP
IP:                10.152.183.1
Port:              https  443/TCP
TargetPort:        6443/TCP
Endpoints:         xx.xx.xx.xx:6443,xx.xx.xx.yy:6443
Session Affinity:  None
Events:            <none>

I can curl both IP addresses in insecure mode.

Describing the EP:

kubectl describe ep kubernetes 
Name:         kubernetes
Namespace:    default
Labels:       <none>
Annotations:  <none>
Subsets:
  Addresses:          xx.xx.xx.xx,xx.xx.xx.yy
  NotReadyAddresses:  <none>
  Ports:
    Name   Port  Protocol
    ----   ----  --------
    https  6443  TCP
Events:  <none>

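More findings: most of the vnets that juju created during the CDK deployment do not seem to be running. I suspect this is because of apparmor, based on this note from https://jaas.ai/canonical-kubernetes/bundle/21: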

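Note: If you desire to deploy this bundle locally on your laptop, see the segment about Conjure-Up under Alternative Deployment Methods. Default deployment via juju will not properly adjust the apparmor profile to support running kubernetes in LXD. At this time, it is a necessary intermediate deployment mechanism.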

7: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
    link/ether fe:54:00:f0:0c:29 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fef0:c29/64 scope link
       valid_lft forever preferred_lft forever
70: vnet12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
    link/ether fe:54:00:00:a3:94 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fe00:a394/64 scope link
       valid_lft forever preferred_lft forever
72: vnet13: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
    link/ether fe:54:00:15:17:f4 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fe15:17f4/64 scope link
       valid_lft forever preferred_lft forever
74: vnet14: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
    link/ether fe:54:00:ec:5c:72 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:feec:5c72/64 scope link
       valid_lft forever preferred_lft forever
76: vnet15: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
    link/ether fe:54:00:60:79:18 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fe60:7918/64 scope link
       valid_lft forever preferred_lft forever
79: vnet16: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
    link/ether fe:54:00:67:ff:14 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fe67:ff14/64 scope link
       valid_lft forever preferred_lft forever
81: vnet17: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
    link/ether fe:54:00:96:71:01 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fe96:7101/64 scope link
       valid_lft forever preferred_lft forever
83: vnet18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
    link/ether fe:54:00:a8:1d:b7 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fea8:1db7/64 scope link
       valid_lft forever preferred_lft forever
85: vnet19: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
    link/ether fe:54:00:2a:89:c1 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fe2a:89c1/64 scope link
       valid_lft forever preferred_lft forever
87: vnet20: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
    link/ether fe:54:00:4e:ce:fb brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fe4e:cefb/64 scope link
       valid_lft forever preferred_lft forever
89: vnet21: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
    link/ether fe:54:00:93:55:ac brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fe93:55ac/64 scope link
       valid_lft forever preferred_lft forever
90: vnet22: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
    link/ether fe:54:00:b7:ae:b2 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:feb7:aeb2/64 scope link
       valid_lft forever preferred_lft forever

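Another update: I tried a xenial deployment and noticed that /etc/resolv.conf is configured correctly there, without any issue, but the problem still persists.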

It turned out that flannel was conflicting with my local network. Specifying the following in juju's bundle.yaml before deploying:

applications:
  flannel:
    options:
      cidr: 10.2.0.0/16

solved the problem once and for all! :)
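
A quick way to spot this kind of conflict before deploying is to compare the flannel CIDR with the networks already in use. Here is a minimal Python sketch using the standard `ipaddress` module. The `conflicts` helper and the sample local networks are made-up illustrations, since I have not listed which ranges actually collided.

```python
import ipaddress


def conflicts(candidate_cidr, networks_in_use):
    """Return the networks in networks_in_use that overlap candidate_cidr."""
    candidate = ipaddress.ip_network(candidate_cidr)
    return [
        cidr
        for cidr in networks_in_use
        if candidate.overlaps(ipaddress.ip_network(cidr))
    ]


if __name__ == "__main__":
    # Hypothetical local networks; replace with the ranges on your own hosts.
    local_networks = ["10.1.0.0/16", "192.168.1.0/24"]
    for cidr in ("10.1.0.0/16", "10.2.0.0/16"):
        print(cidr, "conflicts with", conflicts(cidr, local_networks))
```

Any CIDR for which the helper returns an empty list against your real host and LAN ranges should be a safe value for the `cidr` option.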
