Metrics server is currently unable to handle the request



I am new to Kubernetes and am trying to apply horizontal pod autoscaling to my existing application. From other Stack Overflow answers I learned that I need to install the metrics server, which I was able to do, but somehow it is not working and is unable to handle requests.

  • I also followed a few more suggestions, but could not resolve the issue. I would really appreciate any help here. Please let me know if you need any more details. Thanks in advance :)

Steps followed:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created

kubectl get deploy,svc -n kube-system | egrep metrics-server

deployment.apps/metrics-server   1/1     1            1           2m6s
service/metrics-server                       ClusterIP   10.32.0.32   <none>        443/TCP                        2m6s

kubectl get pods -n kube-system | grep metrics-server

metrics-server-64cf6869bd-6gx88   1/1     Running   0          2m39s

vi ana_hpa.yaml

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: ana-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: common-services-auth
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 160

k apply -f ana_hpa.yaml

horizontalpodautoscaler.autoscaling/ana-hpa created

k get hpa

NAME      REFERENCE                          TARGETS                         MINPODS   MAXPODS   REPLICAS   AGE
ana-hpa   StatefulSet/common-services-auth   <unknown>/160%, <unknown>/80%   1         10        0          4s

k describe hpa ana-hpa

Name:                                                     ana-hpa
Namespace:                                                default
Labels:                                                   <none>
Annotations:                                              <none>
CreationTimestamp:                                        Tue, 12 Apr 2022 17:01:25 +0530
Reference:                                                StatefulSet/common-services-auth
Metrics:                                                  ( current / target )
resource memory on pods  (as a percentage of request):  <unknown> / 160%
resource cpu on pods  (as a percentage of request):     <unknown> / 80%
Min replicas:                                             1
Max replicas:                                             10
StatefulSet pods:                                         3 current / 0 desired
Conditions:
Type           Status  Reason                   Message
----           ------  ------                   -------
AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
Events:
Type     Reason                        Age                  From                       Message
----     ------                        ----                 ----                       -------
Warning  FailedGetResourceMetric       38s (x8 over 2m23s)  horizontal-pod-autoscaler  failed to get cpu utilization: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
Warning  FailedComputeMetricsReplicas  38s (x8 over 2m23s)  horizontal-pod-autoscaler  invalid metrics (2 invalid out of 2), first error is: failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
Warning  FailedGetResourceMetric       23s (x9 over 2m23s)  horizontal-pod-autoscaler  failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)


kubectl get --raw /apis/metrics.k8s.io/v1beta1

Error from server (ServiceUnavailable): the server is currently unable to handle the request

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"

Error from server (ServiceUnavailable): the server is currently unable to handle the request

kubectl edit deployments.apps -n kube-system metrics-server

Added hostNetwork: true

deployment.apps/metrics-server edited

kubectl get pods -n kube-system | grep metrics-server

metrics-server-5dc6dbdb8-42hw9   1/1     Running   0          10m

k describe pod metrics-server-5dc6dbdb8-42hw9 -n kube-system

Name:                 metrics-server-5dc6dbdb8-42hw9
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 pusntyn196.apac.avaya.com/10.133.85.196
Start Time:           Tue, 12 Apr 2022 17:08:25 +0530
Labels:               k8s-app=metrics-server
pod-template-hash=5dc6dbdb8
Annotations:          <none>
Status:               Running
IP:                   10.133.85.196
IPs:
IP:           10.133.85.196
Controlled By:  ReplicaSet/metrics-server-5dc6dbdb8
Containers:
metrics-server:
Container ID:  containerd://024afb1998dce4c0bd5f4e58f996068ea37982bd501b54fda2ef8d5c1098b4f4
Image:         k8s.gcr.io/metrics-server/metrics-server:v0.6.1
Image ID:      k8s.gcr.io/metrics-server/metrics-server@sha256:5ddc6458eb95f5c70bd13fdab90cbd7d6ad1066e5b528ad1dcb28b76c5fb2f00
Port:          4443/TCP
Host Port:     4443/TCP
Args:
--cert-dir=/tmp
--secure-port=4443
--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
--kubelet-use-node-status-port
--metric-resolution=15s
State:          Running
Started:      Tue, 12 Apr 2022 17:08:26 +0530
Ready:          True
Restart Count:  0
Requests:
cpu:        100m
memory:     200Mi
Liveness:     http-get https://:https/livez delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness:    http-get https://:https/readyz delay=20s timeout=1s period=10s #success=1 #failure=3
Environment:  <none>
Mounts:
/tmp from tmp-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-g6p4g (ro)
Conditions:
Type              Status
Initialized       True
Ready             True
ContainersReady   True
PodScheduled      True
Volumes:
tmp-dir:
Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit:  <unset>
kube-api-access-g6p4g:
Type:                    Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds:  3607
ConfigMapName:           kube-root-ca.crt
ConfigMapOptional:       <nil>
DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 2s
node.kubernetes.io/unreachable:NoExecute op=Exists for 2s
Events:
Type    Reason     Age    From               Message
----    ------     ----   ----               -------
Normal  Scheduled  2m31s  default-scheduler  Successfully assigned kube-system/metrics-server-5dc6dbdb8-42hw9 to pusntyn196.apac.avaya.com
Normal  Pulled     2m32s  kubelet            Container image "k8s.gcr.io/metrics-server/metrics-server:v0.6.1" already present on machine
Normal  Created    2m31s  kubelet            Created container metrics-server
Normal  Started    2m31s  kubelet            Started container metrics-server

kubectl get --raw /apis/metrics.k8s.io/v1beta1

Error from server (ServiceUnavailable): the server is currently unable to handle the request

kubectl get pods -n kube-system | grep metrics-server

metrics-server-5dc6dbdb8-42hw9   1/1     Running   0          10m

kubectl logs -f metrics-server-5dc6dbdb8-42hw9 -n kube-system

E0412 11:43:54.684784       1 configmap_cafile_content.go:242] kube-system/extension-apiserver-authentication failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
E0412 11:44:27.001010       1 configmap_cafile_content.go:242] key failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
k logs -f metrics-server-5dc6dbdb8-42hw9 -n kube-system
I0412 11:38:26.447305       1 serving.go:342] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0412 11:38:26.899459       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0412 11:38:26.899477       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0412 11:38:26.899518       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0412 11:38:26.899545       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0412 11:38:26.899546       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0412 11:38:26.899567       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0412 11:38:26.900480       1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"
I0412 11:38:26.900811       1 secure_serving.go:266] Serving securely on [::]:4443
I0412 11:38:26.900854       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
W0412 11:38:26.900965       1 shared_informer.go:372] The sharedIndexInformer has started, run more than once is not allowed
I0412 11:38:26.999960       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0412 11:38:26.999989       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
I0412 11:38:26.999970       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
E0412 11:38:27.000087       1 configmap_cafile_content.go:242] kube-system/extension-apiserver-authentication failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
E0412 11:38:27.000118       1 configmap_cafile_content.go:242] key failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"

kubectl top nodes

Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)

kubectl top pods

Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)

Edited the metrics-server deployment yaml

Added - --kubelet-insecure-tls

k apply -f metric-server-deployment.yaml

serviceaccount/metrics-server unchanged
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader unchanged
clusterrole.rbac.authorization.k8s.io/system:metrics-server unchanged
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader unchanged
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator unchanged
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server unchanged
service/metrics-server unchanged
deployment.apps/metrics-server configured
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io unchanged

kubectl get pods -n kube-system | grep metrics-server

metrics-server-5dc6dbdb8-42hw9   1/1     Running   0          10m

kubectl top pods

Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)

Also tried adding the following to the metrics-server deployment:

command:
- /metrics-server
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP

First, execute the following command:

kubectl get apiservices

and check the Availability (status) of the kube-system/metrics-server service.
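For reference, the relevant line of that output looks roughly like this (the AVAILABLE value and age shown here are illustrative, not taken from the cluster above):

kubectl get apiservices | grep metrics

NAME                     SERVICE                      AVAILABLE                  AGE
v1beta1.metrics.k8s.io   kube-system/metrics-server   False (MissingEndpoints)   10m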

  • If the Availability is True: add hostNetwork: true to the spec of the metrics-server deployment by executing the following command:

    kubectl edit deployment -n kube-system metrics-server
    

    It should look like this:

    ...
    spec:
      hostNetwork: true
    ...
    

    Setting hostNetwork to true means that the Pod will have access to the host on which it runs.

  • If the Availability is False (MissingEndpoints):

    1. Download the metrics server:

      wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.5.0/components.yaml
      
    2. Delete the (old) metrics server:

      kubectl delete -f components.yaml  
      
    3. Edit the downloaded file and add - --kubelet-insecure-tls to the list of args:

      ...
      labels:
        k8s-app: metrics-server
      spec:
        containers:
        - args:
          - --cert-dir=/tmp
          - --secure-port=443
          - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
          - --kubelet-use-node-status-port
          - --metric-resolution=15s
          - --kubelet-insecure-tls # add this line
      ...
      
    4. Create the service again:

      kubectl apply -f components.yaml
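Whichever path applies, once the new pod is Running you can confirm the aggregated API is healthy with something like this (a quick verification sketch; the resource names are the defaults from components.yaml):

kubectl -n kube-system rollout status deployment/metrics-server

kubectl get apiservice v1beta1.metrics.k8s.io

kubectl top nodes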
      

This can be resolved easily by editing the deployment yaml file and adding hostNetwork: true after dnsPolicy: ClusterFirst

kubectl edit deployments.apps -n kube-system metrics-server

Insert:

hostNetwork: true
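If you prefer not to edit the deployment interactively, the same change can be applied with a patch; a minimal sketch, assuming the default namespace and deployment name from components.yaml:

kubectl -n kube-system patch deployment metrics-server \
  --type merge \
  -p '{"spec":{"template":{"spec":{"hostNetwork":true}}}}'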

For me, on EKS with helmfile, I had to set the following in the values.yaml of the metrics-server chart:

containerPort: 10250

When I first deployed the chart, this value defaulted to 4443, for reasons unknown to me.
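A minimal sketch of how that can be wired up in helmfile, assuming the upstream metrics-server chart (the release and repository names here are illustrative):

repositories:
  - name: metrics-server
    url: https://kubernetes-sigs.github.io/metrics-server/

releases:
  - name: metrics-server
    namespace: kube-system
    chart: metrics-server/metrics-server
    values:
      - containerPort: 10250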

See the documentation:

  • https://github.com/kubernetes-sigs/metrics-server/blob/master/charts/metrics-server/values.yaml#L62
  • https://aws.amazon.com/premiumsupport/knowledge-center/eks-metrics-server/#:~:text=confirm%20that%20your%20security%20groups

After that, kubectl top nodes and kubectl describe apiservice v1beta1.metrics.k8s.io started working.

I hope this helps for bare-metal clusters:

$ helm --repo https://kubernetes-sigs.github.io/metrics-server/ --kubeconfig=$HOME/.kube/loc-cluster.config -n kube-system --set args='{--kubelet-insecure-tls}' upgrade --install metrics-server metrics-server
$ helm --kubeconfig=$HOME/.kube/loc-cluster.config -n kube-system uninstall metrics-server

Update: I deployed the metrics-server using the same command. Perhaps you can start afresh by deleting the existing resources and running:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
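The "delete the existing resources" step could look like this, assuming the original install used the same manifest:

kubectl delete -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml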

It looks like the --kubelet-insecure-tls flag is not configured correctly in the deployment's container template. The following steps should fix it:

  1. Edit the existing deployment in the cluster with kubectl edit deployment/metrics-server -n kube-system.
  2. Add the flag to the spec.containers[].args list so that the deployment looks like this (a non-interactive patch alternative is sketched after these steps):
...
spec:
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
    - --metric-resolution=15s
    - --kubelet-insecure-tls      <======= ADD IT HERE
    image: k8s.gcr.io/metrics-server/metrics-server:v0.6.1
...
  3. Simply save your changes and let the deployment roll out the updated pods. You can run watch -n1 kubectl get deployment/metrics-server -n kube-system and wait for the UP-TO-DATE column to show 1.

Like this:

NAME             READY   UP-TO-DATE   AVAILABLE   AGE
metrics-server   1/1     1            1           16m
  4. Verify with kubectl top nodes. It will show something like:
NAME             CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
docker-desktop   222m         5%     1600Mi          41%

I just verified this to be working on a local setup. Let me know if this helps :)
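As mentioned in step 2, the flag can also be added without an interactive edit by appending it to the container args with a JSON patch; a sketch, assuming metrics-server is the first (and only) container in the pod template:

kubectl -n kube-system patch deployment metrics-server --type json \
  -p '[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}]'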

Please configure the aggregation layer carefully and correctly; you can use this link for guidance: https://kubernetes.io/docs/tasks/extend-kubernetes/configure-aggregation-layer/.

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: <name of the registration object>
spec:
  group: <API group name this extension apiserver hosts>
  version: <API version this extension apiserver hosts>
  groupPriorityMinimum: <priority this APIService for this group, see API documentation>
  versionPriority: <prioritizes ordering of this version within a group, see API documentation>
  service:
    namespace: <namespace of the extension apiserver service>
    name: <name of the extension apiserver service>
  caBundle: <pem encoded ca cert that signs the server cert used by the webhook>
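For comparison, the APIService that the stock metrics-server components.yaml registers looks roughly like this (taken from the upstream manifest; double-check against the version you installed):

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100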

It would also help to provide the output of kubectl version.

For us, on a Google Cloud GKE private cluster, we had to add a firewall rule to allow this traffic.

The use case here is an "aggregated API server", and the process of adding the correct firewall rule is described in the private clusters documentation.

For us, the port we had to allow was 6443, taken from the failure message in kubectl describe apiservice v1beta1.custom.metrics.k8s.io:

Status:
  Conditions:
    Last Transition Time:  2023-03-07T11:17:17Z
    Message:               failing or missing response from https://10.4.1.14:6443 [...]
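A sketch of the kind of rule we added (the network, source range, and tag names below are placeholders for your own values):

gcloud compute firewall-rules create allow-master-to-metrics-server \
  --network <cluster-vpc-network> \
  --direction INGRESS \
  --source-ranges <master-ipv4-cidr> \
  --target-tags <gke-node-tag> \
  --allow tcp:6443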

For me, this happened on my local k3s cluster, and I found that restarting the k3s service resolved it after 5 to 10 minutes.

sudo systemctl restart k3s
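Once it settles, you can confirm with something like:

kubectl get apiservice v1beta1.metrics.k8s.io

kubectl top nodes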
