I am trying to create a Horizontal Pod Autoscaler after installing Kubernetes with kubeadm.
The main symptom is that kubectl get hpa reports the CPU metric in the TARGETS column as <unknown>:
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
fibonacci Deployment/fibonacci <unknown> / 50% 1 3 1 1h
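On further investigation, it appears that the HPA is trying to get the CPU metric from Heapster, but in my configuration the CPU metric is provided by cAdvisor.
I am making this assumption based on the output of kubectl describe hpa fibonacci: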
Name: fibonacci
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Sun, 14 May 2017 18:08:53 +0000
Reference: Deployment/fibonacci
Metrics: ( current / target )
resource cpu on pods (as a percentage of request): <unknown> / 50%
Min replicas: 1
Max replicas: 3
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1h 3s 148 horizontal-pod-autoscaler Warning FailedGetResourceMetric unable to get metrics for resource cpu: no metrics returned from heapster
1h 3s 148 horizontal-pod-autoscaler Warning FailedComputeMetricsReplicas failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from heapster
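Why does the HPA try to get this metric from Heapster instead of cAdvisor?
How can I fix this?
Please find below my deployment, along with the contents of /var/log/container/kube-controller-manager.log and the output of kubectl get pods --namespace=kube-system and kubectl describe pods.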
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: fibonacci
  labels:
    app: fibonacci
spec:
  template:
    metadata:
      labels:
        app: fibonacci
    spec:
      containers:
      - name: fibonacci
        image: oghma/fibonacci
        ports:
          - containerPort: 8088
        resources:
          requests:
            memory: "64Mi"
            cpu: "75m"
          limits:
            memory: "128Mi"
            cpu: "100m"
---
kind: Service
apiVersion: v1
metadata:
  name: fibonacci
spec:
  selector:
    app: fibonacci
  ports:
    - protocol: TCP
      port: 8088
      targetPort: 8088
  externalIPs:
    - 192.168.66.103
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: fibonacci
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: fibonacci
  minReplicas: 1
  maxReplicas: 3
  targetCPUUtilizationPercentage: 50
$ kubectl describe pods
Name: fibonacci-1503002127-3k755
Namespace: default
Node: kubernetesnode1/192.168.66.101
Start Time: Sun, 14 May 2017 17:47:08 +0000
Labels: app=fibonacci
pod-template-hash=1503002127
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"fibonacci-1503002127","uid":"59ea64bb-38cd-11e7-b345-fa163edb1ca...
Status: Running
IP: 192.168.202.1
Controllers: ReplicaSet/fibonacci-1503002127
Containers:
fibonacci:
Container ID: docker://315375c6a978fd689f4ba61919c15f15035deb9139982844cefcd46092fbec14
Image: oghma/fibonacci
Image ID: docker://sha256:26f9b6b2c0073c766b472ec476fbcd2599969b6e5e7f564c3c0a03f8355ba9f6
Port: 8088/TCP
State: Running
Started: Sun, 14 May 2017 17:47:16 +0000
Ready: True
Restart Count: 0
Limits:
cpu: 100m
memory: 128Mi
Requests:
cpu: 75m
memory: 64Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-45kp8 (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
default-token-45kp8:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-45kp8
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.alpha.kubernetes.io/notReady=:Exists:NoExecute for 300s
node.alpha.kubernetes.io/unreachable=:Exists:NoExecute for 300s
Events: <none>
$ kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
calico-etcd-k1g53 1/1 Running 0 2h
calico-node-6n4gp 2/2 Running 1 2h
calico-node-nhmz7 2/2 Running 0 2h
calico-policy-controller-1324707180-65m78 1/1 Running 0 2h
etcd-kubernetesmaster 1/1 Running 0 2h
heapster-1428305041-zjzd1 1/1 Running 0 1h
kube-apiserver-kubernetesmaster 1/1 Running 0 2h
kube-controller-manager-kubernetesmaster 1/1 Running 0 2h
kube-dns-3913472980-gbg5h 3/3 Running 0 2h
kube-proxy-1dt3c 1/1 Running 0 2h
kube-proxy-tfhr9 1/1 Running 0 2h
kube-scheduler-kubernetesmaster 1/1 Running 0 2h
monitoring-grafana-3975459543-9q189 1/1 Running 0 1h
monitoring-influxdb-3480804314-7bvr3 1/1 Running 0 1h
$ cat /var/log/container/kube-controller-manager.log
"log":"I0514 17:47:08.631314 1 event.go:217] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"default", Name:"fibonacci", UID:"59e980d9-38cd-11e7-b345-fa163edb1ca6", APIVersion:"extensions", ResourceVersion:"1303", FieldPath:""}): type: 'Normal' reason: 'ScalingReplicaSet' Scaled up replica set fibonacci-1503002127 to 1n","stream":"stderr","time":"2017-05-14T17:47:08.63177467Z"}
{"log":"I0514 17:47:08.650662 1 event.go:217] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"default", Name:"fibonacci-1503002127", UID:"59ea64bb-38cd-11e7-b345-fa163edb1ca6", APIVersion:"extensions", ResourceVersion:"1304", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: fibonacci-1503002127-3k755n","stream":"stderr","time":"2017-05-14T17:47:08.650826398Z"}
{"log":"E0514 17:49:00.873703 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:49:00.874034952Z"}
{"log":"E0514 17:49:30.884078 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:49:30.884546461Z"}
{"log":"E0514 17:50:00.896563 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:50:00.89688734Z"}
{"log":"E0514 17:50:30.906293 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:50:30.906825794Z"}
{"log":"E0514 17:51:00.915996 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:51:00.916348218Z"}
{"log":"E0514 17:51:30.926043 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:51:30.926367623Z"}
{"log":"E0514 17:52:00.936574 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:52:00.936903072Z"}
{"log":"E0514 17:52:30.944724 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:52:30.945120508Z"}
{"log":"E0514 17:53:00.954785 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:53:00.955126309Z"}
{"log":"E0514 17:53:30.970454 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:53:30.972996568Z"}
{"log":"E0514 17:54:00.980735 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:54:00.981098832Z"}
{"log":"E0514 17:54:30.993176 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:54:30.993538841Z"}
{"log":"E0514 17:55:01.002941 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:55:01.003265908Z"}
{"log":"W0514 17:55:06.511756 1 reflector.go:323] k8s.io/kubernetes/pkg/controller/garbagecollector/graph_builder.go:192: watch of u003cnilu003e ended with: etcdserver: mvcc: required revision has been compactedn","stream":"stderr","time":"2017-05-14T17:55:06.511957851Z"}
{"log":"E0514 17:55:31.013415 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:55:31.013776243Z"}
{"log":"E0514 17:56:01.024507 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:56:01.0248332Z"}
{"log":"E0514 17:56:31.036191 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:56:31.036606698Z"}
{"log":"E0514 17:57:01.049277 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:57:01.049616359Z"}
{"log":"E0514 17:57:31.064104 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:57:31.064489485Z"}
{"log":"E0514 17:58:01.073988 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:58:01.074339488Z"}
{"log":"E0514 17:58:31.084511 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:58:31.084839352Z"}
{"log":"E0514 17:59:01.096507 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)n","stream":"stderr","time":"2017-05-14T17:59:01.096896254Z"}
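There is an option to enable autoscaling on the cluster pool; make sure to turn it on first.
Then apply your HPA, and don't forget to set CPU and memory requests and limits on your K8s controllers.
One thing to note: if your Pod has more than one container, you should specify CPU and memory requests and limits for each container.
If you have multiple containers in the deployment, make sure you have specified resource limits in all of them.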
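To make the per-container point concrete, here is a minimal Python sketch that lists the containers in a pod spec that declare no CPU resources. The function name and the `sidecar` container are hypothetical, and the spec is a plain dict standing in for a parsed manifest. It treats a CPU limit as sufficient, on the assumption that the API server defaults a missing request to the limit.

```python
def containers_missing_cpu(pod_spec):
    """Return the names of containers that declare neither a CPU request nor a CPU limit."""
    missing = []
    for container in pod_spec.get("containers", []):
        resources = container.get("resources") or {}
        requests = resources.get("requests") or {}
        limits = resources.get("limits") or {}
        # Assumption: a CPU limit without a request is acceptable, because
        # the request is then defaulted to the limit.
        if "cpu" not in requests and "cpu" not in limits:
            missing.append(container["name"])
    return missing


# Hypothetical two-container pod spec: the sidecar sets no resources at all.
pod_spec = {
    "containers": [
        {
            "name": "fibonacci",
            "resources": {
                "requests": {"cpu": "75m", "memory": "64Mi"},
                "limits": {"cpu": "100m", "memory": "128Mi"},
            },
        },
        {"name": "sidecar"},
    ]
}

print(containers_missing_cpu(pod_spec))  # ['sidecar']
```

If the list is non-empty, the HPA cannot compute CPU utilization as a percentage of request for that pod.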
I have seen this in other applications as well: there seems to be a bug in the HPA API.
A workaround may be to use a ReplicationController scaleRef instead:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: fibonacci
  namespace: ....
spec:
  scaleRef:
    kind: ReplicationController
    name: fibonacci
    subresource: scale
  minReplicas: 1
  maxReplicas: 3
  targetCPUUtilizationPercentage: 50
Not tested, so scaleRef may need some editing (you used scaleTargetRef).
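You can remove the limits from your deployment and try again. In my deployment I used only resource requests, and it worked. Once you see the Horizontal Pod Autoscaler (HPA) working, you can add limits back later. This discussion explains that requests alone are enough for the HPA.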
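The reason requests are enough is that CPU utilization is measured as a percentage of the request, not of the limit. Here is a rough Python sketch of that arithmetic, using the documented formula desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization). The function names are my own, the usage figures are made up, and the sketch ignores the controller's tolerance band and readiness handling.

```python
import math


def parse_millicores(cpu):
    """Convert a CPU quantity such as "75m" or "0.5" to millicores."""
    if cpu.endswith("m"):
        return int(cpu[:-1])
    return round(float(cpu) * 1000)


def desired_replicas(current_replicas, usage_millicores, cpu_request,
                     target_percent, min_replicas, max_replicas):
    """Estimate the replica count an HPA would ask for (simplified)."""
    # Utilization is total usage divided by total requested CPU across pods.
    total_request = parse_millicores(cpu_request) * len(usage_millicores)
    utilization = 100.0 * sum(usage_millicores) / total_request
    desired = math.ceil(current_replicas * utilization / target_percent)
    # Clamp to the minReplicas/maxReplicas bounds of the HPA.
    return max(min_replicas, min(max_replicas, desired))


# One pod with a 75m request and a 50% target, as in the question's manifest.
# The usage values (90m and 30m) are invented for illustration.
print(desired_replicas(1, [90], "75m", 50, 1, 3))  # 3
print(desired_replicas(1, [30], "75m", 50, 1, 3))  # 1
```

Limits never appear in the calculation, which matches what I saw: the HPA worked with requests only.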
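TL;DR: If you are using AWS EKS and specifying .spec.templates.spec.containers.<resources|limits> doesn't work, the problem may be that you haven't installed the Kubernetes Metrics Server.
I ran into a problem with the Kubernetes HPA while using AWS EKS. While looking for a solution, I came across the following command and decided to run it to see whether I had the Metrics Server installed: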
kubectl get pods -n kube-system
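I did not have it installed. It turns out AWS has this documentation stating that the Metrics Server is not installed by default on EKS clusters. So I followed the steps the documentation recommends to install it: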
- Deploy the Metrics Server with the following command:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
- Verify that the metrics-server deployment is running the desired number of pods with the following command.
kubectl get deployment metrics-server -n kube-system
Output
NAME READY UP-TO-DATE AVAILABLE AGE
metrics-server 1/1 1
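That was the solution for me. Once the Metrics Server was on my cluster, I was able to create HPAs that could get usage information about their target pods/resources.
PS: You can also run kubectl get pods -n kube-system again to confirm the installation.
PPS: HPA = Horizontal Pod Autoscaler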
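If you would rather not eyeball the pod list, the check can be scripted. Below is a small Python sketch that scans the text printed by kubectl get pods -n kube-system for a metrics provider. The function name and the name prefixes are my own assumptions, and the sample rows are copied from the listing in the question, which runs Heapster rather than the Metrics Server.

```python
def metrics_pods(kubectl_output, prefixes=("metrics-server", "heapster")):
    """Return (pod name, status) pairs for pods that look like a metrics provider."""
    found = []
    for line in kubectl_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Columns are NAME, READY, STATUS, RESTARTS, AGE.
        if len(fields) >= 3 and fields[0].startswith(prefixes):
            found.append((fields[0], fields[2]))
    return found


sample = """NAME READY STATUS RESTARTS AGE
kube-proxy-1dt3c 1/1 Running 0 2h
heapster-1428305041-zjzd1 1/1 Running 0 1h
kube-scheduler-kubernetesmaster 1/1 Running 0 2h
"""

print(metrics_pods(sample))  # [('heapster-1428305041-zjzd1', 'Running')]
```

An empty list means no pod matched, which is the situation I was in before installing the Metrics Server.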
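I was using GKE 1.9.x.
There is a bug that requires disabling autoscaling first and then re-enabling it. This makes the current value appear instead of <unknown>.
Try updating to the latest GKE.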
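I ran into a similar problem; I hope this helps:
- Make sure the apiVersion of the HPA is correct, because the syntax changes slightly from version to version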
- Do kubectl autoscale deploy -n --cpu-percent= --min= --max= --dry-run -o yaml
This gives you the exact HPA syntax for your cluster's API version. Modify your Helm hpa.yaml file according to the output, and that should fix the problem.
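Another way to see which HPA apiVersion your cluster serves is kubectl api-versions, which prints one group/version per line. Here is a minimal Python sketch that picks the newest autoscaling version from that output. The function name and the preference order are my own, and the sample output is invented for illustration.

```python
# Newest first; the list covers the autoscaling versions I know of.
PREFERENCE = (
    "autoscaling/v2",
    "autoscaling/v2beta2",
    "autoscaling/v2beta1",
    "autoscaling/v1",
)


def pick_hpa_api_version(api_versions_output):
    """Return the newest autoscaling apiVersion listed, or None if there is none."""
    served = set(api_versions_output.split())
    for version in PREFERENCE:
        if version in served:
            return version
    return None


# Made-up output of `kubectl api-versions` on an older cluster.
sample = "apps/v1\nautoscaling/v1\nautoscaling/v2beta1\nbatch/v1\n"

print(pick_hpa_api_version(sample))  # autoscaling/v2beta1
```

Use the returned value as the apiVersion in hpa.yaml, then compare the rest of the file against the dry-run output.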