Kubernetes HPA shows wrong metrics?

I created a test GKE cluster on Google Cloud. It has 3 nodes, each with 2 vCPUs and 8 GB of RAM. I deployed two Java applications on it.

Here is the YAML file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapi
spec:
  selector:
    matchLabels:
      app: myapi
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: myapi
    spec:
      containers:
        - image: eu.gcr.io/myproject/my-api:latest
          name: myapi
          imagePullPolicy: Always
          ports:
            - containerPort: 8080
              name: myapi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myfrontend
spec:
  selector:
    matchLabels:
      app: myfrontend
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: myfrontend
    spec:
      containers:
        - image: eu.gcr.io/myproject/my-frontend:latest
          name: myfrontend
          imagePullPolicy: Always
          ports:
            - containerPort: 8080
              name: myfrontend
---

Then I wanted to add an HPA for each deployment, with the following details:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: myfrontend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myfrontend
  minReplicas: 2
  maxReplicas: 5
  targetCPUUtilizationPercentage: 50
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: myapi
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapi
  minReplicas: 2
  maxReplicas: 4
  targetCPUUtilizationPercentage: 80
---

If I run kubectl top pods, it shows some really strange metrics:

NAME                          CPU(cores)   MEMORY(bytes)
myapi-6fcdb94fd9-m5sh7        194m         1074Mi
myapi-6fcdb94fd9-sptbb        193m         1066Mi
myapi-6fcdb94fd9-x6kmf        200m         1108Mi
myapi-6fcdb94fd9-zzwmq        203m         1074Mi
myfrontend-788d48f456-7hxvd   0m           111Mi
myfrontend-788d48f456-hlfrn   0m           113Mi

HPA info:

NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
myapi        Deployment/myapi        196%/80%   2         4         4          32m
myfrontend   Deployment/myfrontend   0%/50%     2         5         2          32m

But if I check uptime on one of the nodes, it shows a noticeably lower value:

[myapi@myapi-6fcdb94fd9-sptbb /opt/]$ uptime
09:49:58 up 47 min,  0 users,  load average: 0.48, 0.64, 1.23

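Any idea why it shows something completely different? Why does the HPA report about 200% current CPU utilization? Because of this, it runs the maximum number of replicas even when idle. Any ideas?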

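The HPA's targetCPUUtilizationPercentage is a percentage of the CPU requests of the target pods' containers. If you don't specify any CPU requests in your pod spec, the HPA cannot do its calculation.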

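In your case, the HPA seems to assume a CPU request of 100m (or perhaps you have a LimitRange that sets the default CPU request to 100m). Your pods currently use about 200m each, which is why the HPA displays a utilization of about 200%.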
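The arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, the usage figures are the myapi values from the kubectl top output in the question, and the 100m request is the assumption made above, not something confirmed by the cluster.

```python
def cpu_utilization_percent(usage_millicores, request_millicores):
    """Average CPU utilization across pods, as a percentage of the CPU request.

    usage_millicores: per-pod CPU usage in millicores (e.g. 194 for "194m").
    request_millicores: the CPU request each pod is assumed to have.
    """
    total_usage = sum(usage_millicores)
    total_request = request_millicores * len(usage_millicores)
    return 100 * total_usage / total_request


# myapi usage values from `kubectl top pods` in the question.
usage = [194, 193, 200, 203]

# Assumption: an effective CPU request of 100m per pod.
print(cpu_utilization_percent(usage, 100))  # 197.5
```

The result is close to the 196% the HPA reported; the small difference is expected because the two readings were taken at different moments.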

To set up the HPA correctly, you need to specify CPU requests for your pods. Something like:

      containers:
        - image: eu.gcr.io/myproject/my-api:latest
          name: myapi
          imagePullPolicy: Always
          ports:
            - containerPort: 8080
              name: myapi
          resources:
            requests:
              cpu: 500m

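Or whatever value your pods actually need. If you then set targetCPUUtilizationPercentage to 80, the HPA will trigger a scale-up once average usage exceeds 400m, because 80% of 500m is 400m.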
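As a rough sketch of how the request changes the outcome, the Python below applies the scaling rule given in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), with the documented default tolerance of 0.1. The function names are hypothetical, the utilization figures are derived from the roughly 197.5m average usage in the question, and the sketch ignores details such as stabilization windows and pods that are not ready.

```python
import math


def scale_up_threshold_millicores(request_millicores, target_percent):
    """CPU usage per pod (in millicores) that corresponds to the HPA target."""
    return request_millicores * target_percent / 100


def desired_replicas(current_replicas, current_percent, target_percent,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified HPA rule: ceil(current * ratio), clamped to [min, max]."""
    ratio = current_percent / target_percent
    # Within the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, wanted))


# 80% of a 500m request is the 400m threshold mentioned above.
print(scale_up_threshold_millicores(500, 80))  # 400.0

# Assumed 100m request: about 197.5% utilization keeps myapi at maxReplicas.
print(desired_replicas(4, 197.5, 80, min_replicas=2, max_replicas=4))  # 4

# With a 500m request the same usage is about 39.5%, so it scales down to 2.
print(desired_replicas(4, 39.5, 80, min_replicas=2, max_replicas=4))  # 2
```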

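Besides that, you are using an older version of the HorizontalPodAutoscaler API:

Your version: v1
Newest version: v2beta2 (on Kubernetes 1.23 and later, the stable autoscaling/v2 API supersedes it)

With v2beta2 the spec looks a bit different. Something like: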

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: myapi
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapi
  minReplicas: 2
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80

See the examples.

However, the CPU utilization mechanism described above still applies.
