Pod is no longer deployable after upgrading to Kubernetes 1.21 due to anti-affinity rules



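We upgraded our Kubernetes cluster (running on GKE) from version 1.19 to 1.21, and since then we have been unable to roll out our deployment. The relevant parts of the deployment are defined as follows: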

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
  labels:
    name: my-deployment
spec:
  replicas: 2
  revisionHistoryLimit: 10
  strategy:
    type: "RollingUpdate"
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
  selector:
    matchLabels:
      name: "my-deployment"
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: name
                operator: In
                values:
                - my-deployment
                - my-other-deployment
            topologyKey: "kubernetes.io/hostname"
      nodeSelector:
        cloud.google.com/gke-nodepool: somenodepool
      ...

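We are running a 5-node cluster, and "my-other-deployment" has only one pod replica. So before the rollout starts, there should be two nodes available for scheduling the new "my-deployment" pod. This worked fine for years, but after upgrading the servers to v1.21.10-gke.2000 the rollout now fails: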

Events:
Type     Reason             Age                From                Message
----     ------             ----               ----                -------
Warning  FailedScheduling   50s (x2 over 52s)  default-scheduler   0/5 nodes are available: 1 Insufficient cpu, 1 node(s) didn't satisfy existing pods anti-affinity rules, 3 node(s) didn't match pod anti-affinity rules, 4 node(s) didn't match pod affinity/anti-affinity rules.
Normal   NotTriggerScaleUp  50s                cluster-autoscaler  pod didn't trigger scale-up:
Normal   Scheduled          20s                default-scheduler   Successfully assigned default/my-deployment-7f66984b9f-bqs8l to gke-v1-21-10-gke-2000-n1-standar-9b2c965a-lz4j
Normal   Pulled             19s                kubelet             Container image "somerepo/something/my-deployment:589" already present on machine
Normal   Created            19s                kubelet             Created container my-deployment
Normal   Started            19s                kubelet             Started container my-deployment

What could be the reason for this, and how can we fix it?

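I'm not aware of any changes to (anti-)affinity between 1.19 and 1.21. Maybe check:

  • Is there another deployment with the same name label that triggers the anti-affinity rule?
  • Is the node pool name correct?
  • Are all nodes in the node pool schedulable?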
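
The last two checks in the list above can be scripted. Here is a minimal sketch, assuming the node list was saved with `kubectl get nodes -o json`. The function name `schedulable_nodes_in_pool`, the node names, and the second pool name in the sample are made up for illustration.

```python
import json


def schedulable_nodes_in_pool(node_list, pool):
    """Return the names of nodes in the given GKE node pool that are not cordoned."""
    names = []
    for node in node_list.get("items", []):
        labels = node["metadata"].get("labels", {})
        if labels.get("cloud.google.com/gke-nodepool") != pool:
            continue
        # spec.unschedulable is only present (and true) on cordoned nodes.
        if node.get("spec", {}).get("unschedulable", False):
            continue
        names.append(node["metadata"]["name"])
    return names


# Hypothetical sample shaped like the output of `kubectl get nodes -o json`.
sample = json.loads("""
{
  "items": [
    {"metadata": {"name": "node-a",
                  "labels": {"cloud.google.com/gke-nodepool": "somenodepool"}},
     "spec": {}},
    {"metadata": {"name": "node-b",
                  "labels": {"cloud.google.com/gke-nodepool": "somenodepool"}},
     "spec": {"unschedulable": true}},
    {"metadata": {"name": "node-c",
                  "labels": {"cloud.google.com/gke-nodepool": "otherpool"}},
     "spec": {}}
  ]
}
""")

print(schedulable_nodes_in_pool(sample, "somenodepool"))  # ['node-a']
```

An empty result for the pool named in `nodeSelector` would point to a wrong pool name or to cordoned nodes.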

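The problem was that the remaining nodes did not have enough CPU available to satisfy the pod's CPU resource request. The way this is enforced may have changed in 1.20 or 1.21, since it was not a problem before.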
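
To illustrate how anti-affinity and CPU requests combine, here is a rough sketch of the filtering logic. It is a simplified model and not the scheduler's actual implementation. The node names, free-CPU figures, and the 500m request are invented, and the sample does not reproduce the exact per-reason counts from the event message above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu_millicores: int
    # Values of the "name" label of the pods already running on this node.
    pod_names: list = field(default_factory=list)


def filter_nodes(nodes, blocked_names, cpu_request_millicores):
    """Map each node name to a rejection reason, or None if the pod would fit."""
    result = {}
    for node in nodes:
        if any(p in blocked_names for p in node.pod_names):
            result[node.name] = "anti-affinity"
        elif node.free_cpu_millicores < cpu_request_millicores:
            result[node.name] = "insufficient cpu"
        else:
            result[node.name] = None
    return result


# Hypothetical 5-node cluster: two old my-deployment pods, one
# my-other-deployment pod, and two nodes with little CPU left.
nodes = [
    Node("n1", 1000, ["my-deployment"]),
    Node("n2", 1000, ["my-deployment"]),
    Node("n3", 1000, ["my-other-deployment"]),
    Node("n4", 200, ["some-other-app"]),
    Node("n5", 300, []),
]
blocked = {"my-deployment", "my-other-deployment"}

result = filter_nodes(nodes, blocked, cpu_request_millicores=500)
schedulable = [name for name, reason in result.items() if reason is None]
print(result)
print(schedulable)  # [] -> no node passes both filters
```

In this model, the two nodes that look "free" from an anti-affinity point of view are still rejected because of CPU. Lowering the pod's CPU request or freeing capacity on those nodes would make them eligible again.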
