Why does a failed startupProbe not kill the Pod, but let it keep running?



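I created a startup probe that is designed to always fail. It should cause the pod to be killed and restarted, but it doesn't. I see a single event for the startup probe failing (and no events after it), yet the pod shows as 1/1 Running. And when I run my Helm test, it passes!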

I guarantee the failure by giving the startup probe check an invalid username and password.

Using K8s version: 1.19.4

When I check the events, I get:

4m44s       Normal    SuccessfulCreate    replicaset/mysqlpod-5957645967   Created pod: mysqlpod-5957645967-fj95t
4m44s       Normal    ScalingReplicaSet   deployment/mysqlpod              Scaled up replica set mysqlpod-5957645967 to 1
4m44s       Normal    Scheduled           pod/mysqlpod-5957645967-fj95t    Successfully assigned data-layer/mysqlpod-5957645967-fj95t to minikube
4m43s       Normal    Created             pod/mysqlpod-5957645967-fj95t    Created container mysql
4m43s       Normal    Pulled              pod/mysqlpod-5957645967-fj95t    Container image "mysql:5.6" already present on machine
4m43s       Normal    Started             pod/mysqlpod-5957645967-fj95t    Started container mysql
4m41s       Warning   Unhealthy           pod/mysqlpod-5957645967-fj95t    Startup probe failed: Warning: Using a password on the command line interface can be insecure.
mysqladmin: connect to server at 'localhost' failed
error: 'Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)'
Check that mysqld is running and that the socket: '/var/run/mysqld/mysqld.sock' exists!

Checking the pod (using --watch), I see:

NAME                            READY   STATUS    RESTARTS   AGE
mysql-db-app-5957645967-fj95t   0/1     Running   0          7m18s
mysql-db-app-5957645967-fj95t   1/1     Running   0          7m43s

Note that there are 0 restarts.

My Deployment has:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "mysqlapp.name" . }}
  namespace: {{ quote .Values.metadata.namespace }}
spec:
  replicas: {{ .Values.deploymentSpecs.replicas }}
  selector:
    matchLabels:
      {{- include "mysqlapp.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "mysqlapp.selectorLabels" . | nindent 8 }}
    spec:
      containers:
        - image: "{{ .Values.image.name }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          name: {{ .Values.image.name }}
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: db-password
          ports:
            - containerPort: {{ .Values.ports.containerPort }}
              name: {{ .Values.image.name }}
          startupProbe:
            exec:
              command:
                - /bin/sh
                - -c
                - mysqladmin ping -u wrong -pwrong
            periodSeconds: {{ .Values.startupProbe.periodSeconds }}
            timeoutSeconds: {{ .Values.startupProbe.timeoutSeconds }}
            successThreshold: {{ .Values.startupProbe.successThreshold }}
            failureThreshold: {{ .Values.startupProbe.failureThreshold }}

Note the - mysqladmin ping -u wrong -pwrong line above.

Values.yaml:

metadata:
  namespace: data-layer
  myprop: value
deploymentSpecs:
  replicas: 1
  labels:
    app: db-service
image:
  name: mysql
  pullPolicy: IfNotPresent
  tag: "5.6"
ports:
  containerPort: 3306
startupProbe:
  periodSeconds: 10
  timeoutSeconds: 2
  successThreshold: 1
  failureThreshold: 5

Even after waiting 5 minutes, I can still run the test (which uses a MySQL client to reach the DB), and it works! Why doesn't this fail?

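It doesn't fail because, as it turns out, the ping command returns exit status 0 even when the user/password is wrong, as long as it can reach the server. This also explains the single Unhealthy event: the first probe ran before mysqld was listening on its socket, so it returned 1; once the server was up, the probe returned 0 and the startup probe was treated as successful.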

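MySQL ping command

Check whether the server is available. The return status from mysqladmin is 0 if the server is running, 1 if it is not. This is 0 even in case of an error such as Access denied, because this means that the server is running but refused the connection, which is different from the server not running.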

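To force a failure and a restart, point the probe at a host that does not exist, so that the server is unreachable and ping returns 1: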

mysqladmin ping -u root -p${MYSQL_ROOT_PASSWORD} --host fake
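If the goal is a probe that fails on bad credentials rather than only on an unreachable server, one option is to run a real query with the mysql client, which exits non-zero on Access denied as well as on connection errors. The fragment below is a minimal sketch, not a tested chart change: it assumes the mysql client is on the PATH inside the container (as in the official mysql image) and reuses the MYSQL_ROOT_PASSWORD variable and the probe timings from the question.

```yaml
# Sketch only: swaps mysqladmin ping for a real query so that
# wrong credentials make the probe fail.
startupProbe:
  exec:
    command:
      - /bin/sh
      - -c
      # mysql exits non-zero on "Access denied" and on connection errors
      - mysql -u root -p"${MYSQL_ROOT_PASSWORD}" -e 'SELECT 1'
  periodSeconds: 10      # values taken from the question's values.yaml
  timeoutSeconds: 2
  successThreshold: 1
  failureThreshold: 5
```

With these timings, a probe that keeps failing lets the container run for roughly failureThreshold × periodSeconds = 5 × 10 = 50 seconds before the kubelet kills it and the restart policy applies.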
