I am trying to deploy an application from my personal Docker registry to a pod in Azure AKS. I have a Python application that only logs some output:
import sys
import time
import logging

# Send log records to stdout so that docker/kubectl logs can pick them up.
logger = logging.getLogger('main')
logger.setLevel(logging.INFO)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
logger.addHandler(handler)


def main():
    logger.info('This is test')
    time.sleep(5)


# Run forever: log, sleep, and report any unexpected exception.
while True:
    try:
        main()
    except Exception:
        logger.critical('Something critical.', exc_info=1)
    logger.info('Sleep for 5 seconds')
    time.sleep(5)
Here is my Dockerfile:
FROM python:3.7-alpine
RUN apk update && apk upgrade
ARG APP_DIR=/app
RUN mkdir -p ${APP_DIR}
WORKDIR ${APP_DIR}
COPY requirements.txt .
RUN \
    apk add --no-cache --virtual .build-deps gcc python3-dev musl-dev linux-headers && \
    python3 -m pip install -r requirements.txt --no-cache-dir && \
    apk --purge del .build-deps
COPY app .
ENTRYPOINT [ "python", "-u", "run.py" ]
I can run the container on my local machine. Here are some of the logs:
docker logs -tf my-container
2020-02-07T10:26:57.939062754Z 2020-02-07 10:26:57,938 - main - INFO - This is test
2020-02-07T10:27:02.944500969Z 2020-02-07 10:27:02,943 - main - INFO - Sleep for 5 seconds
2020-02-07T10:27:07.948643749Z 2020-02-07 10:27:07,948 - main - INFO - This is test
2020-02-07T10:27:12.953683767Z 2020-02-07 10:27:12,953 - main - INFO - Sleep for 5 seconds
2020-02-07T10:27:17.955954057Z 2020-02-07 10:27:17,955 - main - INFO - This is test
2020-02-07T10:27:22.960453835Z 2020-02-07 10:27:22,959 - main - INFO - Sleep for 5 seconds
2020-02-07T10:27:27.964402790Z 2020-02-07 10:27:27,963 - main - INFO - This is test
2020-02-07T10:27:32.968647112Z 2020-02-07 10:27:32,967 - main - INFO - Sleep for 5 seconds
I am trying to deploy the pod with kubectl apply -f onepod.yaml, using this YAML file:
apiVersion: v1
kind: Pod
metadata:
  name: my-container
  labels:
    platform: xxx
    event: yyy
    protocol: zzz
spec:
  imagePullSecrets:
  - name: myregistry
  containers:
  - name: my-container
    image: mypersonalregistry/my-container:test
The pod is created, but it stays in CrashLoopBackOff status and the kubectl logs command shows no output at all. I tried kubectl describe pod, but there is nothing useful in the events:
Name:         my-container
Namespace:    default
Priority:     0
Node:         aks-agentpool-56095163-vmss000000/10.240.0.4
Start Time:   Fri, 07 Feb 2020 11:41:48 +0100
Labels:       event=yyy
              platform=xxx
              protocol=zzz
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"labels":{"event":"yyy","platform":"xxx","protocol":"zzz"},"name":"my-container...
Status:       Running
IP:           10.244.1.33
IPs:          <none>
Containers:
  my-container:
    Container ID:   docker://c497674f86deadca2ef874f8a94361e26c770314e9cff1729bf20b5943d1a700
    Image:          mypersonalregistry/my-container:test
    Image ID:       docker-pullable://mypersonalregistry/my-container@sha256:c4208f42fea9a99dcb3b5ad8b53bac5e39bc54b8d89a577f85fec1a94535bc39
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Fri, 07 Feb 2020 12:28:10 +0100
      Finished:     Fri, 07 Feb 2020 12:28:10 +0100
    Ready:          False
    Restart Count:  14
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-lv75n (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  default-token-lv75n:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-lv75n
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                    From                                        Message
  ----     ------     ----                   ----                                        -------
  Normal   Scheduled  49m                    default-scheduler                           Successfully assigned default/my-container to aks-agentpool-56095163-vmss000000
  Normal   Pulled     48m (x5 over 49m)      kubelet, aks-agentpool-56095163-vmss000000  Container image "mypersonalregistry/my-container:test" already present on machine
  Normal   Created    48m (x5 over 49m)      kubelet, aks-agentpool-56095163-vmss000000  Created container my-container
  Normal   Started    48m (x5 over 49m)      kubelet, aks-agentpool-56095163-vmss000000  Started container my-container
  Warning  BackOff    4m55s (x210 over 49m)  kubelet, aks-agentpool-56095163-vmss000000  Back-off restarting failed container
How can I find out why it works on my computer but not in the Kubernetes cluster?
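So the problem was with pulling the latest version of the image. More on this:
The default pull policy is IfNotPresent, which causes the kubelet to skip pulling an image if it already exists.
So the cluster kept running the first version of my-container with the tag test and never downloaded the new version, even though it was in my registry.
The solution is to add this line to the container entry in the YAML file:
imagePullPolicy: Always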
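To make the defaulting rule concrete, here is a minimal Python sketch of how Kubernetes chooses a pull policy when imagePullPolicy is omitted, as I understand it from the Kubernetes documentation. The function name default_image_pull_policy is hypothetical and written only for illustration; it is not part of any Kubernetes client library.

```python
def default_image_pull_policy(image: str) -> str:
    """Approximate the policy applied when imagePullPolicy is omitted.

    Hypothetical helper for illustration; it mirrors the documented
    defaulting rules and is not a Kubernetes API.
    """
    # Look only at the last path segment, so that a registry port such as
    # "localhost:5000/app" is not mistaken for a tag.
    name = image.rsplit("/", 1)[-1]
    if "@" in name:
        # Image pinned by digest: defaults to IfNotPresent.
        return "IfNotPresent"
    tag = name.split(":", 1)[1] if ":" in name else ""
    if tag in ("", "latest"):
        # No tag, or the :latest tag: defaults to Always.
        return "Always"
    # Any other tag: defaults to IfNotPresent.
    return "IfNotPresent"


print(default_image_pull_policy("mypersonalregistry/my-container:test"))
# prints "IfNotPresent"
```

This is why re-pushing a changed image under the same test tag had no effect here: the node already had an image with that tag, so nothing was pulled again.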
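What you are seeing is 100% expected. The application sleeps for 10 seconds and exits. Kubernetes expects a pod to run indefinitely. With the default restartPolicy of Always, if the pod's container exits for any reason (even with exit code 0), Kubernetes will try to restart it. If it exits many times in a row, Kubernetes assumes the pod is not working properly and changes its status to CrashLoopBackOff.
You can try changing the code to run in an infinite loop, and you will see that Kubernetes is happy with that.
If you want a task that runs to completion, you probably want to use Kubernetes Jobs. Kubernetes expects Jobs to finish with exit code 0.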
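The restart behaviour behind CrashLoopBackOff can be sketched in a few lines of Python. This assumes the back-off described in the Kubernetes documentation (10s, 20s, 40s, and so on, capped at five minutes); the function name restart_delays and its parameters are hypothetical, and real timings in a cluster will differ somewhat.

```python
def restart_delays(restarts: int, base: int = 10, cap: int = 300) -> list:
    """Return the approximate wait in seconds before each restart.

    Hypothetical helper: the delay doubles after every restart and is
    capped at `cap` seconds (five minutes by default).
    """
    return [min(base * 2 ** i, cap) for i in range(restarts)]


print(restart_delays(7))
# prints [10, 20, 40, 80, 160, 300, 300]
```

After the first handful of restarts the container spends most of its time waiting at the cap, which is why the pod shows Waiting / CrashLoopBackOff almost every time you look at it.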