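I have an EKS cluster with a DaemonSet that mounts an S3 bucket into all pods.
Whenever something goes wrong or the pod restarts, the mounted volume becomes inaccessible and the following error is raised: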
Transport endpoint is not connected
To fix this error, I have to unmount the volume manually and restart the DaemonSet:
umount /mnt/data-s3-fuse
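The manual recovery above can be sketched as a small shell helper that only unmounts when the path really is a dead FUSE mount. This is a minimal sketch, not something from the kube-s3 project: the function names `is_enotconn`, `is_stale_mount` and `cleanup_stale_mount` are hypothetical, and it assumes that `stat` reports the same "Transport endpoint is not connected" message shown above when run with `LC_ALL=C`.

```shell
#!/bin/sh
# Sketch: detect a dead FUSE mount and lazily unmount it.
# All function names here are illustrative, not part of kube-s3.

# Returns 0 if the given error text is the "not connected" message
# that a dead FUSE mount produces.
is_enotconn() {
    case "$1" in
        *"Transport endpoint is not connected"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Returns 0 if stat on the path fails with the "not connected" error.
# A healthy path, or one that fails for another reason, returns 1.
is_stale_mount() {
    err=$(LC_ALL=C stat "$1" 2>&1 >/dev/null) && return 1
    is_enotconn "$err"
}

# Lazily unmounts the path, but only if it looks stale.
cleanup_stale_mount() {
    if is_stale_mount "$1"; then
        umount -l "$1"
    fi
}

# Example call (needs root on the node):
#   cleanup_stale_mount /mnt/data-s3-fuse
```

Checking before unmounting keeps the helper harmless on a healthy node, so it could be run repeatedly, for example before the DaemonSet pod is restarted.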
What is a permanent fix for this problem?
My DaemonSet file:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  labels:
    app: s3-provider
  name: s3-provider
  namespace: airflow
spec:
  template:
    metadata:
      labels:
        app: s3-provider
    spec:
      containers:
      - name: s3fuse
        image: image
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh","-c","umount -f /opt/airflow/dags"]
        securityContext:
          privileged: true
          capabilities:
            add:
            - SYS_ADMIN
        # use ALL entries in the config map as environment variables
        envFrom:
        - configMapRef:
            name: s3-config
        volumeMounts:
        - name: devfuse
          mountPath: /dev/fuse
        - name: mntdatas3fs
          mountPath: /opt/airflow/dags:shared
      volumes:
      - name: devfuse
        hostPath:
          path: /dev/fuse
      - name: mntdatas3fs
        hostPath:
          path: /mnt/data-s3-fuse
My pod is:
apiVersion: v1
kind: Pod
metadata:
  name: test-pd
  namespace: airflow
spec:
  containers:
  - image: nginx
    name: s3-test-container
    securityContext:
      privileged: true
    volumeMounts:
    - name: mntdatas3fs
      mountPath: /opt/airflow/dags:shared
    livenessProbe:
      exec:
        command: ["ls", "/opt/airflow/dags"]
      failureThreshold: 3
      initialDelaySeconds: 10
      periodSeconds: 5
      successThreshold: 1
      timeoutSeconds: 1
  volumes:
  - name: mntdatas3fs
    hostPath:
      path: /mnt/data-s3-fuse
I am using the code below for the S3 Kubernetes FUSE mount:
https://github.com/freegroup/kube-s3
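OK, I think I solved it. It seems that the pod sometimes loses its connection, which results in "Transport endpoint is not connected". The workaround I found is to add an init container that tries to unmount the folder before the main container starts. That seems to fix the issue. Note that you need to mount a higher-level folder in the init container so that it has access to the mount point on the node. I will let it run and see whether the error comes back; so far it has fixed the issue once here: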
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: s3-provider
  name: s3-provider
spec:
  selector:
    matchLabels:
      app: s3-provider
  template:
    metadata:
      labels:
        app: s3-provider
    spec:
      initContainers:
      - name: init-myservice
        image: bash
        command: ['bash', '-c', 'umount -l /mnt/data-s3-fs/root ; true']
        securityContext:
          privileged: true
          capabilities:
            add:
            - SYS_ADMIN
        # use ALL entries in the config map as environment variables
        envFrom:
        - configMapRef:
            name: s3-config
        volumeMounts:
        - name: devfuse
          mountPath: /dev/fuse
        - name: mntdatas3fs-init
          mountPath: /mnt:shared
      containers:
      - name: s3fuse
        image: 963341077747.dkr.ecr.us-east-1.amazonaws.com/kube-s3:1.0
        imagePullPolicy: Always
        lifecycle:
          preStop:
            exec:
              command: ["bash", "-c", "umount -f /srv/s3-mount/root"]
        securityContext:
          privileged: true
          capabilities:
            add:
            - SYS_ADMIN
        # use ALL entries in the config map as environment variables
        envFrom:
        - configMapRef:
            name: s3-config
        env:
        - name: S3_BUCKET
          value: s3-mount
        - name: MNT_POINT
          value: /srv/s3-mount/root
        - name: IAM_ROLE
          value: none
        volumeMounts:
        - name: devfuse
          mountPath: /dev/fuse
        - name: mntdatas3fs
          mountPath: /srv/s3-mount/root:shared
      volumes:
      - name: devfuse
        hostPath:
          path: /dev/fuse
      - name: mntdatas3fs
        hostPath:
          type: DirectoryOrCreate
          path: /mnt/data-s3-fs/root
      - name: mntdatas3fs-init
        hostPath:
          type: DirectoryOrCreate
          path: /mnt
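A side note on the `:shared` suffix used in the `mountPath` values in this thread: current Kubernetes releases express mount propagation with the `mountPropagation` field on a volume mount. The fragments below are a sketch of how the two sides might look with that field. They have not been tested on this cluster, and whether they change the "Transport endpoint is not connected" behaviour is an assumption, not a confirmed result. The paths are the ones already used in the manifests here.

```yaml
# Fragment of the DaemonSet container spec (sketch, untested here).
volumeMounts:
  - name: mntdatas3fs
    mountPath: /srv/s3-mount/root
    # Bidirectional: mounts created inside this container are propagated
    # back to the host. Kubernetes only allows this for privileged containers.
    mountPropagation: Bidirectional
---
# Fragment of a consuming pod's container spec (sketch, untested here).
volumeMounts:
  - name: mntdatas3fs
    mountPath: /opt/airflow/dags
    # HostToContainer: mounts that appear on the host after the container
    # started become visible inside the container.
    mountPropagation: HostToContainer
```

With `HostToContainer` the consuming pod does not need to be privileged just to see the mount, which may be preferable to `privileged: true` on every consumer.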
For me, the solution was a preStop hook that unmounts the path before the pod exits:
containers:
- name: aws-sync
  lifecycle:
    preStop:
      exec:
        command: ['bash', '-c', 'umount -l /mounted/path; true']