Spark on Kubernetes driver pod cleanup

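I am running Spark 3.1.1 on Kubernetes 1.19. Once a job finishes, the executor pods are cleaned up, but the driver pod remains in Completed status. How can I clean up the driver pod once it has completed? Is there a configuration option I should set?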

NAME                                           READY   STATUS      RESTARTS   AGE
my-job-0e85ea790d5c9f8d-driver                 0/1     Completed   0          2d20h
my-job-8c1d4f79128ccb50-driver                 0/1     Completed   0          43h
my-job-c87bfb7912969cc5-driver                 0/1     Completed   0          43h

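Regarding the original question, "Spark on Kubernetes driver pod cleanup": at spark-submit time there seems to be no way to pass a TTL parameter to Kubernetes, so nothing prevents driver pods from sitting in Completed status indefinitely.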

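From the Spark documentation (https://spark.apache.org/docs/latest/running-on-kubernetes.html): "When the application completes, the executor pods terminate and are cleaned up, but the driver pod persists logs and remains in "completed" state in the Kubernetes API until it's eventually garbage collected or manually cleaned up."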

It is not clear who actually performs this "eventual garbage collection".
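The "manually cleaned up" route from the docs can be scripted. Below is a minimal sketch, not an official Spark or Kubernetes mechanism. It assumes kubectl is on PATH with permission to list and delete pods, and that driver pods carry the spark-role=driver label that Spark puts on them. The helper names (select_expired_drivers, cleanup) and the TTL value are made up for illustration.

```python
import json
import subprocess
from datetime import datetime, timedelta, timezone


def parse_k8s_time(value):
    # Kubernetes serializes timestamps as RFC 3339 in UTC, e.g. "2021-05-01T12:00:00Z".
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def finished_at(pod):
    # Latest container termination time, or None if no container has terminated.
    times = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        terminated = cs.get("state", {}).get("terminated")
        if terminated and terminated.get("finishedAt"):
            times.append(parse_k8s_time(terminated["finishedAt"]))
    return max(times) if times else None


def select_expired_drivers(pod_list, ttl, now):
    # Names of pods in phase Succeeded (shown as "Completed" by kubectl)
    # whose containers finished more than `ttl` ago.
    expired = []
    for pod in pod_list.get("items", []):
        if pod.get("status", {}).get("phase") != "Succeeded":
            continue
        done = finished_at(pod)
        if done is not None and now - done > ttl:
            expired.append(pod["metadata"]["name"])
    return expired


def cleanup(namespace, ttl):
    # Lists driver pods by label, then deletes the ones past the TTL.
    out = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace,
         "-l", "spark-role=driver", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    names = select_expired_drivers(json.loads(out), ttl, datetime.now(timezone.utc))
    if names:
        subprocess.run(["kubectl", "delete", "pod", "-n", namespace, *names], check=True)
    return names
```

Something like cleanup("default", timedelta(hours=1)) could then be run on a schedule. Failed drivers are deliberately left alone here so their logs stay available for debugging.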

spark.kubernetes.driver.service.deleteOnTermination was added to Spark in 3.2.0. This should solve the issue. Source: https://spark.apache.org/docs/latest/core-migration-guide.html

Update: this only deletes the driver pod's service, not the pod itself.

According to the official documentation, available since Kubernetes 1.12:

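Another way to clean up finished Jobs (either Complete or Failed) automatically is to use a TTL mechanism provided by a TTL controller for finished resources, by specifying the .spec.ttlSecondsAfterFinished field of the Job. When the TTL controller cleans up the Job, it will delete the Job cascadingly, i.e. delete its dependent objects, such as Pods, together with the Job. Note that when the Job is deleted, its lifecycle guarantees, such as finalizers, will be honored.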

Example:

apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-ttl
spec:
  ttlSecondsAfterFinished: 100
  template:
    spec:
      ...

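The Job pi-with-ttl will be eligible to be automatically deleted 100 seconds after it finishes. If the field is set to 0, the Job will be eligible to be automatically deleted immediately after it finishes.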
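To make the TTL rule concrete, here is a tiny sketch of the eligibility check described above. It is only a model of the rule, not the TTL controller's code, and the function name is made up. An unset field is treated as "never cleaned up by the TTL controller".

```python
from datetime import datetime, timedelta, timezone


def eligible_for_deletion(finished_at, ttl_seconds, now):
    # ttl_seconds mirrors .spec.ttlSecondsAfterFinished; None means the field is unset.
    if ttl_seconds is None:
        return False
    # With ttl_seconds == 0 the Job is eligible as soon as it has finished.
    return now >= finished_at + timedelta(seconds=ttl_seconds)
```

Note that this applies to Job objects only. A driver pod created directly by spark-submit is not owned by a Job, so the field has nothing to act on there.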

If customizing the Job resource is not possible, an external tool can be used to clean up completed Jobs. For example, see https://github.com/dtan4/k8s-job-cleaner

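Kubernetes does have a Pod lifecycle and runs a PodGC that cleans up Pods in the "Failed" or "Succeeded" phase. It looks like it only does so once the number of such Pods exceeds the threshold set by terminated-pod-gc-threshold.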

Garbage collection of Pods 
For failed Pods, the API objects remain in the cluster's API until a human or controller process explicitly removes them.
The Pod garbage collector (PodGC), which is a controller in the control plane, cleans up terminated Pods (with a phase of Succeeded or Failed), when the number of Pods exceeds the configured threshold (determined by terminated-pod-gc-threshold in the kube-controller-manager). This avoids a resource leak as Pods are created and terminated over time.
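The threshold behaviour explains why a handful of Completed driver pods never seem to go away. Here is a simplified model of it, not the controller's actual code. It assumes the oldest terminated pods are removed first and that a threshold of 0 or less disables this collection, as the kube-controller-manager flag documentation describes. The function name is made up.

```python
def pods_to_collect(terminated_pods, threshold):
    # terminated_pods: list of (name, creation_timestamp) tuples for pods
    # whose phase is Succeeded or Failed.
    if threshold <= 0:
        return []  # terminated-pod collection is disabled
    excess = len(terminated_pods) - threshold
    if excess <= 0:
        return []  # still under the threshold, nothing is deleted
    oldest_first = sorted(terminated_pods, key=lambda pod: pod[1])
    return [name for name, _ in oldest_first[:excess]]
```

The documented default for --terminated-pod-gc-threshold is 12500, so with only a few finished drivers in the cluster this returns an empty list and the pods stay until someone deletes them.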

Source: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-garbage-collection
