I am trying to do a spark-submit to minikube (Kubernetes) from my local machine's CLI, using the command:
spark-submit --master k8s://https://127.0.0.1:8001 --name cfe2
--deploy-mode cluster --class com.yyy.Test --conf spark.executor.instances=2 --conf spark.kubernetes.container.image=docker.io/anantpukale/spark_app:1.1 local://spark-0.0.1-SNAPSHOT.jar
I have a simple Spark job jar, built against version 2.3.0. I have also containerized it with Docker, and minikube is running on VirtualBox. Below is the exception stack trace:
Exception in thread "main" org.apache.spark.SparkException: Must specify the driver container image
at org.apache.spark.deploy.k8s.submit.steps.BasicDriverConfigurationStep$$anonfun$3.apply(BasicDriverConfigurationStep.scala:51)
at org.apache.spark.deploy.k8s.submit.steps.BasicDriverConfigurationStep$$anonfun$3.apply(BasicDriverConfigurationStep.scala:51)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.deploy.k8s.submit.steps.BasicDriverConfigurationStep.<init>(BasicDriverConfigurationStep.scala:51)
at org.apache.spark.deploy.k8s.submit.DriverConfigOrchestrator.getAllConfigurationSteps(DriverConfigOrchestrator.scala:82)
at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$5.apply(KubernetesClientApplication.scala:229)
at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$5.apply(KubernetesClientApplication.scala:227)
at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2585)
at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:227)
at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:192)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2018-04-06 13:33:52 INFO ShutdownHookManager:54 - Shutdown hook called
2018-04-06 13:33:52 INFO ShutdownHookManager:54 - Deleting directory C:\Users\anant\AppData\Local\Temp\spark-6da93408-88cb-4fc7-a2de-18ed166c3c66
This looks like a bug in the parameter defaults: spark.kubernetes.driver.container.image should default to spark.kubernetes.container.image, but apparently does not. So try specifying the driver/executor container images directly:
- spark.kubernetes.driver.container.image
- spark.kubernetes.executor.container.image
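As a sketch, reusing the image tag and jar path from the question, the submit command with the driver and executor images set explicitly would look like:

```shell
# Sketch: same command as in the question, but with the driver and
# executor container images specified directly instead of relying on
# the shared spark.kubernetes.container.image default.
spark-submit \
  --master k8s://https://127.0.0.1:8001 \
  --name cfe2 \
  --deploy-mode cluster \
  --class com.yyy.Test \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.driver.container.image=docker.io/anantpukale/spark_app:1.1 \
  --conf spark.kubernetes.executor.container.image=docker.io/anantpukale/spark_app:1.1 \
  local://spark-0.0.1-SNAPSHOT.jar
```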
Judging from the source code, the only available conf options are:
spark.kubernetes.container.image
spark.kubernetes.driver.container.image
spark.kubernetes.executor.container.image
I noticed that Spark 2.3.0 changed a lot in its Kubernetes implementation compared to 2.2.0. For example, instead of specifying the driver and executor images separately, the official getting-started guide uses a single image supplied via spark.kubernetes.container.image.
See if this works:
spark-submit \
--master k8s://http://127.0.0.1:8001 \
--name cfe2 \
--deploy-mode cluster \
--class com.yyy.Test \
--conf spark.executor.instances=2 \
--conf spark.kubernetes.container.image=docker.io/anantpukale/spark_app:1.1 \
--conf spark.kubernetes.authenticate.submission.oauthToken=YOUR_TOKEN \
--conf spark.kubernetes.authenticate.submission.caCertFile=PATH_TO_YOUR_CERT \
local://spark-0.0.1-SNAPSHOT.jar
The token and cert can be found in the Kubernetes dashboard. Follow the official instructions to build a Docker image compatible with Spark 2.3.0.
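If you prefer the CLI to the dashboard, the token and CA cert can also be retrieved with kubectl. A sketch, assuming the default service account in the default namespace on a cluster that auto-creates a token secret for it (Kubernetes versions before 1.24; minikube stores the cluster CA on disk):

```shell
# Assumption: default service account, default namespace, and a
# cluster that auto-creates a token secret (Kubernetes < 1.24).
SECRET=$(kubectl get serviceaccount default -o jsonpath='{.secrets[0].name}')

# Decode the bearer token to pass as
# spark.kubernetes.authenticate.submission.oauthToken
kubectl get secret "$SECRET" -o jsonpath='{.data.token}' | base64 --decode

# On minikube, the cluster CA cert to pass as
# spark.kubernetes.authenticate.submission.caCertFile lives here:
ls ~/.minikube/ca.crt
```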