My YARN capacity is 600 vcores and 3600 memory, but the admin team has configured the maximum memory per YARN container to 6 GB. My user is entitled to allocate as many containers as it can.
When I try to run a Spark job on a 50 GB dataset, it fails with an executor memory overhead error.
When one container runs out of memory, why can't Spark try a new container?
When one container runs out of memory, why can't Spark try a new container?
...because Spark does not do this by default (and you have not configured it to do otherwise).
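For completeness: if the goal is to have Spark ask YARN for additional containers on its own, the usual mechanism is dynamic allocation (also referenced further down in the --help output). The sketch below is only an illustration; the executor bounds, class name and jar are placeholders, and note that dynamic allocation adds executors in response to pending tasks, not in response to out-of-memory failures:

$ ./bin/spark-submit \
    --master yarn \
    --conf spark.dynamicAllocation.enabled=true \
    --conf spark.shuffle.service.enabled=true \
    --conf spark.dynamicAllocation.minExecutors=2 \
    --conf spark.dynamicAllocation.maxExecutors=50 \
    --executor-memory 4g \
    --class com.example.MyApp \
    my_app.jar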
More importantly, at spark-submit time you control the number of executors and the total amount of CPU cores and RAM memory, i.e. --driver-memory, --executor-memory, --driver-cores, --total-executor-cores, --executor-cores, --num-executors, etc.
$ ./bin/spark-submit --help
...
--driver-memory MEM Memory for driver (e.g. 1000M, 2G) (Default: 1024M).
--driver-java-options Extra Java options to pass to the driver.
--driver-library-path Extra library path entries to pass to the driver.
--driver-class-path Extra class path entries to pass to the driver. Note that
jars added with --jars are automatically included in the
classpath.
--executor-memory MEM Memory per executor (e.g. 1000M, 2G) (Default: 1G).
...
Spark standalone with cluster deploy mode only:
--driver-cores NUM Cores for driver (Default: 1).
...
Spark standalone and Mesos only:
--total-executor-cores NUM Total cores for all executors.
Spark standalone and YARN only:
--executor-cores NUM Number of cores per executor. (Default: 1 in YARN mode,
or all available cores on the worker in standalone mode)
YARN-only:
--driver-cores NUM Number of cores used by the driver, only in cluster mode
(Default: 1).
--queue QUEUE_NAME The YARN queue to submit to (Default: "default").
--num-executors NUM Number of executors to launch (Default: 2).
If dynamic allocation is enabled, the initial number of
executors will be at least NUM.
...
Some of them are specific to a deploy mode, while others depend on the cluster manager in use (YARN in your case).
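For instance (a sketch with made-up values, not taken from the question): on a standalone or Mesos cluster you cap parallelism with --total-executor-cores, whereas on YARN you combine --num-executors with --executor-cores. The master URL, class name and jar are placeholders:

# Spark standalone / Mesos:
$ ./bin/spark-submit --master spark://master:7077 --total-executor-cores 200 \
    --executor-memory 5g --class com.example.MyApp my_app.jar

# YARN:
$ ./bin/spark-submit --master yarn --num-executors 50 --executor-cores 4 \
    --executor-memory 5g --class com.example.MyApp my_app.jar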
Summing up... you decide how much of the resources to allocate using the options of spark-submit.
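As a hedged illustration of how those options could be combined for the setup described in the question (YARN with a 6 GB maximum container size); the numbers, queue, class name and jar below are assumptions, not the asker's actual job:

$ ./bin/spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --queue default \
    --driver-memory 4g \
    --num-executors 100 \
    --executor-cores 4 \
    --executor-memory 5g \
    --class com.example.MyJob \
    my_job.jar

With --executor-memory 5g, the default memory overhead on YARN (roughly 10% of the executor memory, with a 384 MB floor) keeps each container request under the 6 GB cap, which is one way to avoid the executor memory overhead error mentioned in the question.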
Read up on Submitting Applications in Spark's official documentation.