Perspective
Actually, I needed to configure two service files: one for the Spark Master and another for the Spark Slave (Worker) node. Find the environment and service configurations below:
Configurations
/opt/cli/spark-3.3.0-bin-hadoop3/etc/env
JAVA_HOME="/usr/lib/jvm/java-17-openjdk-amd64"
SPARK_HOME="/opt/cli/spark-3.3.0-bin-hadoop3"
PYSPARK_PYTHON="/usr/bin/python3"
/etc/systemd/system/spark-master.service
[Unit]
Description=Apache Spark Master
Wants=network-online.target
After=network-online.target
[Service]
User=spark
Group=spark
Type=forking
WorkingDirectory=/opt/cli/spark-3.3.0-bin-hadoop3/sbin
EnvironmentFile=/opt/cli/spark-3.3.0-bin-hadoop3/etc/env
ExecStartPost=/bin/bash -c "echo $MAINPID > /opt/cli/spark-3.3.0-bin-hadoop3/etc/spark-master.pid"
ExecStart=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/start-master.sh
ExecStop=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/stop-master.sh
[Install]
WantedBy=multi-user.target
/etc/systemd/system/spark-slave.service
[Unit]
Description=Apache Spark Slave
Wants=network-online.target
After=network-online.target
[Service]
User=spark
Group=spark
Type=forking
WorkingDirectory=/opt/cli/spark-3.3.0-bin-hadoop3/sbin
EnvironmentFile=/opt/cli/spark-3.3.0-bin-hadoop3/etc/env
ExecStartPost=/bin/bash -c "echo $MAINPID > /opt/cli/spark-3.3.0-bin-hadoop3/etc/spark-slave.pid"
ExecStart=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/start-slave.sh spark://spark.cdn.chorke.org:7077
ExecStop=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/stop-slave.sh
[Install]
WantedBy=multi-user.target
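With both unit files saved, the services can be registered and brought up with the standard systemd workflow (a sketch; service names as defined above, requires root):

```shell
# Pick up the newly created unit files
sudo systemctl daemon-reload

# Enable at boot and start immediately
sudo systemctl enable --now spark-master.service
sudo systemctl enable --now spark-slave.service

# Verify both services are running
systemctl status spark-master.service spark-slave.service --no-pager
```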
Result
It started successfully, but it failed to stop cleanly due to an error! In fact, systemd could not stop the Apache Spark Master or Slave without reporting a failure.
Spark Master Stop Status
× spark-master.service - Apache Spark Master
Loaded: loaded (/etc/systemd/system/spark-master.service; disabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2022-09-26 18:43:39 +08; 8s ago
Docs: https://spark.apache.org/docs/3.3.0
Process: 488887 ExecStart=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/start-master.sh (code=exited, status=0/SUCCESS)
Process: 489000 ExecStartPost=/bin/bash -c echo $MAINPID > /opt/cli/spark-3.3.0-bin-hadoop3/etc/spark-master.pid (code=exited, status=0/SUCCESS)
Process: 489484 ExecStop=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/stop-master.sh (code=exited, status=0/SUCCESS)
Main PID: 488903 (code=exited, status=143)
CPU: 4.813s
Spark Slave Stop Status
× spark-slave.service - Apache Spark Slave
Loaded: loaded (/etc/systemd/system/spark-slave.service; disabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2022-09-26 18:38:22 +08; 15s ago
Docs: https://spark.apache.org/docs/3.3.0
Process: 489024 ExecStart=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/start-slave.sh spark://ns12-pc04:7077 (code=exited, status=0/SUCCESS)
Process: 489145 ExecStartPost=/bin/bash -c echo $MAINPID > /opt/cli/spark-3.3.0-bin-hadoop3/etc/spark-slave.pid (code=exited, status=0/SUCCESS)
Process: 489174 ExecStop=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/stop-slave.sh (code=exited, status=0/SUCCESS)
Main PID: 489040 (code=exited, status=143)
CPU: 4.306s
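The `status=143` in both outputs is not an arbitrary code: a shell reports a process killed by SIGTERM as 128 + 15 = 143, and the Spark stop scripts terminate the daemon with SIGTERM. A quick sketch to confirm:

```shell
# A process that dies from SIGTERM exits with status 128 + 15 = 143.
bash -c 'kill -TERM $$'
echo "exit status: $?"
# prints: exit status: 143
```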
Expected Behavior
Your guidelines for shutting down the Master & Slave nodes without any error would be highly appreciated.
Theoretical Solution
In this case, you would have to write your own shutdown script that forces exit code 0 instead of 143. Alternatively, if you are lazy enough like me, you can simply change SuccessExitStatus from 0 to 143. By default, a systemd unit treats only exit status 0 as success, so we need to change that default behavior for these units.
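The "write your own script" idea can be sketched as a small wrapper (a hypothetical script, not shipped with Spark) that runs the real command and maps the SIGTERM exit status 143 back to 0, so systemd sees a clean exit:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper illustrating the idea: run the wrapped command,
# then translate exit status 143 (SIGTERM) into 0 so that systemd
# records the shutdown as successful. Any other status passes through.
"$@"
status=$?
if [ "$status" -eq 143 ]; then
    exit 0
fi
exit "$status"
```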
Practical Solution
/etc/systemd/system/spark-master.service
[Unit]
Description=Apache Spark Master
Wants=network-online.target
After=network-online.target
[Service]
User=spark
Group=spark
Type=forking
SuccessExitStatus=143
WorkingDirectory=/opt/cli/spark-3.3.0-bin-hadoop3/sbin
EnvironmentFile=/opt/cli/spark-3.3.0-bin-hadoop3/etc/env
ExecStartPost=/bin/bash -c "echo $MAINPID > /opt/cli/spark-3.3.0-bin-hadoop3/etc/spark-master.pid"
ExecStart=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/start-master.sh
ExecStop=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/stop-master.sh
[Install]
WantedBy=multi-user.target
/etc/systemd/system/spark-slave.service
[Unit]
Description=Apache Spark Slave
Wants=network-online.target
After=network-online.target
[Service]
User=spark
Group=spark
Type=forking
SuccessExitStatus=143
WorkingDirectory=/opt/cli/spark-3.3.0-bin-hadoop3/sbin
EnvironmentFile=/opt/cli/spark-3.3.0-bin-hadoop3/etc/env
ExecStartPost=/bin/bash -c "echo $MAINPID > /opt/cli/spark-3.3.0-bin-hadoop3/etc/spark-slave.pid"
ExecStart=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/start-slave.sh spark://spark.cdn.chorke.org:7077
ExecStop=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/stop-slave.sh
[Install]
WantedBy=multi-user.target
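After adding `SuccessExitStatus=143` to both unit files, reload systemd and verify that stopping no longer ends in a failed state (a sketch; requires root):

```shell
# Pick up the edited unit files and restart the services
sudo systemctl daemon-reload
sudo systemctl restart spark-master.service spark-slave.service

# Stopping should now leave the units "inactive (dead)", not "failed"
sudo systemctl stop spark-slave.service spark-master.service
systemctl status spark-master.service --no-pager
```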