Error deploying Spark from Eclipse to YARN in cluster mode



I am trying to deploy the Apache Spark Pi example from Eclipse to Hadoop YARN. I am running my own cluster of 3 virtual machines with Linux. The Hadoop version on the cluster is 2.7.2, and Spark is 1.6.0, the prebuilt version for Hadoop 2.6.0 and later. I can run the Pi example from a node, but when I try to run the Java Pi example from Eclipse on Windows (yarn-cluster mode), I get the error shown below. I found several threads about this error, but most of them target Cloudera or Hortonworks with some extra variables, or do not solve my problem. I also tried yarn-client mode with the same result. Can anyone help me, please?

Eclipse console output:

16/02/23 11:21:51 INFO Client: Requesting a new application from cluster with 1 NodeManagers
16/02/23 11:21:51 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/02/23 11:21:51 INFO Client: Will allocate AM container, with 1384 MB memory including 384 MB overhead
16/02/23 11:21:51 INFO Client: Setting up container launch context for our AM
16/02/23 11:21:51 INFO Client: Setting up the launch environment for our AM container
16/02/23 11:21:51 INFO Client: Preparing resources for our AM container
16/02/23 11:21:53 WARN : Your hostname, uherpc resolves to a loopback/non-reachable address: 172.25.32.214, but we couldn't find any external IP address!
16/02/23 11:21:53 INFO Client: Uploading resource file:/C:/Users/xuherv00/.gradle/caches/modules-2/files-2.1/org.apache.spark/spark-yarn_2.10/1.6.0/ace7b1f6f0c33b48e0323b7b0e7dd8ab458c14a4/spark-yarn_2.10-1.6.0.jar -> hdfs://sparkmaster:9000/user/hduser/.sparkStaging/application_1456222391080_0002/spark-yarn_2.10-1.6.0.jar
16/02/23 11:21:54 INFO Client: Uploading resource file:/C:/Users/xuherv00/workspace/rapidminer5/RapidMiner_Extension_Streaming/lib/spark-examples-1.6.0-hadoop2.6.0.jar -> hdfs://sparkmaster:9000/user/hduser/.sparkStaging/application_1456222391080_0002/spark-examples-1.6.0-hadoop2.6.0.jar
16/02/23 11:22:02 INFO Client: Uploading resource file:/C:/Users/xuherv00/AppData/Local/Temp/spark-266eade5-5049-4b13-9f75-edb5200e3df1/__spark_conf__6296221515875913107.zip -> hdfs://sparkmaster:9000/user/hduser/.sparkStaging/application_1456222391080_0002/__spark_conf__6296221515875913107.zip
16/02/23 11:22:02 INFO SecurityManager: Changing view acls to: xuherv00,hduser
16/02/23 11:22:02 INFO SecurityManager: Changing modify acls to: xuherv00,hduser
16/02/23 11:22:02 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(xuherv00, hduser); users with modify permissions: Set(xuherv00, hduser)
16/02/23 11:22:02 INFO Client: Submitting application 2 to ResourceManager
16/02/23 11:22:02 INFO YarnClientImpl: Submitted application application_1456222391080_0002
16/02/23 11:22:03 INFO Client: Application report for application_1456222391080_0002 (state: ACCEPTED)
16/02/23 11:22:03 INFO Client: 
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1456222434780
     final status: UNDEFINED
     tracking URL: http://sparkmaster:8088/proxy/application_1456222391080_0002/
     user: hduser
16/02/23 11:22:04 INFO Client: Application report for application_1456222391080_0002 (state: ACCEPTED)
16/02/23 11:22:05 INFO Client: Application report for application_1456222391080_0002 (state: ACCEPTED)
16/02/23 11:22:06 INFO Client: Application report for application_1456222391080_0002 (state: ACCEPTED)
16/02/23 11:22:07 INFO Client: Application report for application_1456222391080_0002 (state: ACCEPTED)
16/02/23 11:22:08 INFO Client: Application report for application_1456222391080_0002 (state: ACCEPTED)
16/02/23 11:22:09 INFO Client: Application report for application_1456222391080_0002 (state: ACCEPTED)
16/02/23 11:22:10 INFO Client: Application report for application_1456222391080_0002 (state: FAILED)
16/02/23 11:22:10 INFO Client: 
     client token: N/A
     diagnostics: Application application_1456222391080_0002 failed 2 times due to AM Container for appattempt_1456222391080_0002_000002 exited with  exitCode: 1
For more detailed output, check application tracking page:http://sparkmaster:8088/cluster/app/application_1456222391080_0002Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1456222391080_0002_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
    at org.apache.hadoop.util.Shell.run(Shell.java:456)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1456222434780
     final status: FAILED
     tracking URL: http://sparkmaster:8088/cluster/app/application_1456222391080_0002
     user: hduser
16/02/23 11:22:10 INFO Client: Deleting staging directory .sparkStaging/application_1456222391080_0002
16/02/23 11:22:10 INFO ShutdownHookManager: Shutdown hook called
16/02/23 11:22:10 INFO ShutdownHookManager: Deleting directory C:\Users\xuherv00\AppData\Local\Temp\spark-266eade5-5049-4b13-9f75-edb5200e3df1

Hadoop application log:

User:   hduser
Name:   testApp
Application Type:   SPARK
Application Tags:   
YarnApplicationState:   FAILED
FinalStatus Reported by AM:     FAILED
Started:    Tue Feb 23 11:13:54 +0100 2016
Elapsed:    8sec
Tracking URL:   History
Diagnostics:    
Application application_1456222391080_0002 failed 2 times due to AM Container for appattempt_1456222391080_0002_000002 exited with exitCode: 1
For more detailed output, check application tracking page:http://sparkmaster:8088/cluster/app/application_1456222391080_0002Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1456222391080_0002_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.

Log from container_1456222391080_0002_01_000001 stderr:

Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster

Gradle dependencies:

//hadoop
compile 'org.apache.hadoop:hadoop-common:2.7.2'
compile 'org.apache.hadoop:hadoop-hdfs:2.7.2'
compile 'org.apache.hadoop:hadoop-client:2.7.2'
compile 'org.apache.hadoop:hadoop-mapreduce-client-core:2.7.2'
compile 'org.apache.hadoop:hadoop-yarn-common:2.7.2'
compile 'org.apache.hadoop:hadoop-yarn-api:2.7.2'
//spark
compile 'org.apache.spark:spark-core_2.10:1.6.0'
compile 'org.apache.spark:spark-sql_2.10:1.6.0'
compile 'org.apache.spark:spark-streaming_2.10:1.6.0'
compile 'org.apache.spark:spark-catalyst_2.10:1.6.0'
compile 'org.apache.spark:spark-yarn_2.10:1.6.0'
compile 'org.apache.spark:spark-network-shuffle_2.10:1.6.0'
compile 'org.apache.spark:spark-network-common_2.10:1.6.0'
compile 'org.apache.spark:spark-network-yarn_2.10:1.6.0'

Java class:

package mypackage;
import java.io.File;
import java.lang.reflect.Method;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import org.apache.hadoop.conf.Configuration;
import org.apache.spark.SparkConf;
import org.apache.spark.deploy.yarn.Client;
import org.apache.spark.deploy.yarn.ClientArguments;
import org.junit.Test;
public class ExampleApp {
    private String appName = "testApp";
//  private String mode = "yarn-client";
    private String mode = "yarn-cluster";
    private File appJar = new File("lib/spark-examples-1.6.0-hadoop2.6.0.jar");
    private URI appJarUri = appJar.toURI();
    private String hadoopPath = "E:\\store\\hadoop";
    @Test
    public void deployPiToYARN() {
        String[] args = new String[] {
                // the name of your application
                "--name", appName,
                // memory for driver (optional)
                "--driver-memory", "1000M",
                // path to your application's JAR file
                // required in yarn-cluster mode
                "--jar", appJarUri.toString(),
                // name of your application's main class (required)
                "--class", "org.apache.spark.examples.SparkPi",
                // comma separated list of local jars that want
                // SparkContext.addJar to work with
//              "--addJars",
//              "lib/spark-assembly-1.6.0-hadoop2.6.0.jar",
                // argument 1 to Spark program
                 "--arg",
                 "10",
        };
        System.setProperty("hadoop.home.dir", hadoopPath);
        System.setProperty("HADOOP_USER_NAME", "hduser");
        try {
            addHadoopConfToClassPath(hadoopPath);
        } catch (Exception e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        // create a Hadoop Configuration object
        Configuration config = new Configuration();
        config.set("yarn.resourcemanager.address", "172.25.32.192:8050");
        // identify that you will be using Spark as YARN mode
        System.setProperty("SPARK_YARN_MODE", "true");
        // create an instance of SparkConf object
        SparkConf sparkConf = new SparkConf().setAppName(appName);
        sparkConf = sparkConf.setMaster(mode);
        sparkConf = sparkConf.set("spark.executor.memory","1g");
        // create ClientArguments, which will be passed to Client
        ClientArguments cArgs = new ClientArguments(args, sparkConf);
        // create an instance of yarn Client client
        Client client = new Client(cArgs, config, sparkConf);
//      client.submitApplication();
        // submit Spark job to YARN
        client.run();
    }
    private void addHadoopConfToClassPath(String path) throws Exception {
        File f = new File(path);
        URL u = f.toURI().toURL();
        URLClassLoader urlClassLoader = (URLClassLoader) ClassLoader.getSystemClassLoader();
        Class<URLClassLoader> urlClass = URLClassLoader.class;
        Method method = urlClass.getDeclaredMethod("addURL", new Class[]{URL.class});
        method.setAccessible(true);
        method.invoke(urlClassLoader, new Object[]{u});
    }
}

core-site.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://sparkmaster:9000</value>
    </property>
</configuration>

hdfs-site.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
        <property>
                <name>dfs.replication</name>
                <value>3</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop_tmp/hdfs/namenode</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:/usr/local/hadoop_tmp/hdfs/datanode</value>
        </property>
</configuration>

yarn-site.xml:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>sparkmaster</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>sparkmaster:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>sparkmaster:8035</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>sparkmaster:8050</value>
    </property>
</configuration>

You shouldn't launch Spark on YARN from Eclipse; use SparkSubmit instead. You can, however, use local mode from Eclipse.

SparkSubmit does a lot of things for you, including uploading dependencies such as the Spark jar to the YARN cluster, where they are referenced by the executors. That is why you get the error above.
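To make that concrete, the `deployPiToYARN()` test above duplicates by hand what a single `spark-submit` invocation does. A minimal sketch of building the equivalent command line in Java (the flags, jar path, and memory settings come from the question; the `SPARK_HOME` location passed in is an assumption and would be the cluster-matching Spark 1.6.0 distribution):

```java
import java.util.Arrays;
import java.util.List;

public class SubmitCommand {
    // Build the spark-submit command equivalent to deployPiToYARN().
    // sparkHome is assumed to point at a full Spark 1.6.0 distribution,
    // so spark-submit can ship the assembly jar (and with it
    // org.apache.spark.deploy.yarn.ApplicationMaster) to the cluster.
    static List<String> piSubmitCommand(String sparkHome) {
        return Arrays.asList(
                sparkHome + "/bin/spark-submit",
                "--master", "yarn-cluster",
                "--name", "testApp",
                "--class", "org.apache.spark.examples.SparkPi",
                "--driver-memory", "1000M",
                "--executor-memory", "1g",
                "lib/spark-examples-1.6.0-hadoop2.6.0.jar",
                "10");
    }

    public static void main(String[] args) {
        // Print the command; hand the list to ProcessBuilder to submit it.
        System.out.println(String.join(" ", piSubmitCommand("/opt/spark")));
    }
}
```

Running this from the list via `ProcessBuilder` (or directly in a shell) lets spark-submit handle the staging that the hand-rolled `yarn.Client` code is missing.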
