Using Spark in Oozie: java.lang.IllegalArgumentException: java.net.Unkn...



I am using CDH 6.3.2.

Hadoop is configured for HA.

I created a workflow with a Spark action in Hue.

When I run this workflow I get an error:

Failing Oozie Launcher, java.net.UnknownHostException: nameservice1
java.lang.IllegalArgumentException: java.net.UnknownHostException: nameservice1
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:445)
at org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithClientProtocol(NameNodeProxiesClient.java:132)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:351)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:285)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:168)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3237)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:123)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3286)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3254)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:478)
at org.apache.spark.deploy.DependencyUtils$.org$apache$spark$deploy$DependencyUtils$$resolveGlobPath(DependencyUtils.scala:190)
at org.apache.spark.deploy.DependencyUtils$$anonfun$resolveGlobPaths$2.apply(DependencyUtils.scala:146)
at org.apache.spark.deploy.DependencyUtils$$anonfun$resolveGlobPaths$2.apply(DependencyUtils.scala:144)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
at org.apache.spark.deploy.DependencyUtils$.resolveGlobPaths(DependencyUtils.scala:144)
at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$3.apply(SparkSubmit.scala:355)
at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$3.apply(SparkSubmit.scala:355)
at scala.Option.map(Option.scala:146)
at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:355)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:143)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:926)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:935)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:186)
at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:93)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:104)
at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:60)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:410)
at org.apache.oozie.action.hadoop.LauncherAM.access$300(LauncherAM.java:55)
at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:223)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:217)
at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:153)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:141)
Caused by: java.net.UnknownHostException: nameservice1

In my hdfs-site.xml:

<property>
    <name>dfs.nameservices</name>
    <value>nameservice1</value>
</property>
<property>
    <name>dfs.client.failover.proxy.provider.nameservice1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
    <name>dfs.ha.automatic-failover.enabled.nameservice1</name>
    <value>true</value>
</property>
<property>
    <name>ha.zookeeper.quorum</name>
    <value>bigdser2:2181,bigdser3:2181,bigdser5:2181</value>
</property>
<property>
    <name>dfs.ha.namenodes.nameservice1</name>
    <value>namenode337,namenode369</value>
</property>
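
An UnknownHostException on a nameservice name usually means the HDFS client did not see the HA settings and therefore treated nameservice1 as a plain hostname. As a rough way to check what a given hdfs-site.xml actually defines, here is a minimal sketch using only the JDK. The class name HaConfigCheck and its helper methods are made up for illustration, and the sketch assumes a complete file with a <configuration> root element rather than the excerpt above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Hadoop or Oozie.
final class HaConfigCheck {

    /** Parses Hadoop-style property entries (name/value pairs) into a map. */
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element p = (Element) nodes.item(i);
                String name = text(p, "name");
                if (name != null) {
                    props.put(name, text(p, "value"));
                }
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse configuration XML", e);
        }
    }

    private static String text(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? null : list.item(0).getTextContent().trim();
    }

    private static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    private static List<String> splitCsv(String s) {
        List<String> out = new ArrayList<>();
        for (String part : s.split(",")) {
            if (!part.trim().isEmpty()) {
                out.add(part.trim());
            }
        }
        return out;
    }

    /** Lists the client-side HA keys that are absent for nameservice ns. */
    static List<String> missingKeys(Map<String, String> props, String ns) {
        List<String> missing = new ArrayList<>();
        String services = props.get("dfs.nameservices");
        if (services == null || !splitCsv(services).contains(ns)) {
            missing.add("dfs.nameservices");
        }
        String provider = "dfs.client.failover.proxy.provider." + ns;
        if (isBlank(props.get(provider))) {
            missing.add(provider);
        }
        String nnKey = "dfs.ha.namenodes." + ns;
        String namenodes = props.get(nnKey);
        if (isBlank(namenodes)) {
            missing.add(nnKey);
            return missing; // cannot check RPC addresses without the NameNode IDs
        }
        for (String nn : splitCsv(namenodes)) {
            String rpc = "dfs.namenode.rpc-address." + ns + "." + nn;
            if (isBlank(props.get(rpc))) {
                missing.add(rpc);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: HaConfigCheck <path-to-hdfs-site.xml> <nameservice>");
            return;
        }
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println("Missing keys: " + missingKeys(parse(xml), args[1]));
    }
}
```

Run against the file that the failing process really reads, an empty list suggests the HA definition is complete and the problem is more likely that the file is not on that process's classpath at all.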

I can run a Hive workflow in Hue.

I can run spark-shell --jars hdfs://nameservice1/sparklib/*.jar

"hadoop fs -ls /user" works.

It only fails with Oozie.

So how can I fix it? Can anyone help me?

My job.properties:

nameNode=hdfs://nameservice1
jobTracker=bigdser3:8032
queueName=default
oozie.use.system.libpath=True  
oozie.wf.application.path=${nameNode}/user/jztwk
security_enabled=True

My workflow.xml:

<workflow-app name="Spark-example2" xmlns="uri:oozie:workflow:0.5">
    <start to="SparkOozieAction1"/>
    <action name="SparkOozieAction1">
        <spark xmlns="uri:oozie:spark-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>yarn</master>
            <mode>cluster</mode>
            <name>Spark Example1</name>
            <class>App</class>
            <jar>JztCloudAnalyse.jar</jar>
            <spark-opts>--jars hdfs://nameservice1/sparklib/*.jar  --conf spark.executor.extraJavaOptions=-Dfile.encoding=UTF-8 --conf spark.driver.extraJavaOptions=-Dfile.encoding=UTF-8</spark-opts>
            <arg>-r 10.3.87.31:7000,10.3.87.31:7001,10.3.87.32:7002,10.3.87.32:7003,10.3.87.36:7004,10.3.87.36:7005 -d 0 -k 22 -w http://10.3.87.49:8082/SendMsgApi.ashx -n JZTanalyse10_102 -h JZTanalyse -o jjj</arg>
            <file>/user/jztwk/JztCloudAnalyse.jar#JztCloudAnalyse.jar</file>
        </spark>
        <ok to="end"/>
        <error to="kill"/>
    </action>
    <kill name="kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

Running Oozie manually (that is, without Hue) gives the same error.

Remove the exclude for resources/hdfs-site.xml and resources/core-site.xml in the jar.

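I had this problem too, using Oozie 5.2.1 and Apache Hadoop 2.10.1. It happens because the ApplicationMaster cannot find hdfs-site.xml on its classpath. When we use spark-submit, it adds HADOOP_CONF_DIR to the classpath.

Oozie also adds it to the classpath, like $PWD:$PWD/:$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/share/hadoop/common/:...

But I found that the environment variable HADOOP_CONF_DIR that Oozie submits to the RM is empty.

See the code below: when HADOOP_CONF_DIR is empty, it sets HADOOP_CONF_DIR to the HADOOP_CLIENT_CONF_DIR variable.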

private void setHadoopConfDirIfEmpty(Map<String, String> env) {
    String envHadoopConfDir = env.get(HADOOP_CONF_DIR);
    if (StringUtils.isEmpty(envHadoopConfDir)) {
        String hadoopClientConfDirVariable = String.format("${%s}", HADOOP_CLIENT_CONF_DIR);
        LOG.debug("Setting {0} environment variable to {1}.", HADOOP_CONF_DIR, hadoopClientConfDirVariable);
        env.put(HADOOP_CONF_DIR, hadoopClientConfDirVariable);
    }
    else {
        LOG.debug("Environment variable {0} is already set to {1}.", HADOOP_CONF_DIR, envHadoopConfDir);
    }
}
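
To make the effect of that method easier to see, here is a standalone re-creation that runs without Oozie on the classpath. It is only a sketch: the class name HadoopConfDirDefaultDemo is invented, logging and StringUtils are replaced with plain JDK calls, and the path /etc/hadoop/conf is a hypothetical example value.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative re-creation only; this is not the Oozie class itself.
final class HadoopConfDirDefaultDemo {
    static final String HADOOP_CONF_DIR = "HADOOP_CONF_DIR";
    static final String HADOOP_CLIENT_CONF_DIR = "HADOOP_CLIENT_CONF_DIR";

    /** Same branching as the quoted method, minus logging and StringUtils. */
    static void setHadoopConfDirIfEmpty(Map<String, String> env) {
        String current = env.get(HADOOP_CONF_DIR);
        if (current == null || current.isEmpty()) {
            // Stores the literal text "${HADOOP_CLIENT_CONF_DIR}", not a resolved path.
            env.put(HADOOP_CONF_DIR, String.format("${%s}", HADOOP_CLIENT_CONF_DIR));
        }
    }

    public static void main(String[] args) {
        Map<String, String> empty = new HashMap<>();
        setHadoopConfDirIfEmpty(empty);
        System.out.println(empty.get(HADOOP_CONF_DIR));   // prints ${HADOOP_CLIENT_CONF_DIR}

        Map<String, String> preset = new HashMap<>();
        preset.put(HADOOP_CONF_DIR, "/etc/hadoop/conf");  // hypothetical path
        setHadoopConfDirIfEmpty(preset);
        System.out.println(preset.get(HADOOP_CONF_DIR));  // prints /etc/hadoop/conf
    }
}
```

The value stored is a placeholder string, so it presumably has to be expanded later in the environment where the container starts. If HADOOP_CLIENT_CONF_DIR is not defined there, HADOOP_CONF_DIR would end up empty, which would match what I observed.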

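Even though I added an environment variable HADOOP_CLIENT_CONF_DIR with the same value as HADOOP_CONF_DIR, the problem was still not solved, and I don't know why. So I looked for another way to specify the environment variable.

Add the following configuration to core-site.xml: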

<property>
<name>oozie.launcher.env</name>
<value>HADOOP_CONF_DIR=/usr/local/hadoop/hadoop-2.10.1/etc/hadoop/</value>
</property> 

In the end, this is how I solved the problem.
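
To confirm whether a process really sees the Hadoop configuration, a small JDK-only diagnostic along these lines may help. It is a sketch under assumptions: the class name ConfVisibilityCheck and its methods are invented for illustration, and it would have to be run inside the environment being checked (for example as the main class of an Oozie Java action) to say anything about the launcher.

```java
import java.net.URL;
import java.util.Map;

// Hypothetical diagnostic, not part of Hadoop or Oozie.
final class ConfVisibilityCheck {

    /** Reports where a classpath resource was found, or that it is missing. */
    static String locate(String resource) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        URL url = cl.getResource(resource);
        return resource + ": " + (url == null ? "NOT FOUND on classpath" : url.toString());
    }

    /** Formats one environment variable, marking unset or empty values. */
    static String describeEnv(String name, Map<String, String> env) {
        String value = env.get(name);
        return name + "=" + (value == null || value.isEmpty() ? "<unset>" : value);
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        System.out.println(describeEnv("HADOOP_CONF_DIR", env));
        System.out.println(describeEnv("HADOOP_CLIENT_CONF_DIR", env));
        System.out.println(locate("hdfs-site.xml"));
        System.out.println(locate("core-site.xml"));
    }
}
```

If hdfs-site.xml is reported as not found while HADOOP_CONF_DIR is unset, that would be consistent with the UnknownHostException on nameservice1.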
