Hadoop client.RMProxy: Connecting to ResourceManager
I set up a single-node cluster on Linux following this guide: http://tecadmin.net/setup-hadoop-2-4-single-node-cluster-on-linux/
When I run a MapReduce application like this: hadoop jar hadoop-mapreduce-examples-2.6.0.jar grep input output 'dfs[a-z.]+'
I get the following INFO output:
15/02/25 23:42:54 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/02/25 23:42:56 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/02/25 23:42:59 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/02/25 23:43:02 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
Output of jps:
5232 SecondaryNameNode
6482 RunJar
5878 NodeManager
6521 Jps
4905 NameNode
5759 ResourceManager
5023 DataNode
How do I connect to the ResourceManager when setting up a single-node cluster?
I tried adding the following to yarn-site.xml, but it did not help:
<property>
<name>yarn.resourcemanager.address</name>
<value>127.0.0.1:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>127.0.0.1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>127.0.0.1:8031</value>
</property>
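Thanks
Keep one aspect of running Hadoop in mind: there are three modes, namely standalone, pseudo-distributed, and fully distributed.
Standalone and pseudo-distributed both run on a single node; in practice, they run only on your machine. They do not need the configuration you showed: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
A priori, this is all a single node needs in yarn-site.xml: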
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
Further configuration is possible, though. My yarn-site.xml for pseudo-distributed mode looks like this:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>localhost:8025</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>localhost:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:8050</value>
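</property>
</configuration>
Hint: double-check the IP you type in the configuration files. I suggest adding that IP to your /etc/hosts together with a hostname, and then using the hostname in the configuration files.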
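To illustrate the hint above, here is a minimal sketch of how one might check which IP a hostname maps to in a hosts-format file before using that hostname in yarn-site.xml. The function name lookup_host and the hostname hadoop-master are hypothetical, chosen only for this example.

```shell
# Print the IP that a hosts-format file maps a hostname to (first match only).
# Usage: lookup_host NAME [FILE]   (FILE defaults to /etc/hosts)
# lookup_host is a hypothetical helper, not part of Hadoop.
lookup_host() {
    awk -v name="$1" '
        $1 ~ /^#/ { next }                 # skip whole-line comments
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break       # stop at a trailing comment
                if ($i == name) { print $1; exit }
            }
        }
    ' "${2:-/etc/hosts}"
}

# Example call: hadoop-master is an assumed hostname; use your own.
lookup_host hadoop-master
```

If the call prints nothing, the name has no entry in that file, and a line of the form "IP hostname" would need to be added to /etc/hosts first.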
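This problem may be caused by a missing HADOOP_CONF_DIR, which the MapReduce application needs in order to connect to the ResourceManager specified in yarn-site.xml. So before running the MapReduce job, try setting/exporting HADOOP_CONF_DIR manually to the appropriate Hadoop conf directory, e.g. export HADOOP_CONF_DIR=/etc/hadoop/conf. This worked for me :)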
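A minimal sketch of that step, assuming the configuration lives in /etc/hadoop/conf (adjust the path to your installation). The helper conf_dir_ok is hypothetical; it only checks that the directory actually contains a yarn-site.xml before you rely on it.

```shell
# Return 0 if the given directory contains a yarn-site.xml, non-zero otherwise.
# conf_dir_ok is a hypothetical helper, not part of Hadoop.
conf_dir_ok() {
    [ -n "$1" ] && [ -f "$1/yarn-site.xml" ]
}

# Assumed location; keep an already exported value if there is one.
export HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"

if conf_dir_ok "$HADOOP_CONF_DIR"; then
    echo "yarn-site.xml found in $HADOOP_CONF_DIR"
else
    echo "no yarn-site.xml in $HADOOP_CONF_DIR" >&2
fi
```

Once the check passes, rerun the hadoop jar command from the question in the same shell so that the client picks up the exported variable.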
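I ran into the same problem when running a Hadoop instance on Kubernetes. The problem is stated in the error message itself: "connection error while trying to connect to the ResourceManager".
PS: the ResourceManager listens on port 8032 (unless changed).
Make sure you run the MapReduce job in the same network as the ResourceManager, since it will be listening on the following address: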
http://<RESOURCE_MANAGER_IP>:8032
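One way to test whether the ResourceManager port is reachable from the machine (or pod) that submits the job is a plain TCP probe. This is only a sketch: it assumes bash with /dev/tcp support and the coreutils timeout command are available, and both the function name rm_port_open and the RM_HOST variable are hypothetical.

```shell
# Return 0 if a TCP connection to HOST:PORT succeeds within 3 seconds.
# Usage: rm_port_open HOST PORT
# Uses bash's /dev/tcp pseudo-device; rm_port_open is a hypothetical helper.
rm_port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# RM_HOST is an assumed variable; set it to your ResourceManager's IP or hostname.
if rm_port_open "${RM_HOST:-127.0.0.1}" 8032; then
    echo "ResourceManager port is reachable"
else
    echo "cannot reach the ResourceManager port" >&2
fi
```

A failed probe from the client side, while jps shows the ResourceManager running, suggests a network or address-binding problem rather than a stopped daemon.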