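My test environment
I am trying to deploy a 3-node Hadoop cluster
in my test environment:
- 1 namenode (master: 172.30.10.64)
- 2 datanodes (slave1: 172.30.10.72 and slave2: 172.30.10.62)
I put the configuration files with the master properties on my namenode and the files with the slave properties on my datanodes.
Files on the master
hosts: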
127.0.0.1 localhost
172.30.10.64 master
172.30.10.62 slave2
172.30.10.72 slave1
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_tmp/hdfs/namenode</value>
</property>
</configuration>
core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
</configuration>
yarn-site.xml:
<configuration>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8025</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8050</value>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
</configuration>
slaves file:
slave1
slave2
masters file:
master
Files on the slaves
I only list the files that differ from the ones on the master.
hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_tmp/hdfs/datanode</value>
</property>
</configuration>
My problem
From /usr/local/hadoop/sbin
I run:
./start-dfs.sh && ./start-yarn.sh
This is what I get:
hduser@master:/usr/local/hadoop/sbin$ ./start-dfs.sh && ./start-yarn.sh
18/03/14 10:45:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [master]
hduser@master's password:
master: starting namenode, logging to /usr/local/hadoop-2.7.5/logs/hadoop-hduser-namenode-master.out
hduser@slave2's password: hduser@slave1's password:
slave2: starting datanode, logging to /usr/local/hadoop-2.7.5/logs/hadoop-hduser-datanode-slave2.out
So I opened the datanode log file on slave2:
2018-03-14 10:46:05,494 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.30.10.64:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECOND$
2018-03-14 10:46:06,495 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.30.10.64:9000. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECOND$
2018-03-14 10:46:07,496 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.30.10.64:9000. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECOND$
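To see at a glance which address the datanode keeps failing to reach, the retry lines can be summarized. This is only a sketch: the function name `count_retries` is made up for illustration, and the log path in the usage note is an assumption based on the `.out` path printed by the start script.

```shell
# Count "Retrying connect to server" lines per target address in a Hadoop log.
# count_retries is a hypothetical helper name, not part of Hadoop.
count_retries() {
  log_file="$1"
  # Extract the "host/ip:port" token that follows the message, dropping the
  # trailing period, then count how often each target appears.
  grep -o 'Retrying connect to server: [^ ]*[0-9]' "$log_file" \
    | sed 's/^Retrying connect to server: //' \
    | sort \
    | uniq -c
}
```

It could be called as `count_retries /usr/local/hadoop-2.7.5/logs/hadoop-hduser-datanode-slave2.log` (assumed path). If every retry targets master/172.30.10.64:9000, the datanode resolves the master correctly and the problem lies in reaching the NameNode on that port.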
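What I have tried
I tried several things, but nothing has worked so far:
- ping from the master to the slaves and between the slaves works fine
- ssh from the master to the slaves and between the slaves works fine
- On my master node, I ran:
hdfs namenode -format
- Recreated the namenode and datanode folders
- Opened port 9000 on my master VM
- The firewall is disabled:
sudo ufw status
--> disabled
I am a bit lost because everything seems fine, and I don't know why I can't get my Hadoop cluster to start.
I may have found the answer:
I regenerated the ssh key on the master node and copied it to the slave nodes. It seems to work now.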
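Since ping only proves that the host is reachable, one more check that might help here is whether a TCP connection to the NameNode port actually succeeds from a slave. The following is a minimal sketch, assuming bash and the coreutils `timeout` command are available; `port_open` is a hypothetical helper name.

```shell
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# port_open is a hypothetical helper, not a Hadoop or system command.
port_open() {
  local host="$1" port="$2"
  # bash's built-in /dev/tcp pseudo-device opens a TCP socket; the
  # inner shell exits non-zero if the connection is refused or times out.
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Example, run on slave1 or slave2:
#   port_open master 9000 && echo "NameNode port reachable" || echo "cannot connect"
```

If this fails from a slave while the firewall is off, it is worth checking on the master whether the NameNode process is running at all and which address it is bound to.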
# Generate an ssh key for hduser with an empty passphrase
$ ssh-keygen -t rsa -P ""
# Authorize the key to enable passwordless ssh to the master itself
$ cat /home/hduser/.ssh/id_rsa.pub >> /home/hduser/.ssh/authorized_keys
$ chmod 600 /home/hduser/.ssh/authorized_keys
# Copy the key to slave1 and slave2 to enable passwordless ssh to them too
$ ssh-copy-id -i ~/.ssh/id_rsa.pub slave1
$ ssh-copy-id -i ~/.ssh/id_rsa.pub slave2
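To confirm that the keys took effect before running the start scripts again, each node can be tested non-interactively. This is a sketch rather than an official procedure: `check_ssh` is a hypothetical function name, and the `SSH_CMD` variable is an assumption added only so the command can be swapped out when trying the function without real hosts. `BatchMode=yes` and `ConnectTimeout` are standard OpenSSH client options.

```shell
# Report, for each host, whether ssh can log in without prompting.
# check_ssh and SSH_CMD are hypothetical names used for illustration.
check_ssh() {
  local host
  for host in "$@"; do
    # BatchMode=yes makes ssh fail instead of asking for a password.
    if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
      echo "$host: ok"
    else
      echo "$host: password still required or host unreachable"
    fi
  done
}

# Example, run as hduser on the master:
#   check_ssh master slave1 slave2
```

Every host should report ok; the password prompts in the start-dfs.sh output above suggest that was not the case before the keys were regenerated.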