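I installed Hadoop using the Hortonworks Data Platform. I have three machines running CentOS 7. One of the three runs the Ambari server and an Ambari client instance. The other two run only the Ambari client.
The installation went smoothly until the NameNode start task raised an error. The NameNode runs on the same machine as the Ambari server.
Here is the error log: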
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 401, in <module>
    NameNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 102, in start
    namenode(action="start", hdfs_binary=hdfs_binary, upgrade_type=upgrade_type, env=env)
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 146, in namenode
    create_log_dir=True
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py", line 267, in service
    Execute(daemon_cmd, not_if=process_id_exists_command, environment=hadoop_env_exports)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 158, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 121, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 238, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start namenode'' returned 1. starting namenode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-namenode-hadoop.out
The log above says:
resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start namenode'' returned 1. starting namenode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-namenode-hadoop.out
But when I open the hadoop-hdfs-namenode-hadoop.out file, its content is:
ulimit -a for user hdfs
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 30513
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 128000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 65536
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
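I set higher soft and hard limits for the hdfs user, but that did not help. I also formatted the NameNode, which did not help either. I then reinstalled the server and the clients, but it still does not work.
Thanks for any suggestions.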
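After pulling out some hair, I found a workaround, although I do not yet understand the cause. It appears to be DNS-related. Adding the host name to the hosts file (/etc/hosts), rather than relying on DNS for the current host, resolved the problem. For example: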
172.16.1.34 hostname.domain hostname
This is strange, because DNS otherwise works fine for the host. I am working behind a proxy.
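To check whether a node is in the same situation, something like the following might help. It is only a rough sketch, not part of Ambari or Hadoop: the function names parse_hosts and system_lookup are made up for illustration, and it assumes the hosts file uses the usual "IP name [alias ...]" layout with # comments.

```python
import socket


def parse_hosts(text):
    """Map every name found in hosts-file text to its IP address.

    Hypothetical helper for illustration only.
    """
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            # Keep the first entry seen for a name.
            entries.setdefault(name, ip)
    return entries


def system_lookup(hostname=None):
    """Return (name, ip) as the system resolver sees it; ip is None on failure.

    Hypothetical helper. With no argument it checks this machine's own
    fully qualified name.
    """
    name = hostname or socket.getfqdn()
    try:
        ip = socket.gethostbyname(name)
    except OSError:  # covers socket.gaierror when the name does not resolve
        ip = None
    return name, ip
```

On the NameNode machine, one could read /etc/hosts, pass its text to parse_hosts, and compare the result with system_lookup() to see whether the machine's own name is covered by a local entry or depends entirely on DNS.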