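I recently started using Hadoop. I ran into a few problems right away and have managed to solve them so far, but there is one I cannot get past. Everything seems to work fine, but when I try to launch a Hadoop job it just hangs, and I don't know how to get it working. The execution trace is as follows: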
13/05/22 20:02:43 INFO support.ClassPathXmlApplicationContext: Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@3fe9029b: startup date [Wed May 22 20:02:43 CEST 2013]; root of context hierarchy
13/05/22 20:02:43 INFO xml.XmlBeanDefinitionReader: Loading XML bean definitions from class path resource [hadoop-configuration.xml]
13/05/22 20:02:43 INFO config.PropertyPlaceholderConfigurer: Loading properties file from class path resource [hadoop.properties]
13/05/22 20:02:43 INFO support.DefaultListableBeanFactory: Pre-instantiating singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@2d062bb6: defining beans [org.springframework.beans.factory.config.PropertyPlaceholderConfigurer#0,hadoopConfiguration,foundation-job,JulianSchJobRunner]; root of factory hierarchy
13/05/22 20:02:44 INFO config.PropertiesFactoryBean: Loading properties file from class path resource [hadoop.properties]
13/05/22 20:02:44 INFO mapreduce.JobRunner: Starting job [foundation-job]
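I should also mention that I am using Cloudera's CDH4 and Spring for Hadoop.
As the last log line shows, it stops there and does not continue. Thanks in advance, everyone.
OK, I have been looking through the logs, and they show some exceptions: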
2013-05-22 21:01:36,254 WARN org.apache.hadoop.mapred.JobTracker: Writing to file hdfs://localhost.localdomain:8020/tmp/mapred/system/jobtracker.info failed!
2013-05-22 21:01:36,254 WARN org.apache.hadoop.mapred.JobTracker: FileSystem is not ready yet!
2013-05-22 21:01:36,262 WARN org.apache.hadoop.mapred.JobTracker: Failed to initialize recovery manager.
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/mapred/system/jobtracker.info could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
Could this problem be related to HDFS permissions, or do they have nothing to do with it?
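OK, solved. For anyone who finds this in the future: the problem was solved by changing these directory paths to a mount point with more space allocated (the problem was that the NN had run out of space):
dfs.name.dir=${HOME}/path-to-desired-location
instead of the base path set by default: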
dfs.name.dir=/dfs/nn
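For anyone who wants to confirm this kind of problem before moving directories, here is a minimal sketch in Python that reports free space on the mount point behind a directory. It assumes the shortage described above was one of storage on the mount point holding the directory. The function name `free_space_report` and the candidate path `~/path-to-desired-location` are placeholders of mine, not part of Hadoop or CDH4.

```python
import shutil
from pathlib import Path


def free_space_report(paths):
    """Map each path to (free_bytes, free_fraction) of the filesystem behind it."""
    report = {}
    for p in paths:
        path = Path(p).expanduser()
        # If the directory does not exist yet, walk up to the nearest existing
        # ancestor so the numbers still describe the mount point it would use.
        probe = path
        while not probe.exists() and probe != probe.parent:
            probe = probe.parent
        usage = shutil.disk_usage(probe)
        fraction = usage.free / usage.total if usage.total else 0.0
        report[str(path)] = (usage.free, fraction)
    return report


if __name__ == "__main__":
    # "/dfs/nn" is the default from the post; the second path is a placeholder.
    candidates = ["/dfs/nn", "~/path-to-desired-location"]
    for name, (free, fraction) in free_space_report(candidates).items():
        print(f"{name}: {free / 1024**3:.1f} GiB free ({fraction:.0%})")
```

If the default location shows very little free space and the candidate shows plenty, that is at least consistent with the fix described here, though it does not rule out other causes of the replication error.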
I had to do the same for the datanode and the secondarynamenode, and then all that was left was to format the namenode and restart the HDFS service:
hdfs namenode -format
Hope this helps, thanks.