I am trying out Hadoop MapReduce on Linux (an Ubuntu VM), following a linked tutorial.
I ran the wordcount example on a sample file, and the process gets killed unexpectedly. How can I debug this?
At first, on a larger dataset, I got an out-of-memory error:
15/11/28 19:24:27 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
15/11/28 19:24:27 INFO mapred.MapTask: Processing split: hdfs://localhost:54310/user/hduser/eg2/a.txt:0+1538
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000e6093000, 104861696, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 104861696 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /usr/local/hadoop/hs_err_pid7516.log
So I reduced the size of the file and tried again; this time the process was killed:
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /user/hduser/eg2/ /user/hduser/eg2/eg2-output2
......
......
15/11/28 18:55:44 INFO mapred.LocalJobRunner: Waiting for map tasks
15/11/28 18:55:44 INFO mapred.LocalJobRunner: Starting task: attempt_local1996683170_0001_m_000000_0
15/11/28 18:55:44 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
15/11/28 18:55:44 INFO mapred.MapTask: Processing split: hdfs://localhost:54310/user/hduser/eg2/a.txt:0+1538
15/11/28 18:55:45 INFO mapreduce.Job: Job job_local1996683170_0001 running in uber mode : false
15/11/28 18:55:45 INFO mapreduce.Job: map 0% reduce 0%
Killed
Why is the process getting killed?
Try:
hadoop job -list
Kill all the jobs and re-run:
hadoop job -kill <JobID>
Also try checking the job tracker's logs for errors through the web UIs (and see the kernel-log check sketched after the list below):
http://localhost:50070/ – web UI of the NameNode daemon
http://localhost:50030/ – web UI of the JobTracker daemon
http://localhost:50060/ – web UI of the TaskTracker daemon
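Note that the output above shows the job running through the LocalJobRunner, so on this Hadoop 2.6.0 setup there is no JobTracker or TaskTracker daemon to inspect; the 50030/50060 UIs belong to the old MRv1 daemons. A bare "Killed" line with no Java stack trace usually means the Linux OOM killer terminated the JVM, and the kernel log on the Ubuntu VM will confirm that. A minimal check could look like this:
# look for OOM-killer entries in the kernel ring buffer
dmesg | grep -iE 'killed process|out of memory'
# the same messages also reach the syslog on Ubuntu (may require sudo)
grep -i 'killed process' /var/log/syslog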
The size of the dataset does not matter: Hadoop simply did not have enough memory to start. I increased the memory allocated to the virtual machine and the problem was solved.
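If giving the VM more RAM is not an option, a rough sketch of two workarounds (the sizes below are only example values): check how much RAM and swap the VM actually has and add a swap file, and/or shrink the heap of the Hadoop client JVM, which is the JVM the LocalJobRunner tasks run in.
free -m                              # how much RAM and swap the VM really has
sudo fallocate -l 2G /swapfile       # create a 2 GB swap file (size is just an example)
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
# /usr/local/hadoop/etc/hadoop/hadoop-env.sh -- cap the heap used by Hadoop commands
export HADOOP_HEAPSIZE=256                  # maximum heap in MB for the Hadoop daemons and scripts
export HADOOP_CLIENT_OPTS="-Xmx256m"        # heap for client commands such as 'hadoop jar', where local-mode tasks run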