I am working on an 8-node Hadoop cluster and am trying to run a simple streaming job with the configuration below.
hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2-cdh3u0.jar -D mapred.map.max.tacker.failures=10 -D mared.map.max.attempts=8 -D mapred.skip.attempts.to.start.skipping=8 -D mapred.skip.map.max.skip.records=8 -D mapred.skip.mode.enabled=true -D mapred.max.map.failures.percent=5 -input /user/hdfs/ABC/ -output "/user/hdfs/output1/" -mapper "perl -e 'while (<>) { chomp; print; }; exit;'" -reducer "perl -e 'while (<>) { ~s/LR>/LR>\n/g; print ; }; exit;'"
I am using Cloudera's Hadoop distribution CDH3u0, which is based on Hadoop 0.20.2. The problem is that the job keeps failing. It reports this error:
java.lang.Throwable: Child Error
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:242)
Caused by: java.io.IOException: Task process exit with nonzero status of 1.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:229)
-------
java.lang.Throwable: Child Error
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:242)
Caused by: java.io.IOException: Task process exit with nonzero status of 1.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:229)
STDERR on the datanodes:
Exception in thread "main" java.io.IOException: Exception reading file:/mnt/hdfs/06/local/taskTracker/hdfs/jobcache/job_201107141446_0001/jobToken
at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:146)
at org.apache.hadoop.mapreduce.security.TokenCache.loadTokens(TokenCache.java:159)
at org.apache.hadoop.mapred.Child.main(Child.java:107)
Caused by: java.io.FileNotFoundException: File file:/mnt/hdfs/06/local/taskTracker/hdfs/jobcache/job_201107141446_0001/jobToken does not exist.
While looking for the cause I have checked the following, but the job still crashes and I cannot work out why.
1. All the temp directories are in place.
2. Available memory is far more than the job should need (it is a small job).
3. Permissions have been verified.
4. Nothing fancy has been done in the configuration, just the usual settings.
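Checks 1 and 3 above can be scripted so that they are repeated the same way on every node. The following is only a minimal sketch: the function name `check_local_dirs` is made up for illustration, and the directories to pass in are an assumption based on the `/mnt/hdfs/06/local` path in the stack trace (the real list is whatever `mapred.local.dir` is set to). It should be run as the user that runs the TaskTracker, since the writability test reflects the current user.

```shell
# Sketch: report whether each candidate mapred.local.dir entry exists and is
# writable by the current user. The caller supplies the directory names.
check_local_dirs() {
  status=0
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      echo "MISSING $dir"
      status=1
    elif [ ! -w "$dir" ]; then
      echo "NOT_WRITABLE $dir"
      status=1
    else
      echo "OK $dir"
    fi
  done
  # Non-zero exit status if any directory was missing or not writable.
  return "$status"
}

# Hypothetical usage on one node (the glob is an assumption, not taken from
# the actual cluster configuration):
# check_local_dirs /mnt/hdfs/*/local
```

A directory that passes this check can still fill up or be cleaned out while a job is running, so a clean result narrows the search rather than ruling the local directories out.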
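The strangest part is that the job sometimes runs successfully, but most of the time it fails. Any guidance or help on this would be much appreciated. I have been working on this error for the past 4 days and have not been able to figure anything out. Please help!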
Thanks and regards, Atul
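I ran into the same problem. It happens when the TaskTracker cannot allocate the configured amount of memory to the child JVM for a task.
Try running the same job again when the cluster is not busy with many other jobs, and it should go through. Alternatively, make sure speculative execution is set to true, in which case Hadoop will run the same task on another TaskTracker.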
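The two suggestions above, child JVM memory and speculative execution, map onto job properties that can be passed with -D. The sketch below assumes the Hadoop 0.20-era property names `mapred.map.tasks.speculative.execution`, `mapred.reduce.tasks.speculative.execution` and `mapred.child.java.opts`. The helper name `speculative_opts` and the 200m heap value are illustrative assumptions, not settings taken from the question.

```shell
# Sketch: print generic -D options that turn on speculative execution for
# map and reduce tasks and set the child JVM heap size.
# $1 is the heap size (hypothetical default: 200m).
speculative_opts() {
  heap="${1:-200m}"
  printf '%s ' \
    -D mapred.map.tasks.speculative.execution=true \
    -D mapred.reduce.tasks.speculative.execution=true \
    -D "mapred.child.java.opts=-Xmx${heap}"
}

# Hypothetical usage with the streaming jar from the question; the generic
# -D options go before -input, -output, -mapper and -reducer:
# hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2-cdh3u0.jar \
#   $(speculative_opts 200m) -input /user/hdfs/ABC/ -output /user/hdfs/output1/ \
#   -mapper cat -reducer cat
```

Whether a smaller heap actually helps depends on how many task slots each TaskTracker runs and how much physical memory the node has, so the value is worth adjusting per cluster.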