MapReduce program: map task times out



I'm getting a strange error. I wrote a wordCount program to count how many times each word occurs in a file.

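When I run the MR code on Hadoop, it gets stuck at "map 100%, reduce 0%". The basic pattern is that the first map task times out after 600 seconds, then times out again, and the job terminates itself.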

I checked the Job Tracker: because the map task has not completed, the reduce task cannot start, so the job is stuck.

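I have been trying to fix this for two days. In that time I deleted the original Ubuntu Cloudera virtual machine and reinstalled it, so we can be sure this is not a configuration problem.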

Any help is appreciated.

Here are the three code files.

WordCount.java

public class WordCount extends Configured implements Tool {
@Override
public int run(String[] args) throws Exception {
    Configuration conf =  super.getConf();
    Job job=new Job(conf, "Word Count Job");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(WordMapper.class);
    job.setReducerClass(WordReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(FloatWritable.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    FileInputFormat.setInputPaths(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.waitForCompletion(Boolean.TRUE);
    return 0;
}
public static void main(String[] args) {
    //Display error message in case insufficient arguments supplied
    if(args.length<2){
        System.out.println("usage: WordCount <Input-Path> <Output-Path>");
    }
    Configuration conf=new Configuration(Boolean.TRUE);
    int i;
    try {
        //Run the overridden 'run' method code
        i = ToolRunner.run(conf, new WordCount(), args);
        //Print usage stats to out
        //ToolRunner.printGenericCommandUsage(System.out);
        //exit if job cannot start
        System.exit(i);
    } catch (Exception e) {
        e.printStackTrace();
        System.exit(-1);
    }
}
}

WordMapper.java

public class WordMapper extends Mapper<LongWritable, Text, Text, FloatWritable> {
@Override
protected void map(LongWritable key, 
        Text value,
        Mapper<LongWritable, Text, Text, FloatWritable>.Context context)
        throws IOException, InterruptedException {

    if(!value.toString().trim().isEmpty()){
        StringTokenizer valTokens = new StringTokenizer(value.toString()); 
        while(valTokens.hasMoreTokens()){
            context.write(new Text(valTokens.nextToken()), new FloatWritable(Float.parseFloat("1.00")));
        }
    }   
}
}

WordReducer.java

public class WordReducer extends Reducer<Text, FloatWritable, Text, FloatWritable> {
@Override
protected void reduce(Text key, Iterable<FloatWritable> values,
        Reducer<Text, FloatWritable, Text, FloatWritable>.Context context)
        throws IOException, InterruptedException {
    Iterator<FloatWritable> valsIter = values.iterator();
    int i = 0;
    while(valsIter.hasNext()) 
        i++;
    context.write(key, new FloatWritable((float)i));
}
}

Your problem is in these lines of WordReducer:

    while(valsIter.hasNext()) 
        i++;

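valsIter.hasNext() checks whether the iterator has another element, but it does not advance the iterator. Unless you also call valsIter.next(), the check returns true forever, so the loop never terminates and the task hangs until it hits the timeout.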
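As a minimal sketch of the fix, the loop below calls next() on every pass so the iterator actually moves forward. It uses plain java.util types instead of Hadoop's FloatWritable so that it runs on its own; CountDemo, countValues and countValuesForEach are hypothetical names chosen for illustration and are not part of the original code.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the counting loop in WordReducer.reduce();
// plain Float values replace Hadoop's FloatWritable so this runs on its own.
final class CountDemo {

    // Counts the elements, consuming exactly one element per loop pass.
    static int countValues(Iterable<Float> values) {
        Iterator<Float> valsIter = values.iterator();
        int i = 0;
        while (valsIter.hasNext()) {
            valsIter.next(); // advance the iterator; without this the loop never ends
            i++;
        }
        return i;
    }

    // Equivalent for-each form, which advances the iterator implicitly.
    static int countValuesForEach(Iterable<Float> values) {
        int i = 0;
        for (Float ignored : values) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        List<Float> ones = Arrays.asList(1.0f, 1.0f, 1.0f);
        System.out.println(countValues(ones));
        System.out.println(countValuesForEach(ones));
    }
}
```

In WordReducer the corresponding change would be to add valsIter.next(); inside the while loop before i++, or to iterate with for (FloatWritable v : values) instead of using the iterator directly.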
