I'm trying to analyze the Wikipedia article page view dataset with Amazon EMR. The dataset contains page view statistics over a three-month period (January 1, 2011 to March 31, 2011), and I'm trying to find the article with the most views over that time. Here is the code I'm using:
import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

public class mostViews {

    public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {

        private final static IntWritable views = new IntWritable(1);
        private Text article = new Text();

        public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            // Split the line on spaces: words[1] is taken as the article title, words[2] as its view count.
            String line = value.toString();
            String[] words = line.split(" ");
            article.set(words[1]);
            views.set(Integer.parseInt(words[2]));
            output.collect(article, views);
        }
    }

    public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {

        public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output,
                Reporter reporter) throws IOException {
            // Sum all view counts emitted for this article.
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(mostViews.class);
        conf.setJobName("wordcount");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}
The code itself works, but when I create a cluster and add a custom JAR step, it sometimes fails and sometimes succeeds. Using the entire dataset as input causes it to fail, while using a single month (e.g. January) completes. After a run over the entire dataset, I looked at the "controller" log file and found the following, which I think is relevant:
2015-03-10T11:50:12.437Z INFO Synchronously wait child process to complete : hadoop jar /mnt/var/lib/hadoop/steps/s-22ZUAWNM...
2015-03-10T12:05:10.505Z INFO Process still running
2015-03-10T12:20:12.573Z INFO Process still running
2015-03-10T12:35:14.642Z INFO Process still running
2015-03-10T12:50:16.711Z INFO Process still running
2015-03-10T13:05:18.779Z INFO Process still running
2015-03-10T13:20:20.848Z INFO Process still running
2015-03-10T13:35:22.916Z INFO Process still running
2015-03-10T13:50:24.986Z INFO Process still running
2015-03-10T14:05:27.056Z INFO Process still running
2015-03-10T14:20:29.126Z INFO Process still running
2015-03-10T14:35:31.196Z INFO Process still running
2015-03-10T14:50:33.266Z INFO Process still running
2015-03-10T15:05:35.337Z INFO Process still running
2015-03-10T15:11:37.366Z INFO waitProcessCompletion ended with exit code 1 : hadoop jar /mnt/var/lib/hadoop/steps/s-22ZUAWNM...
2015-03-10T15:11:40.064Z INFO Step created jobs: job_1425988140328_0001
2015-03-10T15:11:50.072Z WARN Step failed as jobs it created failed. Ids:job_1425988140328_0001
Can anyone tell me what is going wrong, and what I can do to fix it? The fact that it works for one month but not for two or three makes me think the dataset may simply be too large, but I'm not sure. I'm still new to Hadoop/EMR, so let me know if I've left out any information. Any help or advice would be greatly appreciated.
Thanks in advance!
These errors usually occur when HDFS (the disks of the EMR nodes) or memory runs out of space.
First, I would read the logs that the message points you to: /mnt/var/lib/hadoop/steps/s-22ZUAWNM…
Second, I would try creating a larger EMR cluster (EC2 instances with more disk and RAM, or more core instances).
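Before (or alongside) resizing, it can also help to give the map and reduce containers more memory and to compress intermediate output so the core nodes spill less data to disk. This is only a minimal sketch, assuming an EMR release where MapReduce runs on YARN (which the job_1425988140328_0001 ID suggests); the property values below are illustrative placeholders, not tuned recommendations:

JobConf conf = new JobConf(mostViews.class);

// Assumed YARN-era MapReduce properties; tune the values to your instance types.
conf.set("mapreduce.map.memory.mb", "1536");       // container size for each map task
conf.set("mapreduce.map.java.opts", "-Xmx1228m");  // JVM heap inside the map container
conf.set("mapreduce.reduce.memory.mb", "3072");    // container size for each reduce task
conf.set("mapreduce.reduce.java.opts", "-Xmx2458m");

// Compress map output to cut down shuffle/spill disk usage on the core nodes.
conf.setCompressMapOutput(true);

If the job still fails on the full three months after this, larger instances or more core nodes remain the more robust fix.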