I am new to MapReduce and I am getting a NoSuchElementException. Please help.
The input file contains the text below:
this is a hadoop program
i am writing it for first time
Mapper class:
public class Mappers extends MapReduceBase implements Mapper<LongWritable, Text, IntWritable, IntWritable> {
    private Text word = new Text();
    private IntWritable singleWordCount = new IntWritable();
    private IntWritable one = new IntWritable(1);

    @Override
    public void map(LongWritable key, Text value, OutputCollector<IntWritable, IntWritable> output, Reporter reporter) throws IOException {
        StringTokenizer wordList = new StringTokenizer(value.toString());
        while (wordList.hasMoreTokens()) {
            int wordSize = wordList.nextToken().length();
            singleWordCount.set(wordSize);
            if (word != null && wordList != null && wordList.nextToken() != null) {
                word.set(wordList.nextToken());
                output.collect(singleWordCount, one);
            }
        }
    }
}
This is the error I am getting.
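Each pass through the loop calls wordList.nextToken() three times: once to measure the length, once in the if condition, and once in word.set(). Every call makes the StringTokenizer return the next token, while hasMoreTokens() is checked only once per pass. As soon as fewer than three tokens remain at the start of a pass, one of those calls runs out of tokens and throws the exception. On the second line of your input, for example, the third pass retrieves time and then tries to retrieve a following word that does not exist. (Assuming each line is passed to map() separately, the first line fails the same way right after program.)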
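The failure can be reproduced outside Hadoop with a plain StringTokenizer. The sketch below is a hypothetical stand-alone class (TokenizerDemo and tokensReadBeforeFailure are illustration names, not part of the original job) that mimics the three calls per pass and reports how many tokens were read before the exception.

```java
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

final class TokenizerDemo {
    // Mimics the mapper loop: three nextToken() calls per hasMoreTokens() check.
    // Returns the number of tokens consumed before NoSuchElementException,
    // or -1 if the loop finished without an exception.
    static int tokensReadBeforeFailure(String line) {
        StringTokenizer wordList = new StringTokenizer(line);
        int consumed = 0;
        try {
            while (wordList.hasMoreTokens()) {
                wordList.nextToken(); consumed++;  // call used for the length
                wordList.nextToken(); consumed++;  // call inside the if condition
                wordList.nextToken(); consumed++;  // call inside word.set(...)
            }
        } catch (NoSuchElementException e) {
            return consumed;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(tokensReadBeforeFailure("this is a hadoop program"));        // 5
        System.out.println(tokensReadBeforeFailure("i am writing it for first time"));  // 7
    }
}
```

Only a line whose token count is a multiple of three gets through this loop without an exception.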
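What you need to do is retrieve the token once per iteration and store it in a variable. Alternatively, if you really do need to retrieve two words in one iteration, always call hasMoreTokens() to check that there is in fact another word to process before actually calling nextToken().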
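A minimal sketch of the first suggestion, written as plain Java so that it runs without a Hadoop installation. WordLengths and lengths are hypothetical names used for illustration; the comments mark where the mapper's own singleWordCount and output.collect calls would go.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

final class WordLengths {
    // Reads each token exactly once and records its length.
    static List<Integer> lengths(String line) {
        StringTokenizer wordList = new StringTokenizer(line);
        List<Integer> result = new ArrayList<>();
        while (wordList.hasMoreTokens()) {
            String token = wordList.nextToken();  // the only nextToken() call per pass
            result.add(token.length());
            // In the mapper, the two lines above would become:
            //   singleWordCount.set(token.length());
            //   output.collect(singleWordCount, one);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(lengths("this is a hadoop program"));  // [4, 2, 1, 6, 7]
    }
}
```

Because hasMoreTokens() now guards every nextToken() call, the loop ends cleanly whatever the number of words on the line.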