I am writing a MapReduce application that takes input in (key, value) format and simply emits the same data as the reducer output.
Here is some sample input:
1500s 1
1960s 1
Aldus 1
In the code below, I use <<>> to specify the input format, and I set the separator to a tab in main(). When I run the code, I get this error:
java.lang.Exception: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.io.LongWritable
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.io.LongWritable
at cscie63.examples.WordDesc$KVMapper.map(WordDesc.java:1)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
I tried several things to debug this, but nothing helped.
public class WordDesc {

    public static class KVMapper
            extends Mapper<Text, LongWritable, Text, LongWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Text key, LongWritable value, Context context
                ) throws IOException, InterruptedException {
            context.write(key, value);
        }
    }

    public static class KVReducer
            extends Reducer<Text, LongWritable, Text, LongWritable> {

        private IntWritable result = new IntWritable();

        public void reduce(Text key, LongWritable value,
                Context context
                ) throws IOException, InterruptedException {
            context.write(key, value);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Split each input line into key and value at the first tab character.
        conf.set("mapreduce.input.keyvaluelinerecordreader.key.value.separator", "\t");
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length < 2) {
            System.err.println("Usage: wordcount <in> [<in>...] <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "word desc");
        job.setInputFormatClass(KeyValueTextInputFormat.class);
        job.setJarByClass(WordDesc.class);
        job.setMapperClass(KVMapper.class);
        job.setCombinerClass(KVReducer.class);
        job.setReducerClass(KVReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        for (int i = 0; i < otherArgs.length - 1; ++i) {
            FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
        }
        FileOutputFormat.setOutputPath(job,
                new Path(otherArgs[otherArgs.length - 1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
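I believe the line job.setInputFormatClass(KeyValueTextInputFormat.class); tells your program to treat the input as key-value pairs in which both the key and the value are Text. So when your mapper declares its input value as LongWritable, the framework hands it a Text and you get this ClassCastException.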
A quick fix is to read the input value as Text and then, if you want a LongWritable, parse it like this:
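public static class KVMapper
        extends Mapper<Text, Text, Text, LongWritable> {

    // Reused for every record to avoid allocating a new LongWritable each time.
    private final LongWritable val = new LongWritable();

    @Override
    public void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        // The value arrives as Text, so parse it into a long before writing.
        val.set(Long.parseLong(value.toString()));
        context.write(key, val);
    }
}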
Here is what it does: value is a Text, value.toString() gives the String representation of that text, and Long.parseLong() parses the string into a long. Finally, val.set() stores the result in the LongWritable.
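To see the split-then-parse chain in isolation, here is a minimal sketch in plain Java with no Hadoop dependency. The class name KeyValueLineSketch and the methods split and parseCount are hypothetical names made up for illustration; they are not part of the Hadoop API. The sketch assumes the behaviour described above (the line is split at the first separator) and additionally assumes that a line without a separator yields the whole line as the key and an empty value, which is why the parse step guards against NumberFormatException. The trim() call is also an addition of this sketch, not something the mapper above does.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Stdlib-only sketch; class and method names are hypothetical, not Hadoop API.
class KeyValueLineSketch {

    // Split a line at the first separator. If the separator is absent,
    // the whole line becomes the key and the value is the empty string.
    static Map.Entry<String, String> split(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            return new SimpleEntry<>(line, "");
        }
        return new SimpleEntry<>(line.substring(0, pos), line.substring(pos + 1));
    }

    // Parse the value text as a long. Returns null when the text is not a
    // valid number, so the caller can skip the record instead of failing.
    static Long parseCount(String value) {
        try {
            return Long.parseLong(value.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String[] lines = {"1500s\t1", "Aldus\t1", "no-separator"};
        for (String line : lines) {
            Map.Entry<String, String> kv = split(line, '\t');
            Long count = parseCount(kv.getValue());
            System.out.println(kv.getKey() + " -> " + (count == null ? "skipped" : count));
        }
    }
}
```

In a real mapper you could apply the same guard around Long.parseLong(value.toString()) and simply return without writing when the value is malformed, rather than letting one bad line fail the whole task.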
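By the way, I don't think you need a reducer at all. Since the job only passes records through, you can make it faster by setting the number of reduce tasks to 0 with job.setNumReduceTasks(0), which turns it into a map-only job.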