This is my input file (custs.txt):
1002|surender|23
1003|Rahja|24
This is my program:
Main:
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class ReduceSideJoinMain {
    /**
     * @param args args[0] is the input path, args[1] is the output path
     * @throws IOException
     * @throws ClassNotFoundException
     * @throws InterruptedException
     */
    public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
        Configuration conf = new Configuration();
        JobConf config = new JobConf();
        config.setQueueName("omega");
        Job job = new Job(config, "word count");
        job.setJarByClass(ReduceSideJoinMain.class);
        Path inputFilePath1 = new Path(args[0]);
        Path outputFilePath2 = new Path(args[1]);
        //MultipleInputs.addInputPath(job, inputFilePath1, TextInputFormat.class,CustMapper.class);
        //MultipleInputs.addInputPath(job, inputFilePath2, TextInputFormat.class,TxnsMapper.class);
        FileInputFormat.addInputPath(job, inputFilePath1);
        FileOutputFormat.setOutputPath(job, outputFilePath2);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        job.setMapperClass(CustMapper.class);
        //job.setReducerClass(ReduceJoinMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
Mapper:
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CustMapper extends Mapper<LongWritable, Text, Text, Text>
{
    public static IntWritable one = new IntWritable(1);

    protected void map(LongWritable key, Text value, Context context) throws java.io.IOException, java.lang.InterruptedException
    {
        String line = value.toString();
        String arr[] = line.split("|");
        context.write(new Text(arr[0]), new Text(arr[1]));
    }
}
I get the following output, which is wrong:
1
1
I expected the output to be:
1002 surender
1003 Rahja
Why doesn't it give the expected output? Is there a problem with the split method?
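Use String arr[] = line.split("\\|"); instead. String.split takes a regular expression, and an unescaped | is the regex alternation operator, so the pattern "|" matches the empty string and the line gets split between individual characters rather than at the pipe. Escaping it as "\\|" makes the pattern match a literal pipe character.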
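To see the difference outside Hadoop, here is a minimal sketch using only the JDK. The class name PipeSplitDemo and the helpers splitFields and splitFieldsQuoted are hypothetical names made up for this illustration; they are not part of the original program. Pattern.quote is shown as an alternative way to get a literal match without writing the escape by hand.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original job.
class PipeSplitDemo {

    // Splits a pipe-delimited record on the literal '|' character.
    static String[] splitFields(String line) {
        return line.split("\\|");
    }

    // Same result, but Pattern.quote does the escaping for us.
    static String[] splitFieldsQuoted(String line) {
        return line.split(Pattern.quote("|"));
    }

    public static void main(String[] args) {
        String line = "1002|surender|23";

        // Unescaped: "|" is regex alternation between two empty patterns,
        // so this does not produce the three fields.
        System.out.println(Arrays.toString(line.split("|")));

        // Escaped or quoted: both print [1002, surender, 23]
        System.out.println(Arrays.toString(splitFields(line)));
        System.out.println(Arrays.toString(splitFieldsQuoted(line)));
    }
}
```

In the mapper, the same change means arr[0] and arr[1] would hold the customer id and name, which is what the expected output shows.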