Input data file:
name, month, category, spend
hitesh,1,A1,10020
hitesh,2,A2,10300
hitesh,3,A3,10400
hitesh,4,A4,11000
hitesh,5,A1,21000
hitesh,6,A2,5000
hitesh,7,A3,9000
hitesh,8,A4,1000
hitesh,9,A1,111000
hitesh,10,A2,12000
hitesh,11,A3,71000
hitesh,12,A4,177000
kuwar,1,A1,10700
kuwar,2,A2,17000
kuwar,3,A3,10070
kuwar,4,A4,10007
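I need to calculate, per person, the total spend and the number of unique categories. (The output should look like: name, total spend, total number of unique categories.)
What I have tried so far (my code):
Person-wise total spend: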
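import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Emp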
{
public static class MyMap extends Mapper<LongWritable,Text,Text,IntWritable>
{
public void map(LongWritable k,Text v, Context con)
throws IOException, InterruptedException
{
String line = v.toString();
String[] w=line.split(",");
String person=w[0];
int exp=Integer.parseInt(w[3]);
con.write(new Text(person), new IntWritable(exp));
}
}
public static class MyRed extends Reducer<Text,IntWritable,Text,IntWritable>
{
public void reduce(Text k, Iterable<IntWritable> vlist, Context con)
throws IOException , InterruptedException
{
int tot =0;
for(IntWritable v:vlist)
tot+=v.get();
con.write(k,new IntWritable(tot));
}
}
public static void main(String[] args) throws Exception
{
Configuration c = new Configuration();
Job j= new Job(c,"person-wise");
j.setJarByClass(Emp.class);
j.setMapperClass(MyMap.class);
j.setReducerClass(MyRed.class);
j.setOutputKeyClass(Text.class);
j.setOutputValueClass(IntWritable.class);
Path p1 = new Path(args[0]);
Path p2 = new Path(args[1]);
FileInputFormat.addInputPath(j,p1);
FileOutputFormat.setOutputPath(j,p2);
System.exit(j.waitForCompletion(true) ? 0:1);
}
}
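How can I get the total number of unique categories in this program, and how do I make the output look like name, total spend, total count of unique categories?
Thanks
I have modified your code. Hope this is useful.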
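import java.io.IOException;
import java.util.HashSet;
import java.util.Set;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Emp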
{
public static class MyMap extends Mapper<LongWritable,Text,Text,Text>
{
public void map(LongWritable k,Text v, Context con)
throws IOException, InterruptedException
{
String line = v.toString();
String[] w=line.split(",");
String person=w[0];
int exp=Integer.parseInt(w[3]);
con.write(new Text(person), new Text(line));
}
}
public static class MyRed extends Reducer<Text,Text,Text,Text>
{
public void reduce(Text k, Iterable<Text> vlist, Context con)
throws IOException , InterruptedException
{
int tot = 0;
Set<String> cat = new HashSet<String>();
for(Text v:vlist){
String data = v.toString();
String[] dataArray = data.split(",");
tot += Integer.parseInt(dataArray[3]); // accumulate the total spend
cat.add(dataArray[2]); // collect categories; the set keeps only the unique ones
}
// write the name as key and "total spend,unique category count" as value
con.write(k, new Text(tot + "," + cat.size()));
}
}
public static void main(String[] args) throws Exception
{
Configuration c = new Configuration();
Job j= new Job(c,"person-wise");
j.setJarByClass(Emp.class);
j.setMapperClass(MyMap.class);
j.setReducerClass(MyRed.class);
j.setOutputKeyClass(Text.class);
j.setOutputValueClass(Text.class); // the mapper and reducer both emit Text values
Path p1 = new Path(args[0]);
Path p2 = new Path(args[1]);
FileInputFormat.addInputPath(j,p1);
FileOutputFormat.setOutputPath(j,p2);
System.exit(j.waitForCompletion(true) ? 0:1);
}
}
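You can create a custom Writable that pairs an IntWritable and a Text, one for the spend and the other for the category, and use it as the map value. Otherwise, pass the spend and category in a single string with a separator between them and split it on the reducer side.
In the reducer, use the same for loop that sums the spend to put every category into a Java Set, then call Set.size() to get the number of unique categories and emit it in context.write. When writing the reduce-side value you can use the same technique as for the map value.
On the mapper side, use a StringBuilder to append the category and the spend, and pass the result as the map value.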
StringBuilder sb = new StringBuilder();
String sep=":";
sb.append(w[2]);
sb.append(sep);
sb.append(w[3]);
con.write(new Text(person), new Text(sb.toString()));
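On the reduce side, split each value on the same separator used on the map side, sum the spend, and take the size of the set built from the categories. The code is untested, so add a declaration or cast if I have missed one below.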
public void reduce(Text k, Iterable<Text> vlist, Context con)
throws IOException , InterruptedException
{
int tot =0;
String myval;
String[] split_val;
Set<String> myset=new HashSet<String>();
int uniq_category;
StringBuilder sb1 = new StringBuilder();
for(Text v:vlist)
{
myval=v.toString();
split_val=myval.split(":");
myset.add(split_val[0]);
tot+=Integer.parseInt(split_val[1]);
}
uniq_category=myset.size();
String sep=" ";
sb1.append(uniq_category);
sb1.append(sep);
sb1.append(tot);
con.write(k,new Text(sb1.toString()));
}
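The reduce-side logic can be sanity-checked without a cluster. The sketch below is plain Java, not Hadoop code: the class name `ReduceLogicCheck` and the methods `summarize` and `kuwarValues` are hypothetical, and the list stands in for the `Iterable<Text>` values that a single key would receive, in the same `category:spend` format.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-alone check of the reduce logic; not part of the Hadoop job.
class ReduceLogicCheck {
    // Takes one person's "category:spend" values and returns
    // "uniqueCategories total", the same layout the reducer writes.
    static String summarize(List<String> values) {
        int total = 0;
        Set<String> categories = new HashSet<>();
        for (String value : values) {
            String[] parts = value.split(":");
            categories.add(parts[0]);            // the set drops duplicate categories
            total += Integer.parseInt(parts[1]); // running total of the spend
        }
        return categories.size() + " " + total;
    }

    // kuwar's four rows from the input file, as the mapper would emit them.
    static List<String> kuwarValues() {
        return Arrays.asList("A1:10700", "A2:17000", "A3:10070", "A4:10007");
    }
}
```

For kuwar's rows, `ReduceLogicCheck.summarize(ReduceLogicCheck.kuwarValues())` returns `4 47777`: four distinct categories and a total spend of 47777.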
Alternatively, build a pair from an IntWritable and a Text and use it for the map and reduce values, as described above.
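A minimal sketch of such a pair follows. To keep it runnable without Hadoop on the classpath, it does not declare `implements org.apache.hadoop.io.Writable`, but it has the two methods that interface requires, `write(DataOutput)` and `readFields(DataInput)`; in a real job you would add the `implements` clause. The class name `CategorySpend`, its fields, and the `roundTrip` helper are hypothetical, and plain `String`/`int` fields written with `writeUTF`/`writeInt` stand in for Text and IntWritable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical value type carrying one row's category and spend.
// In a Hadoop job this class would declare: implements org.apache.hadoop.io.Writable
class CategorySpend {
    String category = "";
    int spend;

    // Hadoop creates value objects reflectively, so a no-argument constructor is needed.
    CategorySpend() {}

    CategorySpend(String category, int spend) {
        this.category = category;
        this.spend = spend;
    }

    // Serialize the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(category);
        out.writeInt(spend);
    }

    // Read the fields back in the same order they were written.
    public void readFields(DataInput in) throws IOException {
        category = in.readUTF();
        spend = in.readInt();
    }

    // Helper for checking the serialization round trip outside Hadoop.
    static CategorySpend roundTrip(CategorySpend original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            CategorySpend copy = new CategorySpend();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a type like this as the map output value, the mapper would be declared as `Mapper<LongWritable,Text,Text,CategorySpend>`, the driver would also call `j.setMapOutputValueClass(CategorySpend.class)`, and the reducer would read `category` and `spend` directly instead of splitting a string.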