Below is the input data:
c1 p1 q1 d1
c2 p2 q2 d2
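Required output: if the customer has bought p1, the flag should be 1; otherwise the flag should be 0. There are millions of customers and millions of products. The required output is shown below. Any help would be appreciated.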
c1 p1 q1 d1 1
c1 p2 q1 d1 0
c2 p2 q2 d2 1
c2 p1 q2 d2 0
You can implement this with map-side logic. Sample code for reference:
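// Needs java.io.IOException and org.apache.hadoop.io.{LongWritable, NullWritable, Text}.
public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    String line = value.toString();
    // split() takes a regex, so escape the delimiter if it is a special character such as "|".
    String[] tokens = line.split("<your delim>");
    // Compare strings with equals(); == only checks whether both are the same object.
    if ("p1".equals(tokens[1])) {
        line = line + "<your delim>" + "1";
    } else if ("p2".equals(tokens[1])) {
        line = line + "<your delim>" + "0";
    }
    // Emit the flagged line as the key, with NullWritable as an empty value.
    context.write(new Text(line), NullWritable.get());
}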
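The per-line flagging can also be tried outside Hadoop before wiring it into a job. The following is only a minimal sketch: the class name FlagDemo, the method flagLine, and the single-space delimiter are assumptions made for illustration and are not part of the original answer. Unlike the mapper above, it appends 0 for every product other than the target, not only for p2.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; FlagDemo and flagLine are illustrative names only.
class FlagDemo {
    // Appends "1" if the second field equals targetProduct, otherwise "0".
    static String flagLine(String line, String delim, String targetProduct) {
        // Pattern.quote makes split() treat the delimiter literally, not as a regex.
        String[] tokens = line.split(Pattern.quote(delim));
        boolean bought = tokens.length > 1 && targetProduct.equals(tokens[1]);
        return line + delim + (bought ? "1" : "0");
    }

    public static void main(String[] args) {
        // Assumes a single space as the delimiter, as in the sample data.
        System.out.println(flagLine("c1 p1 q1 d1", " ", "p1")); // prints "c1 p1 q1 d1 1"
        System.out.println(flagLine("c2 p2 q2 d2", " ", "p1")); // prints "c2 p2 q2 d2 0"
    }
}
```

Note that this only flags rows that already exist in the input. It does not generate extra rows for products a customer did not buy.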