How to maintain the order of MapWritable values in a reducer

My Mapper implementation:

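import java.io.IOException;

import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SimpleMapper extends Mapper<Text, Text, Text, MapWritable> {
    @Override
    protected void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        // LinkedMapWritable is not a Hadoop core class; its import is left out
        // because the question does not say which library it comes from.
        MapWritable writable = new LinkedMapWritable();
        // MapWritable.put expects Writable keys and values, so wrap the strings in Text.
        writable.put(new Text("unique_key"), new Text("one"));
        writable.put(new Text("another_key"), new Text("two"));
        context.write(new Text("key"), writable);
    }
}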

And the Reducer implementation is:

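import java.io.IOException;

import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SimpleReducer extends Reducer<Text, MapWritable, NullWritable, Text> {
    @Override
    protected void reduce(Text key, Iterable<MapWritable> values, Context context)
            throws IOException, InterruptedException {
        // The map writables have to be ordered based on the "unique_key" inserted into it
    }
}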

Do I have to use secondary sort? Is there any other way?

The MapWritables (the values) always reach the reducer in an unpredictable order. That order can differ from run to run, and you have no control over it.

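What the Map/Reduce paradigm does guarantee is that the keys presented to a reducer are sorted, and that all values belonging to one key are sent to a single reducer.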

So you can definitely use secondary sort together with a custom partitioner for your use case.
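
To make the idea concrete, here is a minimal plain-Java simulation of what secondary sort does, with no Hadoop classes involved. The names CompositeKey, SORT_ORDER, partition and shuffle are hypothetical and exist only for this illustration. The sketch assumes the value of "unique_key" is copied into the map output key, so the framework's key sorting can order the records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of secondary sort; this is not Hadoop API code.
class SecondarySortSketch {

    // Hypothetical composite key: the natural key plus the value of "unique_key".
    static final class CompositeKey {
        final String naturalKey;
        final String uniqueKey;

        CompositeKey(String naturalKey, String uniqueKey) {
            this.naturalKey = naturalKey;
            this.uniqueKey = uniqueKey;
        }
    }

    // Sort comparator role: order by natural key first, then by "unique_key".
    static final Comparator<CompositeKey> SORT_ORDER =
            Comparator.comparing((CompositeKey k) -> k.naturalKey)
                      .thenComparing(k -> k.uniqueKey);

    // Partitioner role: hash only the natural key, so every record for one
    // natural key is routed to the same reducer.
    static int partition(CompositeKey key, int numReducers) {
        return (key.naturalKey.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Simulates the shuffle for one reducer: sort by the full composite key,
    // then group by natural key only (the grouping comparator role).
    static Map<String, List<String>> shuffle(List<CompositeKey> emitted) {
        List<CompositeKey> sorted = new ArrayList<>(emitted);
        sorted.sort(SORT_ORDER);
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (CompositeKey k : sorted) {
            groups.computeIfAbsent(k.naturalKey, n -> new ArrayList<>()).add(k.uniqueKey);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<CompositeKey> emitted = Arrays.asList(
                new CompositeKey("key", "two"),
                new CompositeKey("key", "one"),
                new CompositeKey("other", "b"),
                new CompositeKey("key", "three"));
        // prints {key=[one, three, two], other=[b]}
        System.out.println(shuffle(emitted));
    }
}
```

In a real job these three roles would typically map to a WritableComparable composite key, a Partitioner subclass registered with job.setPartitionerClass, and a grouping comparator registered with job.setGroupingComparatorClass. The reducer would then see the values for each natural key already ordered by "unique_key".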
