My Mapper implementation:
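import java.io.IOException;
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SimpleMapper extends Mapper<Text, Text, Text, MapWritable> {
    @Override
    protected void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        MapWritable writable = new LinkedMapWritable();
        // MapWritable stores Writable keys and values, so the strings are wrapped in Text
        writable.put(new Text("unique_key"), new Text("one"));
        writable.put(new Text("another_key"), new Text("two"));
        context.write(new Text("key"), writable);
    }
}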
The Reducer implementation is:
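import java.io.IOException;
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SimpleReducer extends Reducer<Text, MapWritable, NullWritable, Text> {
    @Override
    protected void reduce(Text key, Iterable<MapWritable> values, Context context)
            throws IOException, InterruptedException {
        // The MapWritable values have to be ordered by the "unique_key" entry inserted into them
    }
}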
Do I have to use secondary sort? Is there any other way?
1 Answer
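The MapWritables (the values) always reach the reducer in an unpredictable order. That order can change from run to run, and you have no control over it.
What the map/reduce paradigm does guarantee is that the keys presented to a reducer are in sorted order, and that all values belonging to a single key go to a single reducer.
So you can definitely use secondary sort together with a custom partitioner for your use case.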
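To make the idea concrete, here is a minimal, Hadoop-free sketch of what secondary sort does during the shuffle. It is a simulation under stated assumptions, not the asker's job: the class `SecondarySortSketch`, the `Emitted` record type, and the helpers `partition`, `shuffle` and `value` are all hypothetical names, and a plain `Map<String, String>` stands in for the `MapWritable`. The point is the division of labour: the sort order uses the composite key (natural key plus the `unique_key` entry), while partitioning and grouping use the natural key only.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hadoop-free simulation of secondary sort. Every name in this class is hypothetical.
class SecondarySortSketch {

    // One emitted map output: a composite key (naturalKey, uniqueKey) plus the value.
    static final class Emitted {
        final String naturalKey;          // the key the reducer groups on
        final String uniqueKey;           // copy of value.get("unique_key"), used only for ordering
        final Map<String, String> value;  // stands in for the MapWritable

        Emitted(String naturalKey, Map<String, String> value) {
            this.naturalKey = naturalKey;
            this.uniqueKey = value.get("unique_key");
            this.value = value;
        }
    }

    // Sort comparator: natural key first, then the "unique_key" entry.
    static final Comparator<Emitted> SORT_ORDER =
            Comparator.comparing((Emitted e) -> e.naturalKey)
                      .thenComparing((Emitted e) -> e.uniqueKey);

    // Partitioner: hash only the natural key, so all records of one key reach the same reducer.
    static int partition(Emitted e, int numReducers) {
        return (e.naturalKey.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Simulated shuffle for one reducer: sort on the composite key, group on the natural key.
    static Map<String, List<Map<String, String>>> shuffle(List<Emitted> emitted) {
        List<Emitted> sorted = new ArrayList<>(emitted);
        sorted.sort(SORT_ORDER);
        Map<String, List<Map<String, String>>> groups = new LinkedHashMap<>();
        for (Emitted e : sorted) {
            groups.computeIfAbsent(e.naturalKey, k -> new ArrayList<>()).add(e.value);
        }
        return groups;
    }

    // Small helper that builds a value with the two entries used in the question.
    static Map<String, String> value(String uniqueKey, String anotherKey) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("unique_key", uniqueKey);
        m.put("another_key", anotherKey);
        return m;
    }

    public static void main(String[] args) {
        List<Emitted> emitted = List.of(
                new Emitted("key", value("three", "c")),
                new Emitted("key", value("one", "a")),
                new Emitted("key", value("two", "b")));
        // Each simulated reduce call sees its values ordered by "unique_key".
        shuffle(emitted).forEach((k, values) -> System.out.println(k + " -> " + values));
    }
}
```

In an actual Hadoop job the same three roles would typically be filled by a composite key class implementing `WritableComparable`, a `Partitioner` registered with `job.setPartitionerClass(...)`, and a grouping comparator registered with `job.setGroupingComparatorClass(...)`; the mapper would then emit the composite key instead of a bare `Text`.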