How to maintain the order of MapWritables in the reducer?

6kkfgxo0 · posted 2021-06-03 in Hadoop

My mapper implementation:

import java.io.IOException;

import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.elasticsearch.hadoop.mr.LinkedMapWritable; // insertion-ordered MapWritable from elasticsearch-hadoop

public class SimpleMapper extends Mapper<Text, Text, Text, MapWritable> {

    @Override
    protected void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        MapWritable writable = new LinkedMapWritable();
        // MapWritable keys and values must be Writable, so wrap the strings in Text
        writable.put(new Text("unique_key"), new Text("one"));
        writable.put(new Text("another_key"), new Text("two"));
        context.write(new Text("key"), writable);
    }
}
And my reducer implementation:

import java.io.IOException;

import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SimpleReducer extends Reducer<Text, MapWritable, NullWritable, Text> {

    @Override
    protected void reduce(Text key, Iterable<MapWritable> values, Context context)
            throws IOException, InterruptedException {
        // The MapWritable values have to be ordered based on the "unique_key" inserted into them
    }
}
Do I have to use a secondary sort? Is there another way?


hvvq6cgz1#

The MapWritable values that reach the reducer always arrive in an unpredictable order; that order can change from run to run and you cannot control it.
What the MapReduce paradigm does guarantee is that the keys presented to a reducer arrive in sorted order, and that all values belonging to a single key go to a single reducer.
So yes, for your use case you can absolutely use a secondary sort with a custom partitioner, as sketched below.
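A minimal sketch of that approach (all class names here are hypothetical, not from your code): the mapper emits a composite key that carries both the natural key and the "unique_key" value, a custom partitioner and grouping comparator keep every record for one natural key in one reduce() call, and the framework's sort phase then hands you the values ordered by "unique_key".

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;
import org.apache.hadoop.mapreduce.Partitioner;

// Composite key: the natural key decides which reduce() call a record belongs to,
// the secondary key (the "unique_key" value) decides the sort order within it.
// Each class below would normally live in its own source file.
public class CompositeKey implements WritableComparable<CompositeKey> {

    private final Text naturalKey = new Text();
    private final Text secondaryKey = new Text();

    public CompositeKey() {}

    public CompositeKey(String natural, String secondary) {
        naturalKey.set(natural);
        secondaryKey.set(secondary);
    }

    public Text getNaturalKey() {
        return naturalKey;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        naturalKey.write(out);
        secondaryKey.write(out);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        naturalKey.readFields(in);
        secondaryKey.readFields(in);
    }

    @Override
    public int compareTo(CompositeKey other) {
        // Sort by natural key first, then by the secondary key.
        int cmp = naturalKey.compareTo(other.naturalKey);
        return cmp != 0 ? cmp : secondaryKey.compareTo(other.secondaryKey);
    }
}

// Partition on the natural key only, so every record with the same natural key
// lands on the same reducer regardless of its secondary key.
class NaturalKeyPartitioner extends Partitioner<CompositeKey, MapWritable> {
    @Override
    public int getPartition(CompositeKey key, MapWritable value, int numPartitions) {
        return (key.getNaturalKey().hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}

// Group on the natural key only, so one reduce() call receives all values for
// that key, already sorted by the secondary key.
class NaturalKeyGroupingComparator extends WritableComparator {
    protected NaturalKeyGroupingComparator() {
        super(CompositeKey.class, true); // true: deserialize keys before comparing
    }

    @Override
    public int compare(WritableComparable a, WritableComparable b) {
        return ((CompositeKey) a).getNaturalKey()
                .compareTo(((CompositeKey) b).getNaturalKey());
    }
}

You would wire these in with job.setPartitionerClass(NaturalKeyPartitioner.class) and job.setGroupingComparatorClass(NaturalKeyGroupingComparator.class), and change the mapper's output key type from Text to the composite key accordingly.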
