How to define an array in a Hadoop partitioner

Posted by 9rnv2umw on 2021-06-02 in Hadoop


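I am new to Hadoop and MapReduce programming and don't know how to do this. I want to define an int array in a Hadoop partitioner, fill it in the main function, and use its contents in the partitioner. I tried an array of IntWritable, but it didn't work. I also tried IntArrayWritable, and that didn't work either. I would be glad if someone could help. Thank you very much.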

public static IntWritable[] h = new IntWritable[1];

public static void main(String[] args) throws Exception {
    h[0] = new IntWritable(1);
}

public static class CaderPartitioner extends Partitioner <Text,IntWritable> {

    @Override
    public int getPartition(Text key, IntWritable value, int numReduceTasks) {
        return h[0].get();
    }
}

hadoop mapreduce Arrays

Source: https://stackoverflow.com/questions/39578797/how-to-define-an-array-in-hadoop-partitioner

2 answers


9nvpjoqh1#

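Below is a refactored version of the partitioner. The main changes are:
Removed main(), which is not needed; initialization should be done in the constructor.
Removed static from the class and from the member variable.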

public class CaderPartitioner extends Partitioner<Text,IntWritable> {

    private IntWritable[] h;

    public CaderPartitioner() {
        h = new IntWritable[1];
        h[0] = new IntWritable(1);
    }

    @Override
    public int getPartition(Text key, IntWritable value, int numReduceTasks) {
        return h[0].get();
    }
}

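Note: h does not need to be a Writable unless there is other logic that was not included in the question.
It is not clear what the h[] array is for. Are you going to configure it? In that case the partitioner will probably need to implement Configurable, so that a Configuration object can be used to set up the array in some way.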
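To illustrate the Configurable idea for an array, here is a minimal sketch under stated assumptions. A configuration property holds a single string, so one option is to encode the int array as a comma-separated value in the driver (stored with `conf.set(...)` under a property name of your choosing) and decode it once in the partitioner's `setConf`. The class `IntArrayProperty` and its methods `encode` and `decode` are hypothetical names, not part of Hadoop; the code is plain Java so it can be tried without a cluster.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

/** Hypothetical helper: round-trips an int[] through one comma-separated string. */
public class IntArrayProperty {

    /** Driver side: join the values with commas so they fit in a single property value. */
    public static String encode(int[] values) {
        return Arrays.stream(values)
                     .mapToObj(Integer::toString)
                     .collect(Collectors.joining(","));
    }

    /** Task side: parse the string back; a null or blank value yields an empty array. */
    public static int[] decode(String property) {
        if (property == null || property.trim().isEmpty()) {
            return new int[0];
        }
        return Arrays.stream(property.split(","))
                     .map(String::trim)
                     .mapToInt(Integer::parseInt)
                     .toArray();
    }

    public static void main(String[] args) {
        // In a real job the encoded string would travel inside the job configuration.
        String property = encode(new int[] {1, 0, 2});
        int[] h = decode(property);
        System.out.println(property + " -> " + Arrays.toString(h));
    }
}
```

Decoding once and keeping the result in a field avoids re-parsing the string on every getPartition call.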

2021-06-02

rsaldnfx2#

If the number of values is limited, you can do it as follows. Set the values on the Configuration object in the main method, like this.

Configuration conf = new Configuration();
conf.setInt("key1", value1);
conf.setInt("key2", value2);

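Then implement the Configurable interface in the partitioner class to receive the Configuration object, and read the key/value pairs from it inside the partitioner.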

public class testPartitioner extends Partitioner<Text, IntWritable> implements Configurable {

    private Configuration config = null;

    @Override
    public int getPartition(Text arg0, IntWritable arg1, int arg2) {
        // Get your values based on the keys set in the driver.
        // getInt takes a default value; 0 is returned here if "key1" is unset.
        int value = getConf().getInt("key1", 0);
        // do stuff on value

        return 0;
    }

    @Override
    public Configuration getConf() {
        return this.config;
    }

    @Override
    public void setConf(Configuration configuration) {
        this.config = configuration;
    }
}

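Supporting link: https://cornercases.wordpress.com/2011/05/06/an-example-configurable-partitioner/
Note: if you have a large number of values in a file, it is better to find a way to get the cached file from the job object in the partitioner.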
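For the file-based variant mentioned in the note, the loading step could look roughly like the sketch below. It assumes the file is already available on the task's local disk (for example through the distributed cache) and contains one integer per line; how the partitioner learns the file's path is left open here. `IntFileLoader` and `load` are hypothetical names, and the code is plain Java so it runs without Hadoop.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: loads one integer per line from a local file. */
public class IntFileLoader {

    public static int[] load(Path file) throws IOException {
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
        return lines.stream()
                    .map(String::trim)
                    .filter(line -> !line.isEmpty())   // skip blank lines
                    .mapToInt(Integer::parseInt)
                    .toArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a localized cache file: write a small temp file and read it back.
        Path file = Files.createTempFile("partition-values", ".txt");
        Files.write(file, Arrays.asList("3", "1", "4"));
        System.out.println(Arrays.toString(load(file)));
        Files.delete(file);
    }
}
```

As with the configuration approach, the file would be read once (for instance in setConf) and the array kept in a field, rather than reading it on every getPartition call.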

2021-06-02
