Multiple for-each loops in a Hadoop reducer

Posted by qacovj5a on 2021-06-03 in Hadoop


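I am having trouble using multiple for-each loops in a Hadoop reducer. Is this possible at all?
This is the code I have written for the reducer class so far: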

public class R_PreprocessAllSMS extends Reducer<Text, Text, Text, Text>{
private final static Text KEY = new Text();
private final static Text VALUE = new Text();

    @Override
    public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        int sum = 0;
        for (Text value : values) {
            String[] splitString = value.toString().split("\t");
            sum += Integer.parseInt(splitString[1]);
        }
        if (sum > 100) {
            for (Text value : values) {
                String[] splitString = value.toString().split("\t");
                System.out.println(key.toString() + splitString[0] + " " + splitString[1]);
                KEY.set(key);
                VALUE.set(splitString[0] + "\t" + splitString[1]);
                context.write(KEY, VALUE);
            }
        }
    }
}

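I would like to be able to iterate over the given values a second time and emit the ones we need. If that is not possible, what approach would you recommend for this in Hadoop? Thanks.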

Java hadoop mapreduce

Source: https://stackoverflow.com/questions/21309177/multiple-for-each-loops-in-hadoop-reducer

2 Answers

eqqqjvef #1

There is no need to loop twice. You can delay writing the values until you know the sum is high enough, for example:

    int sum = 0;
    // Lines held back until the running sum passes the threshold.
    List<String> list = new ArrayList<>();
    KEY.set(key);

    for (Text value : values) {
        String[] splitString = value.toString().split("\t");
        String line = splitString[0] + "\t" + splitString[1];

        sum += Integer.parseInt(splitString[1]);

        if (sum <= 100) {
            // Matches the question's "sum > 100" condition: not passed yet, so buffer.
            list.add(line);
        } else {
            // Threshold passed: flush the buffered lines once, then write directly.
            if (!list.isEmpty()) {
                for (String val : list) {
                    VALUE.set(val);
                    context.write(KEY, VALUE);
                }
                list.clear();
            }
            VALUE.set(line);
            context.write(KEY, VALUE);
        }
    }
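For context, the second loop in the question most likely emits nothing because the values Iterable handed to reduce can only be traversed once, and Hadoop reuses the same Text instance for each value, so lines have to be copied (for example via toString()) before they are stored. A simpler variant of the buffering idea is to copy every line during the single pass and decide afterwards whether to emit. The sketch below models this in plain Java with no Hadoop classes: BufferedReduceSketch, singlePass and THRESHOLD are hypothetical names, and returning a list stands in for calling context.write.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Plain-Java model of the reducer logic; no Hadoop classes are used (assumption). */
final class BufferedReduceSketch {

    /** Threshold taken from the question's "sum > 100" check. */
    static final int THRESHOLD = 100;

    /**
     * Reads the values exactly once, copies each line into a buffer, and
     * returns the buffered lines only if the column total exceeds THRESHOLD.
     */
    static List<String> reduce(Iterable<String> values) {
        int sum = 0;
        List<String> buffer = new ArrayList<>();
        for (String value : values) {
            String[] parts = value.split("\t");
            sum += Integer.parseInt(parts[1]);
            buffer.add(parts[0] + "\t" + parts[1]);
        }
        return sum > THRESHOLD ? buffer : new ArrayList<>();
    }

    /** Wraps a list so it can be traversed only once, imitating reducer values. */
    static Iterable<String> singlePass(List<String> lines) {
        Iterator<String> shared = lines.iterator();
        return () -> shared; // every call hands back the same, possibly exhausted, iterator
    }

    public static void main(String[] args) {
        Iterable<String> values = singlePass(Arrays.asList("a\t60", "b\t50"));
        for (String line : reduce(values)) {
            System.out.println(line);
        }
    }
}
```

The trade-off is memory: this variant keeps every value of a key in the buffer, whereas the answer's version stops buffering as soon as the threshold is passed.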


m3eecexj #2

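Maybe use two pairs of mappers and reducers? You can run them one after the other. For example, create two jobs in a single main method; the second one takes the result of the first as its input.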

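// One configuration per job, both created in the same driver main method.
Configuration conf1 = new Configuration();
Configuration conf2 = new Configuration();

// Job.getInstance replaces the deprecated new Job(conf) constructor.
Job job1 = Job.getInstance(conf1);

// job2 should read the output directory that job1 writes.
Job job2 = Job.getInstance(conf2);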

Alternatively, take a look at ChainReducer: http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/lib/chainreducer.html
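The data flow of the two-job idea can be sketched without a cluster: a first pass computes the total per key, and a second pass re-reads the same records and keeps those whose key total exceeds the threshold. This is only a plain-Java model, not Hadoop API. The class TwoPassSketch, its methods, and the "key<TAB>name<TAB>count" record layout are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of two chained passes; maps and lists stand in for the jobs. */
final class TwoPassSketch {

    /** Stage 1: sum the count column per key. Assumed layout: key, name, count. */
    static Map<String, Integer> totals(List<String> records) {
        Map<String, Integer> sums = new HashMap<>();
        for (String record : records) {
            String[] parts = record.split("\t");
            sums.merge(parts[0], Integer.parseInt(parts[2]), Integer::sum);
        }
        return sums;
    }

    /** Stage 2: re-read the records and keep those whose key total exceeds the threshold. */
    static List<String> filter(List<String> records, Map<String, Integer> sums, int threshold) {
        List<String> kept = new ArrayList<>();
        for (String record : records) {
            String key = record.split("\t")[0];
            if (sums.getOrDefault(key, 0) > threshold) {
                kept.add(record);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("k1\ta\t60", "k1\tb\t50", "k2\tc\t10");
        for (String record : filter(records, totals(records), 100)) {
            System.out.println(record);
        }
    }
}
```

Compared with buffering inside a single reducer, this reads the input twice but never has to hold all values of one key in memory.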
