Iterating twice over the values in MapReduce

w8ntj3qf  posted on 2021-06-01  in Hadoop

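I wrote a job in which both my key and my value are composite. I need to iterate over the values twice, so I tried caching them, but the same values are repeated. Please help.
Below is my reducer class.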

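// Imports needed at the top of the enclosing job class file.
import java.io.IOException;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public static class Reducerclass extends Reducer<Text, Text, Text, Text> {
    DateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd hh:mm:ss a");

    private MultipleOutputs<Text, Text> multipleOutputs;

    @Override
    public void setup(Context context) {
        multipleOutputs = new MultipleOutputs<Text, Text>(context);
    }

    @Override
    public void reduce(Text rkey, Iterable<Text> rvalue, Context context)
            throws IOException, InterruptedException {
        // Cache of the values for this key, so they can be read a second time.
        List<Text> cached = new ArrayList<Text>();

        // First pass: Hadoop reuses the Text instance, so store a copy of each value.
        for (Text writable : rvalue) {
            System.out.println("first iteration: " + writable);
            cached.add(new Text(writable));
            context.write(new Text(rkey + ", "), new Text(writable + "--> first iteration"));
        }

        // Second pass: replay the cached copies.
        int size = cached.size();
        for (int i = 0; i < size; ++i) {
            System.out.println("second iteration: " + cached.get(i));
            context.write(new Text(rkey + ", "),
                    new Text(cached.get(i) + "--> Second iteration--->" + "Array Size -->" + size));
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // Close MultipleOutputs so any named outputs are flushed.
        multipleOutputs.close();
    }
}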

Input file:

1509075052824 13.0619798 80.1468367
1509075112825 13.07537311 80.19612851
1509073985114 13.0507832 80.25069245
1509075072824 12.91690859 80.06168244

Expected output:

first iteration: 1509075052824 13.0619798 80.1468367
first iteration: 1509075112825 13.07537311 80.19612851
first iteration: 1509073985114 13.0507832 80.25069245
first iteration: 1509075072824 12.91690859 80.06168244

second iteration: 1509075052824 13.0619798 80.1468367
second iteration: 1509075112825 13.07537311 80.19612851
second iteration: 1509073985114 13.0507832 80.25069245
second iteration: 1509075072824 12.91690859 80.06168244

Current output:

1509075042823 12.91877675 80.0466234--> first iteration
1509075042823 12.91877675 80.0466234--> Second iteration--->Array Size -->1
1509074972821 12.91738175 80.05294765--> first iteration
1509074972821 12.91738175 80.05294765--> Second iteration--->Array Size -->1
1509073795109 13.05561879 80.11920979--> first iteration
1509073795109 13.05561879 80.11920979--> Second iteration--->Array Size -->1
1509075132826 12.97988349 80.16310309--> first iteration
1509075132826 12.97988349 80.16310309--> Second iteration--->Array Size -->1
1509073885111 13.06640175 80.2457003--> first iteration
1509073885111 13.06640175 80.2457003--> Second iteration--->Array Size -->1

Thanks in advance!

njthzxwz  #1

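If you want to collect all of the values into one ArrayList, they must all arrive in a single reduce() call. The reducer is invoked once per distinct key, which is why your output shows "Array Size -->1" for every record: each key currently carries only one value.
To get a single call, your mapper needs to always emit the same rkey.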
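The effect of the key choice can be seen without a cluster. The following is only a minimal plain-Java sketch that imitates the shuffle's group-by-key step; it does not use the Hadoop API. The class name `GroupingSketch`, the helper `groupByKey`, and the constant key `"all"` are hypothetical, and it assumes the original mapper keyed each record by something unique such as its timestamp (the mapper is not shown in the question).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupingSketch {

    // Imitates the shuffle: every value is collected under the key the mapper emitted for it.
    static Map<String, List<String>> groupByKey(List<String> lines, boolean constantKey) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String line : lines) {
            // Assumption: either the record's timestamp is the key, or one fixed key is used.
            String key = constantKey ? "all" : line.split(" ")[0];
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(line);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> input = List.of(
                "1509075052824 13.0619798 80.1468367",
                "1509075112825 13.07537311 80.19612851",
                "1509073985114 13.0507832 80.25069245",
                "1509075072824 12.91690859 80.06168244");

        for (boolean constantKey : new boolean[] {false, true}) {
            Map<String, List<String>> groups = groupByKey(input, constantKey);
            System.out.println("constantKey=" + constantKey + ": "
                    + groups.size() + " reduce call(s)");
            // Each map entry stands in for one reduce() invocation.
            for (List<String> cached : groups.values()) {
                for (String v : cached) {
                    System.out.println("  first iteration: " + v);
                }
                for (String v : cached) {
                    System.out.println("  second iteration: " + v + " (size " + cached.size() + ")");
                }
            }
        }
    }
}
```

With per-record keys every group holds one value, which matches the "Array Size -->1" lines in the question. With a constant key, all four values land in one group and both passes see the full list. Note that a single key sends all data to one reducer and caches it in memory, so this approach is probably only reasonable when the data for that key fits comfortably on one node.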
