How do I get each reduce task's execution time programmatically in Hadoop?

Posted by von4xj4u on 2021-05-30 in Hadoop


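I am running a simple MapReduce job in Hadoop. In plain Java I can call System.currentTimeMillis() to record a start and an end time. How do I do the same in MapReduce, to get (end time - start time) for each map task and each reduce task? I tried the code below, and I set job.setNumReduceTasks(4). Edited: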

public void reduce(Text _key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
    // Timestamp at the start of this reduce() call (one call per key)
    long start = System.currentTimeMillis();
    int sum = 0;
    // Sum all values for the current key
    for (IntWritable val : values) {
        sum += val.get();
    }
    result.set(sum);
    context.write(_key, result);
    long end = System.currentTimeMillis();
    // Elapsed milliseconds for this key only, not for the whole task
    System.out.println(" time Taken " + (end - start));
}

But the output is:

time Taken 1
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 ----------
 ----------

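But I set the number of reduce tasks to 4. What this prints is the time taken to process each key/value group, not the time per task.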
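
A note on the zeros: each reduce() call handles a single key and usually finishes in well under a millisecond, so System.currentTimeMillis() rounds the difference down to 0. One way to get a usable number is to accumulate System.nanoTime() differences across calls and report once at the end. The sketch below is plain Java and does not use the Hadoop API; the class AccumulatingTimer and the method reduceOnce are hypothetical stand-ins for a reducer and its reduce() method.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java stand-in for a reducer; it does not use the Hadoop API.
class AccumulatingTimer {
    private long totalNanos = 0; // time spent inside reduceOnce, summed over all calls
    private int calls = 0;       // number of keys processed so far

    // Hypothetical equivalent of reduce(): sums the values of one key and times the work.
    int reduceOnce(List<Integer> values) {
        long start = System.nanoTime();
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        totalNanos += System.nanoTime() - start;
        calls++;
        return sum;
    }

    long getTotalNanos() { return totalNanos; }

    int getCalls() { return calls; }

    public static void main(String[] args) {
        AccumulatingTimer timer = new AccumulatingTimer();
        timer.reduceOnce(Arrays.asList(1, 2, 3));
        timer.reduceOnce(Arrays.asList(4, 5));
        // Report once, after all keys, instead of once per key.
        System.out.println("keys=" + timer.getCalls()
                + " totalNanos=" + timer.getTotalNanos());
    }
}
```

In a real reducer the same idea would presumably mean keeping the running total in a field and printing it from cleanup(), which runs once per task.
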
After adding the setup() and cleanup() calls by overriding run():

@Override
public void run(Context context) throws IOException, InterruptedException {
    // run() is called once per reduce task, so this times the whole task
    long start = System.currentTimeMillis();
    setup(context);
    try {
        while (context.nextKey()) {
            reduce(context.getCurrentKey(), context.getValues(), context);
        }
    } finally {
        cleanup(context);
        long end = System.currentTimeMillis();
        System.out.println(" End- Start : " + (end - start));
    }
}

public void reduce(Text _key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable val : values) {
        sum += val.get();
    }
    result.set(sum);
    context.write(_key, result);
}

I have set the number of reducers to 4 with job.setNumReduceTasks(4), but only one timing line is printed. Am I doing something wrong?
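
One thing worth checking: run() executes once per reduce task attempt, and on a cluster each task writes System.out to its own stdout task log, so four reducers should give four lines spread over four logs and not four lines in one console. The sketch below simulates that in plain Java, without any Hadoop classes. PerTaskTiming, partitionFor and runTasks are hypothetical names, and the tasks run sequentially here, whereas Hadoop runs them as separate task attempts. The partition formula matches Hadoop's default HashPartitioner.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java simulation of per-task timing; no Hadoop classes are used.
class PerTaskTiming {
    // Same formula as Hadoop's default HashPartitioner.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Runs one simulated reduce task per partition and returns
    // the elapsed nanoseconds of each task, indexed by partition.
    static long[] runTasks(List<String> keys, int numReduceTasks) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            partitions.add(new ArrayList<>());
        }
        for (String key : keys) {
            partitions.get(partitionFor(key, numReduceTasks)).add(key);
        }
        long[] elapsed = new long[numReduceTasks];
        for (int task = 0; task < numReduceTasks; task++) {
            long start = System.nanoTime();            // start of this task's run()
            int processed = 0;
            for (String key : partitions.get(task)) {
                processed += key.length();             // placeholder for real reduce() work
            }
            elapsed[task] = System.nanoTime() - start; // end of this task's run()
        }
        return elapsed;
    }

    public static void main(String[] args) {
        long[] elapsed = runTasks(Arrays.asList("apple", "banana", "cherry", "date"), 4);
        // One line per task, even for a task whose partition received no keys.
        for (int task = 0; task < elapsed.length; task++) {
            System.out.println("task " + task + " End-Start (ns): " + elapsed[task]);
        }
    }
}
```

If the job really produces a single line, it may be worth looking at the stdout log of each reduce task attempt individually before concluding that only one reducer ran.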

Java hadoop

Source: https://stackoverflow.com/questions/28919295/how-do-i-get-each-reduce-tasks-execution-time-programmatically-in-hadoop