How to pass system properties to the map function in Hadoop

Posted by tct7dpnv on 2021-06-03 in Hadoop

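In the Hadoop MapReduce framework, is there a way to pass a system parameter (such as -Dmy_param=) to the map function? The job is submitted to the Hadoop cluster via .setJarByClass(). In the mapper I have to create a configuration, and I want it to be configurable, so I thought the standard approach of a properties file would work. I am just stuck on where to pass the parameter that sets the property. Another option would be to add the properties file to the submitted jar. Does anyone have experience solving this?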

iswrvxsc #1

If you are not already using them in your job, try GenericOptionsParser, Tool, and ToolRunner to run your Hadoop job.
Note: MyDriver extends Configured and implements Tool. To run your job, use:

hadoop jar somename.jar MyDriver -D your.property=value arg1 arg2
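To make the effect of that command line concrete, here is a minimal sketch of the idea behind it: `-D key=value` pairs are pulled out into configuration properties, and whatever is left over is what `run(String[] args)` receives. The class `DashDSketch` is a hypothetical stand-in written only for illustration. It is not Hadoop's GenericOptionsParser, and it handles only the spaced `-D key=value` form used above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; the real work is done by Hadoop's GenericOptionsParser.
public class DashDSketch {
    // Properties collected from "-D key=value" pairs, in command-line order.
    public final Map<String, String> properties = new LinkedHashMap<>();
    // Everything else, i.e. what the driver's run(String[] args) would see.
    public final List<String> remainingArgs = new ArrayList<>();

    public DashDSketch(String[] args) {
        for (int i = 0; i < args.length; i++) {
            boolean isPair = "-D".equals(args[i])
                    && i + 1 < args.length
                    && args[i + 1].indexOf('=') > 0;
            if (isPair) {
                String pair = args[++i];          // consume the key=value token
                int eq = pair.indexOf('=');
                properties.put(pair.substring(0, eq), pair.substring(eq + 1));
            } else {
                remainingArgs.add(args[i]);       // ordinary application argument
            }
        }
    }

    public static void main(String[] args) {
        DashDSketch parsed = new DashDSketch(
                new String[] {"-D", "your.property=value", "arg1", "arg2"});
        System.out.println(parsed.properties);
        System.out.println(parsed.remainingArgs);
    }
}
```

In the real job, ToolRunner performs this separation for you and places the properties in the Configuration returned by `getConf()`.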

For more details, see the linked reference.
Here is some sample code I put together for you:

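import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyDriver extends Configured implements Tool {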

  public static class MyDriverMapper extends Mapper<LongWritable, Text, LongWritable, NullWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
      // In the mapper you can retrieve any configuration you've set
      // while starting the job from the terminal as shown below

      Configuration conf = context.getConfiguration();
      String yourPropertyValue = conf.get("your.property");
    }
  }

  public static class MyDriverReducer extends Reducer<LongWritable, NullWritable, LongWritable, NullWritable> {

    @Override
    protected void reduce(LongWritable key, Iterable<NullWritable> values, Context context)
      throws IOException, InterruptedException {
      // --- some code ---
    }
  }

  public static void main(String[] args) throws Exception {
    int exitCode = ToolRunner.run(new MyDriver(), args);
    System.exit(exitCode);
  }

  @Override
  public int run(String[] args) throws Exception {
    Configuration conf = getConf();
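    // You can also get or set values on conf here, before the Job is created.
    // your.property could instead hold a file location: load that file
    // and copy its properties one by one into the conf object.

    // --- other code ---
    // Job.getInstance replaces the deprecated Job(Configuration, String) constructor.
    Job job = Job.getInstance(conf, "My Sample Job");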
    // --- other code ---//
    return (job.waitForCompletion(true) ? 0 : 1);
  }
}
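
The comment in run() about treating your.property as a file location can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the original answer: the class `PropertiesToConf` and its `copyInto` method are hypothetical names, and the target is expressed as a generic `BiConsumer<String, String>` setter so the example runs without Hadoop on the classpath.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;
import java.util.function.BiConsumer;

// Hypothetical helper: copies entries from a properties source into any key/value setter.
public class PropertiesToConf {

    // Loads java.util.Properties from the reader and passes each entry to the setter.
    // Returns the number of entries copied.
    public static int copyInto(Reader source, BiConsumer<String, String> setter)
            throws IOException {
        Properties props = new Properties();
        props.load(source);
        int copied = 0;
        for (String name : props.stringPropertyNames()) {
            setter.accept(name, props.getProperty(name));
            copied++;
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        // A TreeMap stands in for the Hadoop Configuration in this demo.
        Map<String, String> target = new TreeMap<>();
        copyInto(new StringReader("my.param=42\nother.param=abc\n"), target::put);
        System.out.println(target);
    }
}
```

Inside run(), the same helper could presumably be called as `PropertiesToConf.copyInto(new FileReader(conf.get("your.property")), conf::set)`, since Configuration has a `set(String, String)` method. Do this before creating the Job so the values are part of the configuration the mappers receive, and note that the file has to be readable on the machine that submits the job.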
