How do I pass custom parameters to a Hadoop MapReduce job?
For example, if my driver contains:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    public static void main(String[] args) throws Exception {
        try {
            // The custom parameters are expected as the first two arguments.
            String one = args[0];
            String two = args[1];
            System.out.println(two);
            System.out.println(one);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("ArrayIndexOutOfBoundsException caught");
        }

        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // The input and output paths are expected as the third and fourth arguments.
        FileInputFormat.addInputPath(job, new Path(args[2]));
        FileOutputFormat.setOutputPath(job, new Path(args[3]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
After building the jar file, when I run the command:
hadoop jar str1 str2 /home/bli1/wordcount/wc.jar /user/bli1/wordcount/input /user/bli1/wordcount/testout
I get:
Not a valid JAR: /nfsdata/DSCluster/home/bli1/wordcount/str1
1 Answer
The arguments need to come after the jar file reference, for example:
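A corrected invocation, reusing the paths from the question, would look something like this (this assumes the jar's manifest already names the driver class, since the original command did not pass one; if it does not, the driver class name goes immediately after the jar path and every positional index shifts by one):

hadoop jar /home/bli1/wordcount/wc.jar str1 str2 /user/bli1/wordcount/input /user/bli1/wordcount/testout

With that ordering, args[0] and args[1] in the driver receive str1 and str2, and args[2] and args[3] receive the input and output paths, matching the driver code above.

If the goal is to make str1 and str2 visible inside the map or reduce tasks rather than only in the driver, one common approach (not part of the original answer) is to set them on the job Configuration before submission and read them back in the task's setup() method. A minimal sketch, using the hypothetical property names my.param.one and my.param.two:

// In the driver, before building the Job:
Configuration conf = new Configuration();
conf.set("my.param.one", args[0]);   // hypothetical property names
conf.set("my.param.two", args[1]);
Job job = Job.getInstance(conf, "word count");

// In the mapper (a reducer works the same way), read the values back
// from the task-side copy of the Configuration:
public class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private String one;
    private String two;

    @Override
    protected void setup(Context context) {
        Configuration conf = context.getConfiguration();
        one = conf.get("my.param.one");
        two = conf.get("my.param.two");
    }

    // ... map() can now use `one` and `two` ...
}

Since the Configuration is serialized and shipped to every task, values set in the driver are available to all mappers and reducers without any extra plumbing.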