Running wordcount in Hadoop with a jar that has no reducer

svdrlsy4 · posted 2021-05-30 in Hadoop

I have set up a Hadoop cluster (one master and one slave) and it is working fine.
I have tried wordcount and grep with the hadoop.example.jar file and they also work well.
Now I would like to modify hadoop.example.jar so that it runs only the mapper, without the reducer. Is there a way to do that?
I have read some articles saying that I have to call setNumReduceTasks(0), but I don't know how to do that when using the hadoop.example.jar file.


5lhxktic1#

You cannot change the hadoop.example.jar file.
You need to write your own code and export it as a jar file.
The modified wordcount code should be:

package org.myorg;

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {

    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Split each input line into tokens and emit (word, 1) for every token.
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Sum the counts per word (never executed when the job has zero reduce tasks).
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "wordcount");
        job.setJarByClass(WordCount.class); // so the cluster can locate the classes in your jar

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        // Zero reduce tasks: the reducer is skipped and the map output is
        // written directly to the output directory as part-m-* files.
        job.setNumReduceTasks(0);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.waitForCompletion(true);
    }
}
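
If you only want the map output, you do not strictly need the Reduce class at all. Below is a minimal map-only driver sketch (the class name MapOnlyWordCount and the reuse of the Map class above are illustrative, not part of the original answer): with zero reduce tasks there is no shuffle or sort phase, and each mapper writes its output directly to HDFS.

package org.myorg;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Illustrative map-only driver; reuses the Map class from the WordCount code above.
public class MapOnlyWordCount {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "map-only wordcount");
        job.setJarByClass(MapOnlyWordCount.class);

        job.setMapperClass(WordCount.Map.class); // mapper defined in the answer above
        job.setNumReduceTasks(0);                // no reducer: map output goes straight to HDFS

        // With zero reduce tasks, these are the types of the final (map) output.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory (must not exist yet)

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

After packaging the classes into a jar (for example wordcount.jar, name chosen here for illustration), you would run it with something like hadoop jar wordcount.jar org.myorg.MapOnlyWordCount <input> <output>, and the results appear as part-m-00000, part-m-00001, ... files in the output directory.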
