map-reduce作业生成空输出文件

dzhpxtsq  于 2021-06-02  发布在  Hadoop
关注(0)|答案(2)|浏览(474)

程序正在生成空输出文件。谁能告诉我哪里出了问题吗。任何帮助都将不胜感激。我尝试放置job.setnumreducetask(0),因为我没有使用reducer,但输出文件仍然为空。

public static class PrizeDisMapper extends Mapper<LongWritable, Text, Text, Pair>{
int rating = 0;
Text CustID;
IntWritable r;
Text MovieID;
public void map(LongWritable key, Text line, Context context
                ) throws IOException, InterruptedException {
        String line1 = line.toString();
        String [] fields = line1.split(":");
        if(fields.length > 1)
             {
             String Movieid = fields[0];
             String line2 = fields[1];
             String [] splitline = line2.split(",");
             String Custid = splitline[0];
             int rate = Integer.parseInt(splitline[1]);
             r = new IntWritable(rate);
             CustID = new Text(Custid);
             MovieID = new Text(Movieid);
             Pair P = new Pair();
             context.write(MovieID,P);
             }
             else
             {
             return;
             }
  }
}

 public static class IntSumReducer extends Reducer<Text,Pair,Text,Pair> {
 private IntWritable result = new IntWritable();
 public void reduce(Text key, Iterable<Pair> values,
                   Context context
                   ) throws IOException, InterruptedException {
  for (Pair val : values) {
    context.write(key, val);
  }
  }

  public class Pair implements Writable
  {
  String key;
  int value;
    public void write(DataOutput out) throws IOException {
     out.writeInt(value);
     out.writeChars(key);
  }
  public void readFields(DataInput in) throws IOException {
     key = in.readUTF();
     value = in.readInt();
  }
  public void setVal(String aKey, int aValue)
  {
     key   = aKey;
      value = aValue;
  }

主要类别:

public static void main(String[] args) throws Exception {
Configuration conf = new Configuration();
String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
if (otherArgs.length != 2) {
  System.err.println("Usage: wordcount <in> <out>");
  System.exit(2);
}
Job job = new Job(conf, "word count");
job.setJarByClass(WordCount.class);
job.setMapperClass(TokenizerMapper.class);
job.setCombinerClass(IntSumReducer.class);
job.setReducerClass(IntSumReducer.class);
job.setInputFormatClass (TextInputFormat.class);
FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Pair.class);
System.exit(job.waitForCompletion(true) ? 0 : 1);

感谢@pathmanaban palsamy和@chris gerken的建议。我已经按照你的建议修改了代码,但是仍然得到空的输出文件。有谁能给我推荐一下我的主类中输入和输出的配置吗。我需要在mapper的输入中指定pair类吗&how?

prdp8dxp

prdp8dxp1#

我猜reduce方法应该声明为

public void reduce(Text key, Iterable<Pair> values,
               Context context
               ) throws IOException, InterruptedException

您将获得一个iterable(一个可以从中获得迭代器的对象),用于迭代Map到给定键的所有值。

monwx1rj

monwx1rj2#

因为不需要减速器,我怀疑是线以下 Pair P = new Pair(); context.write(MovieID,P); 空对就是问题所在。另外,请检查您的驱动程序类您已经给出了正确的keyclass和valueclass如 job.setMapOutputKeyClass(Text.class); job.setMapOutputValueClass(Pair.class);

相关问题