Java MapReduce: how to sum Writables and output the top 10 from the Reducer class

Posted by oknwwptz on 2021-06-02 in Hadoop

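I'm having trouble writing reducer code that outputs the top 10 (key, value) pairs.
My current output has the form ((year, market), total amount). What I want is the top 10 total amounts for each year. My current code outputs the total for every market in every year.
Any suggestions would be appreciated!
Mapper: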

public class FundingMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

private Text Year = new Text();
private Text Market = new Text();

public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {

    String line = value.toString();
    CSVReader reader = new CSVReader(new StringReader(line));

    String[] array = reader.readNext();
    reader.close();

    Year.set(array[14]);
    Market.set(array[3]);

    String amountString = array[15].replaceAll("[^0-9]","");
    int amount = 0;

    try {
        amount = Integer.parseInt(amountString);
    }

    catch(NumberFormatException nfe) {
        return;
    }

    IntWritable intW = new IntWritable(amount);

    String S = new StringBuilder().append(Year + " ").append(Market + " ").toString();

    context.write(new Text(S), intW);
}
}

Reducer:

public class FundingReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, 
        InterruptedException {

    int sum = 0;

    for(IntWritable value : values) {
        sum += value.get();
    }

    context.write(key, new IntWritable(sum));
}
}

Sample data:

/organization/contravir-pharmaceuticals ContraVir Pharmaceuticals   |Biotechnology| Biotechnology   USA NY  New York City   New York    /funding-round/9a7cc724deba554585e2b79c14605866 post_ipo_equity     8/22/14 2014-08      2014-Q3    2014    4,742,648

/organization/contravir-pharmaceuticals ContraVir Pharmaceuticals   |Biotechnology| Biotechnology   USA NY  New York City   New York    /funding-round/04a7ec54417a0f9a6c99cf8db2eac819 venture A   10/15/14    2014-10  2014-Q4    2014    9,000,000    

/organization/contravir-pharmaceuticals ContraVir Pharmaceuticals   |Biotechnology| Biotechnology   USA NY  New York City   New York    /funding-round/328384053df3a992ca6d5da55ca0420e venture     2/14/14 2014-02  2014-Q1    2014    3,225,000    

/organization/contrib-com   contrib.com |Entrepreneur|Technology|Domains|Education|Social Media|    Social Media    USA FL  Palm Beaches    Delray Beach    /funding-round/fea112ed22657c1456820aa26af3ab17 seed        6/17/14 2014-06  2014-Q2    2014    300,000

Sample output:

2014  Biotechnology  16967648
2014  Social Media  300000

Answer #1 (wooyq4lh)

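You need to emit the year alone as the key of the map output, with the amount and market carried in the value. That guarantees all values for a given year arrive together in a single reduce call, where you can total them per market and keep only 10 for your output. See below.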

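// Mapper: now declared as Mapper<LongWritable, Text, Text, Text>,
// with the Year and Market fields from the question.
public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {

    String line = value.toString();
    CSVReader reader = new CSVReader(new StringReader(line));

    String[] array = reader.readNext();
    reader.close();

    Year.set(array[14]);
    Market.set(array[3]);

    String amountString = array[15].replaceAll("[^0-9]", "");
    int amount = 0;

    try {
        amount = Integer.parseInt(amountString);
    }
    catch (NumberFormatException nfe) {
        return;
    }

    // Key on the year alone. The value carries the amount and the market,
    // tab-separated because market names such as "Social Media" contain spaces.
    context.write(Year, new Text(amount + "\t" + Market));
}

// Reducer: now declared as Reducer<Text, Text, Text, IntWritable>.
// Needs java.util.ArrayList, HashMap, List and Map.
public void reduce(Text key, Iterable<Text> values, Context context) throws IOException,
        InterruptedException {

    // Total the amounts per market for this year.
    Map<String, Integer> totals = new HashMap<>();
    for (Text value : values) {
        String[] parts = value.toString().split("\t", 2);
        totals.merge(parts[1], Integer.parseInt(parts[0]), Integer::sum);
    }

    // Sort the markets by total, largest first.
    List<Map.Entry<String, Integer>> sorted = new ArrayList<>(totals.entrySet());
    sorted.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));

    // Emit at most 10 (year market, total) pairs.
    int count = 0;
    for (Map.Entry<String, Integer> entry : sorted) {
        if (count++ == 10) {
            break;
        }
        context.write(new Text(key + " " + entry.getKey()), new IntWritable(entry.getValue()));
    }
}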
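
The top-10 step inside the reducer does not depend on Hadoop, so it can be tried out on its own. Below is a minimal sketch of one way to do it, assuming the per-market totals for a single year are already in a map. The class name `TopMarkets` and the method `topN` are made up for illustration and are not part of the question's code. It keeps a min-heap of at most n entries instead of sorting everything, and uses `long` totals on the assumption that a year's funding for one market could exceed the `int` range.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper, not part of the original job.
public class TopMarkets {

    // Returns up to n entries with the largest totals, largest first.
    public static List<Map.Entry<String, Long>> topN(Map<String, Long> totals, int n) {
        Comparator<Map.Entry<String, Long>> byTotal = Map.Entry.comparingByValue();

        // Min-heap: the smallest of the current candidates sits at the head.
        PriorityQueue<Map.Entry<String, Long>> heap = new PriorityQueue<>(byTotal);

        for (Map.Entry<String, Long> entry : totals.entrySet()) {
            heap.offer(entry);
            if (heap.size() > n) {
                heap.poll(); // evict the smallest so at most n candidates remain
            }
        }

        List<Map.Entry<String, Long>> result = new ArrayList<>(heap);
        result.sort(byTotal.reversed());
        return result;
    }

    public static void main(String[] args) {
        // Amounts taken from the sample data in the question (year 2014).
        Map<String, Long> totals = new HashMap<>();
        totals.merge("Biotechnology", 4742648L, Long::sum);
        totals.merge("Biotechnology", 9000000L, Long::sum);
        totals.merge("Biotechnology", 3225000L, Long::sum);
        totals.merge("Social Media", 300000L, Long::sum);

        for (Map.Entry<String, Long> e : topN(totals, 10)) {
            System.out.println("2014 " + e.getKey() + " " + e.getValue());
        }
    }
}
```

With the four sample rows this prints `2014 Biotechnology 16967648` followed by `2014 Social Media 300000`, the same totals as the sample output. For a handful of markets per year a plain sort is just as good; the heap only pays off when there are many distinct markets.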
