Java: in Hadoop, how to get the last element in values

Posted by puruo6ea on 2021-05-29 in Hadoop


Here is some sample input data from a .csv file:
url1 a
url2 b
url3 c
url4 d
url5 e
url1 k
url1 h
url2 x
url5 m
What I want is:
url1 h
url2 x
url3 c
url4 d
url5 m
But what I get is:
url1 a
url2 b
url3 c
url4 d
url5 e
I don't know what is wrong with my code. Here is part of my program.
The map function:

public class MergeUrlMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, Text> {
    public void map(LongWritable key, Text value, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
        String valueString = value.toString();
        String[] UrlHtmlData = valueString.split(",");
        output.collect(new Text(UrlHtmlData[0]), new Text(UrlHtmlData[1]));
    }
}

And the reduce function:

public class MergeUrlReducer extends MapReduceBase implements Reducer<Text, Text, Text, Text> {
    public void reduce(Text t_key, Iterator<Text> values, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
        Text key = t_key;
        // if values is empty, then output will be (t_key, t_key)
        Text latestHtml = t_key;
        while (values.hasNext()) {
            Text temp = values.next();
            latestHtml = temp;
        }
        output.collect(key, latestHtml);
    }
}

What is wrong with my code? The output should be the last value for each key, but it is actually the first value. Thanks in advance!

Java hadoop mapreduce

Source: https://stackoverflow.com/questions/50550432/in-hadoop-how-to-get-last-element-in-values

1 answer


xqkwcwgp #1

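The order of the values passed to the reducer is not guaranteed.
If you want them in a particular order, you need to add all of the iterator's values to an ArrayList and then call Collections.sort, with a custom comparator if you need one.
Then take the element at index list.size() - 1. Also, judging by your question, your input does not contain commas, so make sure you are splitting on the correct character.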
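To make that concrete, here is a minimal sketch in plain Java, without any Hadoop classes, of what "collect, sort, take list.size() - 1" could look like. It rests on several assumptions that are not in the question: each value carries a position that says where it appeared in the input, the lines are separated by whitespace as in the sample data, and all the names (LastValueSketch, Positioned, mapAndGroup, lastValue, latestPerKey) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of the advice above; it does not use any Hadoop classes.
// All names here (LastValueSketch, Positioned, ...) are hypothetical.
class LastValueSketch {

    // A value together with its position in the input (assumed to be available).
    static final class Positioned {
        final long position;
        final String text;

        Positioned(long position, String text) {
            this.position = position;
            this.text = text;
        }
    }

    // "Map" step: split each line on whitespace (the sample input has no commas)
    // and group the values by URL, remembering the line number of each value.
    static Map<String, List<Positioned>> mapAndGroup(List<String> lines) {
        Map<String, List<Positioned>> grouped = new TreeMap<>();
        long position = 0;
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+", 2);
            if (parts.length == 2) { // skip lines that lack a key or a value
                grouped.computeIfAbsent(parts[0], k -> new ArrayList<>())
                        .add(new Positioned(position, parts[1]));
            }
            position++;
        }
        return grouped;
    }

    // "Reduce" step: copy the values, sort them by position with a custom
    // comparator, and take the element at index size() - 1.
    // Assumes at least one value, as a reducer only runs for keys that have values.
    static String lastValue(List<Positioned> values) {
        List<Positioned> sorted = new ArrayList<>(values);
        Collections.sort(sorted, Comparator.comparingLong((Positioned p) -> p.position));
        return sorted.get(sorted.size() - 1).text;
    }

    // Runs both steps and returns the latest value for every key.
    static Map<String, String> latestPerKey(List<String> lines) {
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, List<Positioned>> e : mapAndGroup(lines).entrySet()) {
            result.put(e.getKey(), lastValue(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "url1 a", "url2 b", "url3 c", "url4 d", "url5 e",
                "url1 k", "url1 h", "url2 x", "url5 m");
        // prints {url1=h, url2=x, url3=c, url4=d, url5=m}
        System.out.println(latestPerKey(lines));
    }
}
```

Because lastValue sorts by position, it returns the same result no matter what order the values arrive in. In a real job, one possible source for the position is the mapper's LongWritable key, which is the byte offset of the line when TextInputFormat is used; the mapper would then have to emit it as part of the value. Note that such an offset only orders lines within a single file.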

Answered on 2021-05-29
