I am getting the following error:
java.lang.Exception: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received RegexMatcher.CustomKey
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received RegexMatcher.CustomKey
First, I define a custom data type named CustomKey for MapReduce:
public class CustomKey implements Writable {
public Text userId;
public Text friendId;
public CustomKey() {
this.userId = new Text();
this.friendId = new Text();
}
public CustomKey(String userId, String friendId) {
this.userId = new Text(userId);
this.friendId = new Text(friendId);
}
@Override
public void write(DataOutput out) throws IOException {
userId.write(out);
friendId.write(out);
}
@Override
public void readFields(DataInput in) throws IOException {
userId.readFields(in);
friendId.readFields(in);
}
}
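A note on the key class above: Hadoop sorts map output keys, so a type used as a map output key is normally expected to implement WritableComparable rather than only Writable, and write() has to emit exactly the fields that readFields() reads back, in the same order. The sketch below illustrates that contract using only java.io so that it runs without Hadoop on the classpath. PairKey, roundTrip and the use of writeUTF are assumptions made for illustration; they are not Hadoop's API and not the original CustomKey.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Objects;

// Hypothetical stand-in for CustomKey: mirrors the write/readFields/compareTo
// contract without depending on any Hadoop class.
class PairKey implements Comparable<PairKey> {
    String userId = "";
    String friendId = "";

    PairKey() {}

    PairKey(String userId, String friendId) {
        this.userId = userId;
        this.friendId = friendId;
    }

    // Serialize both fields, in the same order readFields() consumes them.
    void write(DataOutput out) throws IOException {
        out.writeUTF(userId);
        out.writeUTF(friendId);
    }

    // Restore the fields in the order write() emitted them.
    void readFields(DataInput in) throws IOException {
        userId = in.readUTF();
        friendId = in.readUTF();
    }

    // Order by userId first, then by friendId.
    @Override
    public int compareTo(PairKey other) {
        int byUser = userId.compareTo(other.userId);
        return byUser != 0 ? byUser : friendId.compareTo(other.friendId);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof PairKey)) return false;
        PairKey k = (PairKey) o;
        return userId.equals(k.userId) && friendId.equals(k.friendId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(userId, friendId);
    }

    // Serialize a key to bytes and read it back into a fresh instance.
    static PairKey roundTrip(PairKey key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            key.write(new DataOutputStream(bytes));
            PairKey copy = new PairKey();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A write() that emitted userId twice would hand the reader nothing for friendId, so the round trip would restore friendId as a copy of userId. A check like PairKey.roundTrip(key).equals(key) catches that kind of mistake before the job runs.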
Then I create a mapper, SingleClassv2LogMapper:
public static class SingleClassv2LogMapper extends Mapper<Object, Text, CustomKey, IntWritable> {
private final static IntWritable one = new IntWritable(1);
private Text word = new Text();
@Override
protected void map(Object key, Text value, Context context) throws IOException, InterruptedException {
Configuration conf = context.getConfiguration();
String regex = conf.get("regex");
String delimeter = conf.get("delimeter");
String currentLine = value.toString();
String tag = RingIdLogParser.parseHashTag(value.toString());
String body = RingIdLogParser.parseBody(value.toString());
if (tag != null) {
if (tag.equals(RegularExpressionBundle.updateMultipleMessageStatus)) {
CustomKey customKey = RingIdLogParser.parseUserFrinedInfo(body);
int messageNo = RingIdLogParser.getMessageCount(body);
context.write(customKey, new IntWritable(messageNo));
}
}
}
}
and the reducer:
public static class SingleClassv2LogReducer extends Reducer<CustomKey, IntWritable, Text, IntWritable> {
@Override
protected void reduce(CustomKey key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
int sum = 0;
for (IntWritable value : values) {
sum = sum + value.get();
}
String compactUser = key.userId.toString() +" "+ key.friendId.toString();
context.write(new Text(compactUser), new IntWritable(sum));
}
}
What should I do now? Can anyone help me?
The driver-related code is shown below:
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "Regex Matcher");
job.setJarByClass(SingleClassLogDriverv2.class);
job.setMapperClass(SingleClassv2LogMapper.class);
job.setCombinerClass(SingleClassv2LogCombiner.class);
job.setReducerClass(SingleClassv2LogReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
job.setMapOutputKeyClass(CustomKey.class);
job.setMapOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
System.exit(job.waitForCompletion(true) ? 0 : 1);
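For context on the error message: Hadoop compares the class of each key the mapper emits with the job's configured map output key class, and when no map output key class has been set it falls back to the job output key class (Text in this driver). The following sketch mimics that check in plain Java. MiniJobConfig and checkMapKey are hypothetical names for illustration, not part of Hadoop, and String and StringBuilder merely stand in for Text and CustomKey.

```java
// Hypothetical model of the map-output key type check; not Hadoop code.
class MiniJobConfig {
    private Class<?> outputKeyClass = Object.class;
    private Class<?> mapOutputKeyClass; // stays null until explicitly set

    void setOutputKeyClass(Class<?> c) {
        outputKeyClass = c;
    }

    void setMapOutputKeyClass(Class<?> c) {
        mapOutputKeyClass = c;
    }

    // Falls back to the job output key class when no map output key class is set.
    Class<?> getMapOutputKeyClass() {
        return mapOutputKeyClass != null ? mapOutputKeyClass : outputKeyClass;
    }

    // Returns null when the key type is accepted, otherwise the error text.
    String checkMapKey(Object key) {
        Class<?> expected = getMapOutputKeyClass();
        if (key.getClass() != expected) {
            return "Type mismatch in key from map: expected "
                    + expected.getName() + ", received "
                    + key.getClass().getName();
        }
        return null;
    }
}
```

In this model the mismatch disappears once setMapOutputKeyClass is called with the mapper's key type. If the job that produced the stack trace was built without the setMapOutputKeyClass(CustomKey.class) call (for example from an older JAR), this fallback would explain the message. That is only a guess, since the driver shown here does set it.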
1 Answer
hjzp0vay #1
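I ran into a similar problem when building a JAR in Eclipse for MapReduce. My task, unlike the traditional WordCount program, was to print the word count of 390k numbers, with each number written out in words next to it. The numbers were spread across 12 files, which also contained some duplicates.
I later corrected it to get the aggregated result I wanted, described below:
I built a Maven-based tool that prints numbers in words, and added that JAR explicitly to my project.
So here is my program, which is similar to the WordCount program but serves a different purpose:
I know this is a very late reply, but I hope it helps someone find a solution. Thanks.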