java: Using the output of a Hadoop job as the input of another job

Posted by dgtucam1 on 2021-06-03 in Hadoop


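I am having trouble using the output of one M/R job as the input of another M/R job. According to this article and many other online resources, one approach is to create job1 and then create job2. However, when I do that, I get the following error: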

Error: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.io.LongWritable

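The only place I use LongWritable is as the mapper's key. As far as I remember, it has to stay that way because it is the offset into the input file. When I change the signature to Text, like this: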

public class ErrorMapperCombiner extends Mapper<Text, Text, Text, IntWritable>

I get this error:

Error: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.Text

So how do I use the output of one M/R job as the input of another M/R job?
I also use this in my "runner" class to chain the jobs:

job1.setOutputFormatClass(SequenceFileOutputFormat.class);

job1.setOutputKeyClass(Text.class);
job1.setOutputValueClass(IntWritable.class);

.....

job2.setInputFormatClass(SequenceFileInputFormat.class);

Java hadoop hdfs mapreduce

Source: https://stackoverflow.com/questions/19994120/use-the-output-of-hadoop-job-as-input-of-another

1 answer


xvw2m8pv1#

I was setting the value twice, e.g.:

job1.setOutputFormatClass(TextOutputFormat.class);
....
job1.setOutputFormatClass(SequenceFileOutputFormat.class);

SequenceFileOutputFormat was the later of the two calls, though, so I assumed that value would be the one used. Either way, it works now.
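For anyone hitting the same ClassCastException, here is a minimal plain-Java sketch of what is probably going on. It deliberately does not use the Hadoop classes: the Mapper interface, both mapper classes, and the feed helper are hypothetical stand-ins, with Long/String/Integer playing the roles of LongWritable/Text/IntWritable. The point it illustrates is that with SequenceFileInputFormat the second job's mapper receives the key/value types the first job wrote (Text and IntWritable in the question), not a file offset and a line, so its declared input types would need to be Text and IntWritable.

```java
public class MapperKeyTypeDemo {
    // Hypothetical stand-in for Hadoop's Mapper; only the input types matter here.
    interface Mapper<K, V> {
        String map(K key, V value);
    }

    // Declared like a mapper reading plain text: key = byte offset, value = line.
    static class OffsetKeyMapper implements Mapper<Long, String> {
        @Override
        public String map(Long offset, String line) {
            return offset + ":" + line;
        }
    }

    // Declared to match what the first job wrote: key = word, value = count.
    static class WordCountMapper implements Mapper<String, Integer> {
        @Override
        public String map(String word, Integer count) {
            return word + "=" + count;
        }
    }

    // The caller only holds an untyped reference to the mapper, so a mismatch
    // between declared and actual types is only detected at run time.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String feed(Mapper mapper, Object key, Object value) {
        try {
            return mapper.map(key, value);
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        // Record shaped like job1's output (word, count) fed to each mapper.
        System.out.println(feed(new OffsetKeyMapper(), "error", 3));
        System.out.println(feed(new WordCountMapper(), "error", 3));
    }
}
```

The first call fails because the mapper declares an offset key but is handed a word; the second succeeds because its declared types match the record. The same reasoning would explain the second error in the question: changing only the key to Text still leaves the value declared as Text while the stored value is an IntWritable.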

Answered on 2021-06-04
