Error while trying to begin a job in Hadoop

Posted by 4si2a6ki on 2021-06-04 in Hadoop


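I have been trying to run the PageRank algorithm in Hadoop, but I am having trouble initializing the job.
When I try to initialize it using the Job class, I get the following error when compiling:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory
    at org.apache.hadoop.mapreduce.Job.<clinit>(Job.java:89)
    at Pagerank.main(Pagerank.java:244)
Here is the code: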

Job job;
job = new Job();
job.setJarByClass(Pagerank.class);      // In what class are our map/reduce functions for this job found?
job.setMapperClass(PRMap.class);        // What is our map function for this job?
job.setReducerClass(PRReduce.class);    // What is our reduce function for this job?

job.setOutputKeyClass(Text.class);              // What are the (hadoop.io compliant) datatype for our
job.setOutputValueClass(Text.class);            // reducer output's key-value pairs?
job.setInputFormatClass(TextInputFormat.class);     // How will the mapper distinguish (key value) record inputs?
FileInputFormat.addInputPath(job, new Path(args[0])); // First command line argument
FileOutputFormat.setOutputPath(job, new Path("temp0"));
job.waitForCompletion(true);

When I try to initialize it using the JobConf class instead, I get an error about the arguments of some of the methods I call.
Here is the code:

JobConf conf = new JobConf(Pagerank.class);
conf.setJobName("pagerank");

conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(Text.class);

conf.setMapperClass(PRMap.class);
conf.setReducerClass(PRReduce.class);

conf.setInputFormat(TextInputFormat.class);
conf.setOutputFormat(TextOutputFormat.class);

FileInputFormat.setInputPaths(conf, new Path(args[0]));
FileOutputFormat.setOutputPath(conf, new Path(args[1]));

JobClient.runJob(conf);

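According to the error:
method setMapperClass in class JobConf cannot be applied to given types;
required: Class<? extends Mapper>
found: Class<PRMap>
reason: actual argument Class<PRMap> cannot be converted to Class<? extends Mapper> by method invocation conversion
It seems that I cannot pass PRMap.class as an argument to setMapperClass, even though the PRMap class I wrote follows Hadoop's conventions for a map function: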

public static class PRMap extends Mapper<LongWritable, Text, Text, Text>
{ ... }

Any advice on either of these two approaches?

Java hadoop Configuration

Source: https://stackoverflow.com/questions/14763655/error-while-trying-to-begin-a-job-in-hadoop

4 Answers


0ve6wy6x1#

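It looks like your PRMap class extends org.apache.hadoop.mapreduce.Mapper (http://hadoop.apache.org/docs/mapreduce/current/api/org/apache/hadoop/mapreduce/mapper.html). The class you pass to JobConf has to implement org.apache.hadoop.mapred.Mapper instead. The two types share the simple name Mapper but belong to different APIs and are unrelated, which is why the compiler rejects PRMap.class.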
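To see why the compiler complains, here is a minimal sketch that needs no Hadoop on the classpath. The nested types OldApi.Mapper and NewApi.Mapper are hypothetical stand-ins for org.apache.hadoop.mapred.Mapper and org.apache.hadoop.mapreduce.Mapper, and setMapperClass only mimics the shape of the JobConf method; none of this is Hadoop's actual code.

```java
// Stand-in types only: these are NOT the Hadoop classes, just two unrelated
// types that share the simple name "Mapper", as the two Hadoop APIs do.
final class MapperMismatchDemo {
    // Plays the role of the old-API type (an interface in org.apache.hadoop.mapred).
    static final class OldApi {
        interface Mapper {}
    }

    // Plays the role of the new-API type (a class in org.apache.hadoop.mapreduce).
    static final class NewApi {
        static class Mapper {}
    }

    // Like the PRMap in the question: it extends the new-API type.
    static class PRMap extends NewApi.Mapper {}

    // A hypothetical mapper written against the old-API type.
    static class OldStylePRMap implements OldApi.Mapper {}

    // Mimics the shape of JobConf.setMapperClass: only old-API mappers are accepted.
    static String setMapperClass(Class<? extends OldApi.Mapper> cls) {
        return cls.getSimpleName();
    }

    // Runtime equivalents of the compile-time check.
    static boolean fitsOldApi(Class<?> c) {
        return OldApi.Mapper.class.isAssignableFrom(c);
    }

    static boolean fitsNewApi(Class<?> c) {
        return NewApi.Mapper.class.isAssignableFrom(c);
    }

    public static void main(String[] args) {
        // setMapperClass(PRMap.class); // does not compile: PRMap is not an OldApi.Mapper
        System.out.println("accepted: " + setMapperClass(OldStylePRMap.class));
        System.out.println("PRMap fits old API: " + fitsOldApi(PRMap.class));
        System.out.println("PRMap fits new API: " + fitsNewApi(PRMap.class));
    }
}
```

In the real job this means picking one API and staying with it: either keep PRMap as it is and configure it through Job, or rewrite it against the org.apache.hadoop.mapred types and configure it through JobConf.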
To fix the java.lang.NoClassDefFoundError, add commons-logging-x.x.x.jar to the classpath.
Run hadoop classpath to confirm that the jar shows up.
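If you would rather check from inside the JVM, a small sketch like the one below reports whether a class can be loaded with the classpath the program was started with. The names ClasspathCheck and isLoadable are made up for this example; only Class.forName and the java.class.path system property come from the JDK.

```java
// Hypothetical helper: checks whether a class is visible on the current classpath.
final class ClasspathCheck {
    // Returns true if the named class can be located, without running its static initializers.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the stack trace from the question.
        String name = args.length > 0 ? args[0] : "org.apache.commons.logging.LogFactory";
        String verdict = isLoadable(name) ? "found on the classpath" : "MISSING from the classpath";
        System.out.println(name + ": " + verdict);
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

Run it with the same classpath you use to launch the job. If LogFactory is reported as missing, the commons-logging jar is not reaching the client JVM.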


qgelzfjb2#

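This happens because the mapper cannot find LogFactory, which is part of the commons-logging jar. To fix it, you have to make the jar accessible to every client mapper, either by copying it to all the machines or, more efficiently, by adding it to the distributed cache: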

$ bin/hadoop fs -copyFromLocal mylib.jar /myapp/mylib.jar
Then access it from your code:
DistributedCache.addFileToClassPath(new Path("/myapp/mylib.jar"), job);



0aydgbwb3#

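Try putting the jar that contains org.apache.commons.logging.LogFactory into the lib directory of HADOOP_HOME on every machine, then restart the cluster.
Alternatively, you can add the jar from the command line with the -libjars option, like this:
hadoop jar myjar.jar package.classname -libjars mypath/commons-logging.jar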


kpbwa7wx4#

Add this line to your main method:

DistributedCache.addFileToClassPath(new Path("<Absolute Path>/commons-logging.jar"), conf);

