Problem when running a job on Spring Data Hadoop

hlswsv35 · posted 2021-06-03 in Hadoop

I have created the following mapper and reducer using Mahout:

  package mypackage.ItemSimilarity;

  import java.io.IOException;
  import java.util.regex.Matcher;
  import java.util.regex.Pattern;

  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Mapper;
  import org.apache.mahout.math.VarLongWritable;

  public class ItemPrefMapper extends
          Mapper<LongWritable, Text, VarLongWritable, VarLongWritable> {

      private static final Pattern NUMBERS = Pattern.compile("(\\d+)");

      @Override
      public void map(LongWritable key, Text value, Context context)
              throws IOException, InterruptedException {
          String line = value.toString();
          Matcher m = NUMBERS.matcher(line);
          m.find(); // the first number on the line is the user ID
          VarLongWritable userID = new VarLongWritable(Long.parseLong(m.group()));
          VarLongWritable itemID = new VarLongWritable();
          while (m.find()) { // every remaining number is an item ID
              itemID.set(Long.parseLong(m.group()));
              context.write(userID, itemID);
          }
      }
  }
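For illustration, the number-parsing logic inside map() can be exercised on its own with plain JDK classes (a minimal sketch; the class name and the sample input line are made up, not part of the original job):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ItemPrefParseDemo {

    private static final Pattern NUMBERS = Pattern.compile("(\\d+)");

    // Mirrors the mapper: the first number is the user ID, the rest are item IDs.
    static long[] parse(String line) {
        Matcher m = NUMBERS.matcher(line);
        m.find();
        long userID = Long.parseLong(m.group());
        List<Long> items = new ArrayList<>();
        while (m.find()) {
            items.add(Long.parseLong(m.group()));
        }
        long[] out = new long[items.size() + 1];
        out[0] = userID;
        for (int i = 0; i < items.size(); i++) {
            out[i + 1] = items.get(i);
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical input line: user 42 rated items 101, 102 and 103.
        long[] parsed = parse("42,101,102,103");
        System.out.println(Arrays.toString(parsed)); // [42, 101, 102, 103]
    }
}
```

Note that `m.find()` on a line containing no digits would make `m.group()` throw IllegalStateException, so the mapper implicitly assumes every input line starts with a user ID.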

The reducer:

  package mypackage.ItemSimilarity;

  import java.io.IOException;

  import org.apache.hadoop.mapreduce.Reducer;
  import org.apache.mahout.math.RandomAccessSparseVector;
  import org.apache.mahout.math.VarLongWritable;
  import org.apache.mahout.math.Vector;
  import org.apache.mahout.math.VectorWritable;

  public class UserVectorReducer extends
          Reducer<VarLongWritable, VarLongWritable, VarLongWritable, VectorWritable> {

      @Override
      public void reduce(VarLongWritable userID,
              Iterable<VarLongWritable> itemPrefs, Context context)
              throws IOException, InterruptedException {
          Vector userVector = new RandomAccessSparseVector(Integer.MAX_VALUE, 100);
          for (VarLongWritable itemPref : itemPrefs) {
              userVector.set((int) itemPref.get(), 1.0f); // mark each item the user touched
          }
          context.write(userID, new VectorWritable(userVector));
      }
  }
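The reducer simply collapses one user's item IDs into a single sparse preference vector. A Mahout-free sketch of that collapse, using a HashMap as a stand-in for RandomAccessSparseVector (the class name and item IDs here are made up for illustration):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UserVectorDemo {

    // Stand-in for RandomAccessSparseVector: index -> value; absent keys are implicitly 0.
    static Map<Integer, Double> toUserVector(Iterable<Long> itemPrefs) {
        Map<Integer, Double> vector = new HashMap<>();
        for (long itemID : itemPrefs) {
            vector.put((int) itemID, 1.0); // same cast and constant as the reducer
        }
        return vector;
    }

    public static void main(String[] args) {
        // Duplicate item IDs collapse onto the same index, just as repeated
        // userVector.set(...) calls would in the real reducer.
        Map<Integer, Double> v = toUserVector(List.of(101L, 102L, 101L));
        System.out.println(v);
    }
}
```

One caveat carried over from the original code: casting `itemPref.get()` from long to int silently wraps for item IDs above Integer.MAX_VALUE.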

The Spring configuration that runs the job:

  <job id="mahoutJob" input-path="/home/ubuntu/input/data.txt" output-path="/home/ubuntu/output"
      mapper="mypackage.ItemSimilarity.ItemPrefMapper"
      reducer="mypackage.ItemSimilarity.UserVectorReducer"
      jar-by-class="mypackage.ItemSimilarity.ItemPrefMapper"/>

  <job-runner id="myjob-runner" pre-action="setupScript" job-ref="mahoutJob"
      run-at-startup="true"/>

When I run this, I get the error below. My class extends Hadoop's Mapper class, yet Spring reports that it is not a Mapper:

  java.lang.RuntimeException: class mypackage.ItemSimilarity.ItemPrefMapper not org.apache.hadoop.mapreduce.Mapper
      at org.apache.hadoop.conf.Configuration.setClass(Configuration.java:931)
      at org.apache.hadoop.mapreduce.Job.setMapperClass(Job.java:175)
      at org.springframework.data.hadoop.mapreduce.JobFactoryBean.afterPropertiesSet(JobFactoryBean.java:153)
      at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1571)
      at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1509)
      at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:521)
      at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:458)
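The check that throws this RuntimeException is Hadoop's Configuration.setClass, which verifies that the configured class is assignable to org.apache.hadoop.mapreduce.Mapper. A stripped-down sketch of that subtype check using the reflection API (the classes below are hypothetical stand-ins, not Hadoop's):

```java
public class AssignableCheckDemo {

    interface Mapper {}                        // stands in for org.apache.hadoop.mapreduce.Mapper
    static class MyMapper implements Mapper {} // a class that really is a Mapper
    static class NotAMapper {}                 // an unrelated class

    // Same shape as the guard inside Hadoop's Configuration.setClass.
    static void setClass(Class<?> theClass, Class<?> xface) {
        if (!xface.isAssignableFrom(theClass)) {
            throw new RuntimeException(theClass + " not " + xface.getName());
        }
    }

    public static void main(String[] args) {
        setClass(MyMapper.class, Mapper.class);       // passes silently
        try {
            setClass(NotAMapper.class, Mapper.class); // throws, like the error above
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because `isAssignableFrom` compares Class objects, the check can also fail when a class that *does* extend Mapper was compiled against one copy of the Hadoop classes but checked against another (for example, a stale jar on the classpath loading Mapper through a different classloader).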


von4xj4u · answer #1

Are you sure about your jar-by-class element? It should point to something like the class with the main method, where you instantiate the ApplicationContext.
Also, are you sure about your package name?

com.threepillar.labs.ItemSimilarity.ItemPrefMapper

versus

mypackage.ItemSimilarity.ItemPrefMapper
