hadoop jar error when copying data from MongoDB to HDFS

4dbbbstv  posted 2021-06-01  in  Hadoop

I am trying to copy a collection from MongoDB to Hadoop using the mongo-hadoop connector, with the following code:

    package hdfs;

    import java.io.*;
    import org.apache.commons.logging.*;
    import org.apache.hadoop.conf.*;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.*;
    import org.apache.hadoop.mapreduce.lib.output.*;
    import org.apache.hadoop.mapreduce.*;
    import org.bson.*;
    import com.mongodb.hadoop.*;
    import com.mongodb.hadoop.util.*;

    public class ImportWeblogsFromMongo {

        private static final Log log = LogFactory.getLog(ImportWeblogsFromMongo.class);

        public static class ReadWeblogsFromMongo extends Mapper<Object, BSONObject, Text, Text> {

            public void map(Object key, BSONObject value, Context context)
                    throws IOException, InterruptedException {
                System.out.println("Key: " + key);
                System.out.println("Value: " + value);
                String md5 = value.get("md5").toString();
                String url = value.get("url").toString();
                String date = value.get("date").toString();
                String time = value.get("time").toString();
                String ip = value.get("ip").toString();
                String output = "\t" + url + "\t" + date + "\t" + time + "\t" + ip;
                context.write(new Text(md5), new Text(output));
            }
        }

        public static void main(String[] args) throws Exception {
            final Configuration conf = new Configuration();
            MongoConfigUtil.setInputURI(conf, "mongodb://localhost:27017/clusterdb.fish");
            MongoConfigUtil.setCreateInputSplits(conf, false);
            System.out.println("Configuration: " + conf);
            @SuppressWarnings("deprecation")
            final Job job = new Job(conf, "Mongo Import");
            Path out = new Path("/home/mongo_import");
            FileOutputFormat.setOutputPath(job, out);
            job.setJarByClass(ImportWeblogsFromMongo.class);
            job.setMapperClass(ReadWeblogsFromMongo.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);
            job.setInputFormatClass(MongoInputFormat.class);
            job.setOutputFormatClass(TextOutputFormat.class);
            job.setNumReduceTasks(0);
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

I exported it as a jar named importmongo.jar and tried to run it with the command hadoop jar /home/yass/importmongo.jar hdfs.ImportWeblogsFromMongo, but I get the following error:

    Exception in thread "main" java.lang.NoClassDefFoundError: com/mongodb/hadoop/util/MongoConfigUtil
        at hdfs.ImportWeblogsFromMongo.main(ImportWeblogsFromMongo.java:33)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
    Caused by: java.lang.ClassNotFoundException: com.mongodb.hadoop.util.MongoConfigUtil
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 7 more

Note: clusterdb is the database name, fish is its collection, and hdfs.ImportWeblogsFromMongo is the package.class.
Any suggestions?
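For context: a NoClassDefFoundError for com.mongodb.hadoop.util.MongoConfigUtil at launch means the connector jar was on the compile-time classpath but is missing at runtime; the JVM fails on the first reference to that class (line 33, the MongoConfigUtil.setInputURI call). A common remedy is to put the mongo-hadoop connector and the MongoDB Java driver on both the client and job classpaths. A minimal sketch, assuming jar names and paths that will differ per installation:

```shell
# Assumed locations — substitute wherever mongo-hadoop-core and the
# MongoDB Java driver actually live on your machine.
LIBJARS=/opt/libs/mongo-hadoop-core.jar,/opt/libs/mongodb-driver.jar

# Make the classes visible to the client JVM that runs main():
export HADOOP_CLASSPATH=$(echo "$LIBJARS" | tr ',' ':')

# -libjars ships the jars to the cluster; note it is only honored when
# the job parses its arguments via GenericOptionsParser/ToolRunner.
hadoop jar /home/yass/importmongo.jar hdfs.ImportWeblogsFromMongo -libjars "$LIBJARS"
```

Alternatively, building a single "fat" jar that bundles the connector (e.g. with the Maven Shade plugin) avoids the classpath juggling entirely.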


t2a7ltrp1#

I didn't solve the problem this way; instead I used mongodump and copied the resulting files to HDFS. The lines below may help someone get the job done:

    mongodump --db clusterdb --collection CollectionName
    bsondump file.bson > file.json
    hadoop dfs -copyFromLocal /path/to/file/fish.json mongo
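One note on the last step: the hadoop dfs form has been deprecated for HDFS operations in current Hadoop releases; hdfs dfs is the equivalent command. A sketch of the same copy with an explicit target directory (the local path and the "mongo" directory are assumptions for illustration — use the actual output of your mongodump/bsondump run):

```shell
# Create the target directory in HDFS, then copy the JSON dump into it.
hdfs dfs -mkdir -p mongo
hdfs dfs -copyFromLocal /path/to/file/fish.json mongo
```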
