Running a Hadoop job remotely

lsmepo6l · posted 2021-06-04 · in Hadoop

I am trying to run a MapReduce job from outside the cluster.
For example: the Hadoop cluster runs on Linux machines, and we have a web application running on a Windows machine. We want to launch Hadoop jobs from this remote web application, then retrieve the Hadoop output directory and display it as a graph.
We wrote the following code:

  Configuration conf = new Configuration();
  // Set the cluster addresses BEFORE constructing the Job:
  // new Job(conf) takes a copy of the Configuration, so any
  // properties set afterwards are ignored by the job.
  conf.set("mapred.job.tracker", "192.168.56.101:54311");
  conf.set("fs.default.name", "hdfs://192.168.56.101:54310");

  Job job = new Job(conf);
  job.setJarByClass(Analysis.class);
  job.setMapperClass(Map.class);
  job.setReducerClass(Reduce.class);
  job.setInputFormatClass(CustomFileInputFormat.class);
  job.setOutputFormatClass(TextOutputFormat.class);
  FileInputFormat.addInputPath(job, new Path(args[0]));
  FileOutputFormat.setOutputPath(job, new Path(args[1]));
  job.setMapOutputKeyClass(Text.class);
  job.setMapOutputValueClass(IntWritable.class);
  job.setOutputKeyClass(Text.class);
  job.setOutputValueClass(IntWritable.class);
  job.waitForCompletion(true);

This is the error we get. The error stays the same even when we shut down the Hadoop 1.1.2 cluster.

  14/03/07 00:23:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
  14/03/07 00:23:37 ERROR security.UserGroupInformation: PriviledgedActionException as:user cause:java.io.IOException: Failed to set permissions of path: \tmp\hadoop-user\mapred\staging\user818037780\.staging to 0700
  Exception in thread "main" java.io.IOException: Failed to set permissions of path: \tmp\hadoop-user\mapred\staging\user818037780\.staging to 0700
  at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:691)
  at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:664)
  at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:514)
  at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:349)
  at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:193)
  at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:126)
  at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:942)
  at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:396)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
  at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
  at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
  at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
  at LineCounter.main(LineCounter.java:86)
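As context for the trace above: this is a well-known symptom of submitting from Windows with Hadoop 1.x. The job client stages files under a local path (`\tmp\hadoop-user\mapred\staging\...`), and `FileUtil.checkReturnValue` throws because POSIX-style permissions cannot be set on NTFS. A workaround commonly circulated for Hadoop 1.x is to rebuild `hadoop-core` with a patched `checkReturnValue` that warns instead of failing. A sketch of that patch (a local hack, not an official fix):

```
// Patched org.apache.hadoop.fs.FileUtil.checkReturnValue for Hadoop 1.x
// on Windows: log a warning when a chmod-style call on the local staging
// directory fails, instead of throwing IOException. Requires recompiling
// hadoop-core with this change.
private static void checkReturnValue(boolean rv, File p, FsPermission permission)
        throws IOException {
    if (!rv) {
        LOG.warn("Failed to set permissions of path: " + p
                + " to " + String.format("%04o", permission.toShort()));
    }
}
```

Running the client under Cygwin, or moving to a Hadoop release with native Windows support, avoids having to patch at all.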

5w9g7ksd1#

When running from a remote system, you should run as the remote user. You can do this in your main class as follows:

  public static void main(String[] args) {
      UserGroupInformation ugi =
              UserGroupInformation.createRemoteUser("root");
      try {
          ugi.doAs(new PrivilegedExceptionAction<Void>() {
              public Void run() throws Exception {
                  Configuration conf = new Configuration();
                  conf.set("hadoop.job.ugi", "root");
                  Job job = new Job(conf);
                  // write your remaining piece of code here.
                  return null;
              }
          });
      } catch (Exception e) {
          e.printStackTrace();
      }
  }

Also, when submitting a MapReduce job, the client has to copy your Java classes and their dependent JARs to the Hadoop cluster, where the job actually executes.
So you need to build a runnable JAR of your code (with main class Analysis in this case) that lists all dependent JAR files in its manifest classpath, then run it from the command line:

  java -jar job-jar-with-dependencies.jar arguments

Hope this helps!

