Unable to submit concurrent Hadoop jobs

wh6knrhe · posted 2021-05-31 · in Hadoop

I am running Hadoop 2.7 on my local machine, along with HBase 1.4 and Phoenix 4.15. I have written an application that submits MapReduce jobs via Phoenix to delete data from HBase. Each job is submitted by a ThreadPoolExecutor and looks like this:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;
    import org.apache.phoenix.mapreduce.util.PhoenixMapReduceUtil;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class MRDeleteTask extends Task {

        private final Logger LOGGER = LoggerFactory.getLogger(MRDeleteTask.class);
        private String query;

        public MRDeleteTask(int id, String q) {
            this.setId(id);
            this.query = q;
        }

        @Override
        public void run() {
            LOGGER.info("Running Task: " + getId());
            try {
                Configuration configuration = HBaseConfiguration.create();
                Job job = Job.getInstance(configuration, "phoenix-mr-job-" + getId());
                LOGGER.info("mapper input: " + this.query);
                PhoenixMapReduceUtil.setInput(job, DeleteMR.PhoenixDBWritable.class, "Table", this.query);
                job.setMapperClass(DeleteMR.DeleteMapper.class);
                job.setJarByClass(DeleteMR.class);
                job.setNumReduceTasks(0);
                job.setOutputFormatClass(NullOutputFormat.class);
                job.setOutputKeyClass(ImmutableBytesWritable.class);
                job.setOutputValueClass(Writable.class);
                TableMapReduceUtil.addDependencyJars(job);
                boolean result = job.waitForCompletion(true);
            } catch (Exception e) {
                LOGGER.info(e.getMessage());
            }
        }
    }

If the ThreadPoolExecutor has only one thread, everything works fine. If multiple such Hadoop jobs are submitted concurrently, nothing happens. The logs show errors like:

    4439 [pool-1-thread-2] INFO MRDeleteTask - java.util.concurrent.ExecutionException: java.io.IOException: Unable to rename file: [/tmp/hadoop-user/mapred/local/1595274269610_tmp/tmp_phoenix-4.15.0-HBase-1.4-client.jar] to [/tmp/hadoop-user/mapred/local/1595274269610_tmp/phoenix-4.15.0-HBase-1.4-client.jar]
    4439 [pool-1-thread-1] INFO MRDeleteTask - java.util.concurrent.ExecutionException: ExitCodeException exitCode=1: chmod: /private/tmp/hadoop-user/mapred/local/1595274269610_tmp/phoenix-4.15.0-HBase-1.4-client.jar: No such file or directory

The tasks are submitted with ThreadPoolExecutor.submit(), and their status is checked through the returned Future via future.isDone().
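For concreteness, the submit-and-poll pattern described above can be sketched without any Hadoop dependencies; the Runnable here is a hypothetical stand-in for MRDeleteTask:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SubmitSketch {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<Future<?>> futures = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            final int id = i;
            // Hypothetical stand-in for MRDeleteTask.run()
            futures.add(pool.submit(() -> System.out.println("Running Task: " + id)));
        }
        // Poll isDone() on each returned Future, as the question describes
        for (Future<?> f : futures) {
            while (!f.isDone()) {
                Thread.sleep(10);
            }
        }
        pool.shutdown();
        System.out.println("all done");
    }
}
```

Note that isDone() returns true whether the task succeeded or failed; since MRDeleteTask catches and logs its own exceptions, the rename/chmod failures only surface in the logger output, not through the Future.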


nom7f22z1#

The jobs were not being submitted to YARN; they were running locally from IntelliJ. Adding the following to the job configuration solved the problem:

    conf.set("mapreduce.framework.name", "yarn");
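In context, the property has to be set on the Configuration before the Job is created. A minimal sketch, assuming the same Hadoop 2.7 APIs used in the question:

```java
// Inside MRDeleteTask.run(), before Job.getInstance(...):
Configuration configuration = HBaseConfiguration.create();
// Run against the cluster's ResourceManager instead of the in-process
// LocalJobRunner. The local runner stages jars under a shared
// /tmp/hadoop-<user>/mapred/local directory, so concurrent jobs in one
// JVM race on the same files -- hence the rename/chmod failures above.
configuration.set("mapreduce.framework.name", "yarn");
Job job = Job.getInstance(configuration, "phoenix-mr-job-" + getId());
```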
