hadoop jar command error

Posted by hmtdttj4 on 2021-06-03 in Hadoop

I get the following error when running a hadoop jar command against files on HDFS:

    # hadoop jar WordCountNew.jar WordCountNew /MRInput57/Input-Big.txt /MROutput57
    15/11/06 19:46:32 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    15/11/06 19:46:32 INFO mapred.JobClient: Cleaning up the staging area hdfs://localhost:8020/var/lib/hadoop-0.20/cache/mapred/mapred/staging/root/.staging/job_201511061734_0003
    15/11/06 19:46:32 ERROR security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory /MRInput57/Input-Big.txt already exists
    Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory /MRInput57/Input-Big.txt already exists
        at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:132)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:921)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:882)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:882)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:526)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:556)
        at MapReduce.WordCountNew.main(WordCountNew.java:114)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:197)

My driver class is as follows:

    public static void main(String[] args) throws Exception {
        // Configuration details w.r.t. the job and jar file
        Configuration conf = new Configuration();
        Job job = new Job(conf, "WORDCOUNTJOB");
        // Setting the driver class
        job.setJarByClass(MapReduceWordCount.class);
        // Setting the Mapper class
        job.setMapperClass(TokenizerMapper.class);
        // Setting the Combiner class
        job.setCombinerClass(IntSumReducer.class);
        // Setting the Reducer class
        job.setReducerClass(IntSumReducer.class);
        // Setting the output key class
        job.setOutputKeyClass(Text.class);
        // Setting the output value class
        job.setOutputValueClass(IntWritable.class);
        // Adding the input path
        FileInputFormat.addInputPath(job, new Path(args[0]));
        // Setting the output path
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Exit with 0 on success, 1 on failure
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

Can anyone correct the problem in my code?
Regards,
Pranav

rsaldnfx1#

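You need to check whether the output directory already exists and, if it does, delete it. MapReduce can't (or won't) write files to a directory that already exists. It needs to create the directory itself in order to be sure.
Add the following: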

    // Remove the output directory if it is left over from an earlier run
    Path outPath = new Path(args[1]);
    FileSystem dfs = FileSystem.get(outPath.toUri(), conf);
    if (dfs.exists(outPath)) {
        dfs.delete(outPath, true); // true = delete recursively
    }
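
One caveat before recursively deleting `args[1]`: the error in the question names `/MRInput57/Input-Big.txt`, which is the input file on the command line, as the output directory. That suggests the arguments reaching `main` may be shifted by one. This can happen, for instance, when the jar's manifest declares a Main-Class, in which case `hadoop jar` passes the class name `WordCountNew` through as `args[0]`. If that is what is going on here, deleting `args[1]` would remove the input. Below is a minimal sketch of a guard in plain Java; `JobArgs` and `resolve` are hypothetical names, not part of Hadoop, and the exactly-two-arguments rule is an assumption about this particular driver.

```java
import java.util.Arrays;

// Hypothetical helper (not a Hadoop class): validates the driver's arguments
// before any path is used as an output directory or deleted.
final class JobArgs {
    private JobArgs() {}

    /** Returns {inputPath, outputPath}; throws if the arguments look wrong. */
    static String[] resolve(String[] args) {
        // Assumption: this driver takes exactly an input path and an output path.
        // A third argument usually means the class name was passed through.
        if (args.length != 2) {
            throw new IllegalArgumentException(
                "Usage: <input path> <output path>, got " + Arrays.toString(args));
        }
        String input = args[0];
        String output = args[1];
        // Refuse to treat the input as the output, since the output may be deleted.
        if (input.equals(output)) {
            throw new IllegalArgumentException(
                "Input and output paths must differ: " + input);
        }
        return new String[] {input, output};
    }
}
```

In the driver this could be called as `String[] io = JobArgs.resolve(args);`, using `io[0]` for the input path and `io[1]` for the output path, so that a shifted argument list fails fast instead of pointing the job (or a delete) at the wrong path.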

cfh9epnr2#

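The output directory that the job is trying to create to store its output already exists. Either delete the previous directory of the same name, or change the name of the output directory.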
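
Changing the output directory name on every run can be automated by appending a timestamp to a base path. The following is only a sketch in plain Java: `OutputDirs` and `uniqueOutputDir` are hypothetical names, the `yyyyMMdd-HHmmss` suffix format is an arbitrary choice, and the returned string would still have to be passed to `new Path(...)` in the driver.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper (not a Hadoop class): builds a per-run output directory name.
final class OutputDirs {
    private OutputDirs() {}

    /** Returns base + "-" + timestamp, e.g. a base of /MROutput57 gets a dated suffix. */
    static String uniqueOutputDir(String base, Date now) {
        // Drop a single trailing slash so the suffix attaches to the directory name.
        String trimmed = (base.length() > 1 && base.endsWith("/"))
                ? base.substring(0, base.length() - 1)
                : base;
        // SimpleDateFormat is not thread-safe, so create one per call.
        SimpleDateFormat stamp = new SimpleDateFormat("yyyyMMdd-HHmmss", Locale.ROOT);
        return trimmed + "-" + stamp.format(now);
    }
}
```

A driver could then use `OutputDirs.uniqueOutputDir(args[1], new Date())` as its output path. Two runs started within the same second would still collide, so this reduces the chance of the "already exists" error but does not rule it out.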

xkftehaa3#

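As others have noted, you get this error because the output directory already exists, most likely because you have tried to run this job before.
You can delete the existing output directory right before running the job, i.e.: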

    # hadoop fs -rm -r /MROutput57 && \
      hadoop jar WordCountNew.jar WordCountNew /MRInput57/Input-Big.txt /MROutput57

qyzbxkaa4#

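The output directory must not exist before the program runs. Either delete the existing directory, supply a new directory, or delete the output directory from within the program.
I prefer to delete the output directory from the command prompt before launching the program from the command prompt.
From the command prompt:

    hdfs dfs -rm -r <your_output_directory_HDFS_URL>

From Java:

Chris Gerken's code is good enough.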
