Step failed with exitCode — Amazon EMR, Hadoop, S3DistCp

mzillmmw · posted 2021-06-02 in Hadoop
Answers (1) | Views (464)

I am trying to create a "step" that gathers many small files into a single file, so that I can split the data by day. The problem is that when I try to run it as a step, it fails.
Running this command directly works for me:

  hadoop distcp s3n://buket-name/output-files-hive/* s3n://buket-name/files-hive/test

But as soon as I add a `groupBy` or `srcPattern` option, nothing works at all.
After creating the "step" in the Amazon EMR console, it gives me errors every time. Following the documentation, this is the command:

  aws emr add-steps --cluster-id j-XXXXXXX --steps Name="S3DistCp step",Jar="command-runner.jar",Args=["spark-submit","--src=s3n://buket-name/output-files-hive/output-files-hive/*","--dest=s3n://buket-name/output-files-hive/files-hive/test/"]

Error:

  2016-07-13T15:06:27.677Z INFO Ensure step 3 jar file command-runner.jar
  2016-07-13T15:06:27.678Z INFO StepRunner: Created Runner for step 3
  INFO startExec 'hadoop jar /var/lib/aws/emr/step-runner/hadoop-jars/command-runner.jar spark-submit --src=s3n://buket-name/output-files-hive/* --dest=s3n://buket-name/files-hive/test/'
  INFO Environment:
  TERM=linux
  CONSOLETYPE=serial
  SHLVL=5
  JAVA_HOME=/etc/alternatives/jre
  HADOOP_IDENT_STRING=hadoop
  LANGSH_SOURCED=1
  XFILESEARCHPATH=/usr/dt/app-defaults/%L/Dt
  HADOOP_ROOT_LOGGER=INFO,DRFA
  AWS_CLOUDWATCH_HOME=/opt/aws/apitools/mon
  UPSTART_JOB=rc
  MAIL=/var/spool/mail/hadoop
  EC2_AMITOOL_HOME=/opt/aws/amitools/ec2
  PWD=/
  HOSTNAME=ip-172-31-21-173
  LESS_TERMCAP_se=[0m
  LOGNAME=hadoop
  UPSTART_INSTANCE=
  AWS_PATH=/opt/aws
  LESS_TERMCAP_mb=[01;31m
  _=/etc/alternatives/jre/bin/java
  LESS_TERMCAP_me=[0m
  NLSPATH=/usr/dt/lib/nls/msg/%L/%N.cat
  LESS_TERMCAP_md=[01;38;5;208m
  runlevel=3
  AWS_AUTO_SCALING_HOME=/opt/aws/apitools/as
  UPSTART_EVENTS=runlevel
  HISTSIZE=1000
  previous=N
  HADOOP_LOGFILE=syslog
  PATH=/sbin:/usr/sbin:/bin:/usr/bin:/usr/local/sbin:/opt/aws/bin
  EC2_HOME=/opt/aws/apitools/ec2
  HADOOP_LOG_DIR=/mnt/var/log/hadoop/steps/s-2SKUUYYPQ4KKK
  LESS_TERMCAP_ue=[0m
  AWS_ELB_HOME=/opt/aws/apitools/elb
  RUNLEVEL=3
  USER=hadoop
  HADOOP_CLIENT_OPTS=-Djava.io.tmpdir=/mnt/var/lib/hadoop/steps/s-2SKUUYYPQ4KKK/tmp
  PREVLEVEL=N
  HOME=/home/hadoop
  HISTCONTROL=ignoredups
  LESSOPEN=||/usr/bin/lesspipe.sh %s
  AWS_DEFAULT_REGION=eu-west-1
  LANG=en_US.UTF-8
  LESS_TERMCAP_us=[04;38;5;111m
  INFO redirectOutput to /mnt/var/log/hadoop/steps/s-2SKUUYYPQ4KKK/stdout
  INFO redirectError to /mnt/var/log/hadoop/steps/s-2SKUUYYPQ4KKK/stderr
  INFO Working dir /mnt/var/lib/hadoop/steps/s-2SKUUYYPQ4KKK
  INFO ProcessRunner started child process 7836 :
  hadoop 7836 2229 0 15:06 ? 00:00:00 bash /usr/lib/hadoop/bin/hadoop jar /var/lib/aws/emr/step-runner/hadoop-jars/command-runner.jar spark-submit --src=s3n://buket-name/output-files-hive/* --dest=s3n://buket-name/files-hive/test/
  2016-07-13T15:06:31.724Z INFO HadoopJarStepRunner.Runner: startRun() called for s-2SKUUYYPQ4KKK Child Pid: 7836
  INFO Synchronously wait child process to complete : hadoop jar /var/lib/aws/emr/step-runner/hadoop-...
  INFO waitProcessCompletion ended with exit code 1 : hadoop jar /var/lib/aws/emr/step-runner/hadoop-...
  INFO total process run time: 2 seconds
  2016-07-13T15:06:31.991Z INFO Step created jobs:
  2016-07-13T15:06:31.992Z WARN Step failed with exitCode 1 and took 2 seconds
vecaoik11:

In newer versions of Amazon EMR you no longer need to reference the S3DistCp .jar file; you can invoke `s3-dist-cp` through `command-runner.jar` instead:

  aws emr add-steps --cluster-id j-XXXXXX --steps Name="S3DistCp step V3",Jar="command-runner.jar",Args=["s3-dist-cp","--src=s3n://buket-name/output-files-hive/","--dest=s3n://buket-name/files-hive/test/"]
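To actually merge the many small files into larger ones (the original goal), `s3-dist-cp` also accepts `--groupBy` and `--targetSize`. A sketch, where the regex and the 128 MB target size are illustrative assumptions, not values from the original post:

```shell
# Sketch only: the --groupBy regex and --targetSize value are assumptions.
# Keys matching the regex are grouped; each group is concatenated into one
# output file named after the regex capture group (here, a date like 2021-06-02).
aws emr add-steps --cluster-id j-XXXXXX --steps \
  Name="S3DistCp groupBy step",Jar="command-runner.jar",\
Args=["s3-dist-cp","--src=s3n://buket-name/output-files-hive/","--dest=s3n://buket-name/files-hive/test/","--groupBy=.*(\\d{4}-\\d{2}-\\d{2}).*","--targetSize=128"]
```

Note that `--groupBy` turns the copy into a concatenation: only files matching the regex are copied, and all files in a group are merged into output files of roughly `--targetSize` MB each.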
