在学习sqoop的过程中,我执行了一个sqoop命令来获取cloudera的dh中的所有mysql数据库,它会正确地返回所有可用的数据库。问题是,如果我在oozie工作流中运行与作业相同的命令,它总是会失败。
作业属性
nameNode=hdfs://quickstart.cloudera:8020
resourceManager=0.0.0.0:8032
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/oozie/pig_demo
工作流.xml
<workflow-app name="foo-wf" xmlns="uri:oozie:workflow:0.2">
<start to="sqoop-36c5"/>
<action name="sqoop-36c5">
<sqoop xmlns="uri:oozie:sqoop-action:0.2">
<job-tracker>${resourceManager}</job-tracker>
<name-node>${nameNode}</name-node>
<command>list-databases --m 1
--connect "jdbc:mysql://quickstart.cloudera:3306"
--username retail_dba
--password cloudera</command>
</sqoop>
<ok to="finish"/>
<error to="errorHalt"/>
</action>
<kill name="errorHalt">
<message>Input unavailable,error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="finish"/>
</workflow-app>
以下是生成的日志
2019-01-24 09:52:09,352 INFO [Thread-69] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/history/done_intermediate/cloudera/job_1548311302916_0001-1548312666604-cloudera-oozie%3Alauncher%3AT%3Dsqoop%3AW%3Dfoo%2Dwf%3AA%3Dsqoop%2D36c5%3AID%3D00-1548312729060-1-0-SUCCEEDED-root.cloudera-1548312703464.jhist_tmp to hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/history/done_intermediate/cloudera/job_1548311302916_0001-1548312666604-cloudera-oozie%3Alauncher%3AT%3Dsqoop%3AW%3Dfoo%2Dwf%3AA%3Dsqoop%2D36c5%3AID%3D00-1548312729060-1-0-SUCCEEDED-root.cloudera-1548312703464.jhist
2019-01-24 09:52:09,352 INFO [Thread-69] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopped JobHistoryEventHandler. super.stop()
2019-01-24 09:52:09,353 INFO [Thread-69] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING attempt_1548311302916_0001_m_000000_0
2019-01-24 09:52:09,437 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1548311302916_0001_m_000000_0 TaskAttempt Transitioned from SUCCESS_FINISHING_CONTAINER to SUCCEEDED
2019-01-24 09:52:09,441 INFO [Thread-69] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Setting job diagnostics to
2019-01-24 09:52:09,442 INFO [Thread-69] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: History url is http://quickstart.cloudera:19888/jobhistory/job/job_1548311302916_0001
2019-01-24 09:52:09,480 INFO [Thread-69] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Waiting for application to be successfully unregistered.
2019-01-24 09:52:10,483 INFO [Thread-69] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Final Stats: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
2019-01-24 09:52:10,488 INFO [Thread-69] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Deleting staging directory hdfs://quickstart.cloudera:8020 /tmp/hadoop-yarn/staging/cloudera/.staging/job_1548311302916_0001
2019-01-24 09:52:10,505 INFO [Thread-69] org.apache.hadoop.ipc.Server: Stopping server on 34049
2019-01-24 09:52:10,517 INFO [IPC Server listener on 34049] org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 34049
2019-01-24 09:52:10,523 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2019-01-24 09:52:10,524 INFO [TaskHeartbeatHandler PingChecker] org.apache.hadoop.mapreduce.v2.app.TaskHeartbeatHandler: TaskHeartbeatHandler thread interrupted
2019-01-24 09:52:10,531 INFO [Ping Checker] org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: TaskAttemptFinishingMonitor thread interrupted
2019-01-24 09:52:10,556 INFO [Thread-69] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Job end notification started for jobID : job_1548311302916_0001
2019-01-24 09:52:10,560 INFO [Thread-69] org.mortbay.log: Job end notification attempts left 0
2019-01-24 09:52:10,560 INFO [Thread-69] org.mortbay.log: Job end notification trying http://quickstart.cloudera:11000/oozie/callback?id=0000000-190124093049946-oozie-oozi-W@sqoop-36c5&status=SUCCEEDED
2019-01-24 09:52:10,590 INFO [Thread-69] org.mortbay.log: Job end notification to http://quickstart.cloudera:11000/oozie/callback?id=0000000-190124093049946-oozie-oozi-W@sqoop-36c5&status=SUCCEEDED succeeded
2019-01-24 09:52:10,590 INFO [Thread-69] org.mortbay.log: Job end notification succeeded for job_1548311302916_0001
2019-01-24 09:52:15,605 INFO [Thread-69] org.apache.hadoop.ipc.Server: Stopping server on 44688
2019-01-24 09:52:15,613 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2019-01-24 09:52:15,617 INFO [IPC Server listener on 44688] org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 44688
2019-01-24 09:52:15,637 INFO [Thread-69] org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:0
sqoop作业成功,但taskattempt从success\u finishing\u container转换到succeeded的操作被终止,这是为什么呢。
[cloudera@quickstart ~]$ yarn version
Hadoop 2.6.0-cdh5.13.0
Subversion http://github.com/cloudera/hadoop -r 42e8860b182e55321bd5f5605264da4adc8882be
Compiled by jenkins on 2017-10-04T18:08Z
Compiled with protoc 2.5.0
From source with checksum 5e84c185f8a22158e2b0e4b8f85311
This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.13.0.jar
1条答案
按热度按时间nr9pn0ug1#
首先,您正在运行一个listing数据库命令。不知道为什么要把它放在workflow.xml文件中。
我经历过clouderavm的行为不太一致,因为我们通常使用有限的内存来运行它,或者没有分配容器,或者容器被杀死。重新启动整个虚拟机也无济于事。
如果您再次获得一个带有该映像的cloudera vm的新示例并尝试运行它,它可能会解决您的问题。过去对我们有用。