Hadoop container fails even though it is 100% complete

li9yvcax  posted on 2021-05-27  in  Hadoop

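I have set up a small cluster with Hadoop 2.7, HBase 0.98, and Nutch 2.3.1. I wrote a custom job that first combines the documents of the same domain. After that, each URL of the domain is taken from a cache (i.e. a list), the corresponding key is used to fetch the object via datastore.get(url_key), and, once the score has been updated, the object is written out via context.write.
The job should finish once all documents have been processed, but what I observe is that every attempt fails with a timeout while its progress is shown as 100% complete. Here is the log: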

attempt_1549963404554_0110_r_000001_1   100.00  FAILED  reduce > reduce node2:8042  logs    Thu Feb 21 20:50:43 +0500 2019  Fri Feb 22 02:11:44 +0500 2019  5hrs, 21mins, 0sec  AttemptID:attempt_1549963404554_0110_r_000001_1 Timed out after 1800 secs Container killed by the ApplicationMaster. Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143
attempt_1549963404554_0110_r_000001_3   100.00  FAILED  reduce > reduce node1:8042  logs    Fri Feb 22 04:39:08 +0500 2019  Fri Feb 22 07:25:44 +0500 2019  2hrs, 46mins, 35sec AttemptID:attempt_1549963404554_0110_r_000001_3 Timed out after 1800 secs Container killed by the ApplicationMaster. Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143
attempt_1549963404554_0110_r_000002_0   100.00  FAILED  reduce > reduce node3:8042  logs    Thu Feb 21 12:38:45 +0500 2019  Thu Feb 21 22:50:13 +0500 2019  10hrs, 11mins, 28sec    AttemptID:attempt_1549963404554_0110_r_000002_0 Timed out after 1800 secs Container killed by the ApplicationMaster. Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143

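Why does this happen? When an attempt is 100.00% complete, it should be marked as successful. Unfortunately, in my case there is no error information other than the timeout. How can I debug this problem? My reducer is posted in another question: Apache Nutch 2.3.1 map-reduce timeout while updating score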
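
One possible explanation, offered as an assumption rather than a confirmed diagnosis: Hadoop terminates a task attempt that neither reads input, writes output, nor updates its status for mapreduce.task.timeout milliseconds (the "1800 secs" in the log suggests a value of 1800000). If the reducer spends a long time on slow datastore.get calls for the last keys, the progress figure may already read 100% while no activity is being reported. The sketch below shows the general idea of signalling liveness from inside a long loop. The class HeartbeatLoop, its method process, and the parameter intervalMillis are hypothetical names for illustration and are not part of Hadoop or Nutch.

```java
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not a Hadoop or Nutch class): runs per-item work and
 * invokes a heartbeat callback whenever at least intervalMillis have elapsed
 * since the previous heartbeat.
 */
final class HeartbeatLoop {

    private HeartbeatLoop() {
    }

    /**
     * Processes every item and returns how many heartbeats were sent.
     * The elapsed time is checked after each item, so a single item that
     * blocks for longer than the task timeout is not covered by this sketch.
     */
    static <T> int process(List<T> items, Consumer<T> work,
                           Runnable heartbeat, long intervalMillis) {
        long lastBeat = System.currentTimeMillis();
        int beats = 0;
        for (T item : items) {
            work.accept(item); // the slow step, e.g. a datastore lookup
            long now = System.currentTimeMillis();
            if (now - lastBeat >= intervalMillis) {
                heartbeat.run(); // tell the framework the task is alive
                beats++;
                lastBeat = now;
            }
        }
        return beats;
    }
}
```

Inside a reducer, the heartbeat argument could be context::progress, since the reducer context implements Hadoop's Progressable interface, with an interval well below the configured timeout (for example 60000). Raising mapreduce.task.timeout is the other obvious knob, but it only hides slow work instead of reporting it.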


oyjwcjzk1#

I noticed that the execution times in the three logs above differ considerably. Please look into the job you are executing.
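
To put numbers on that difference, here is a minimal sketch that converts the elapsed-time column of the attempt table into seconds. The class name AttemptDurations and the method toSeconds are hypothetical, and the pattern assumes the "Nhrs, Nmins, Nsec" layout seen in the log above; treating the hours and minutes parts as optional is an assumption, since only the full form appears in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for comparing elapsed times copied from the attempt table. */
final class AttemptDurations {

    // Matches e.g. "5hrs, 21mins, 0sec"; hours and minutes are assumed optional.
    private static final Pattern ELAPSED =
            Pattern.compile("(?:(\\d+)hrs, )?(?:(\\d+)mins, )?(\\d+)sec");

    private AttemptDurations() {
    }

    /** Converts an elapsed-time string into a number of seconds. */
    static long toSeconds(String elapsed) {
        Matcher m = ELAPSED.matcher(elapsed.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized duration: " + elapsed);
        }
        long hours = m.group(1) == null ? 0 : Long.parseLong(m.group(1));
        long minutes = m.group(2) == null ? 0 : Long.parseLong(m.group(2));
        long seconds = Long.parseLong(m.group(3));
        return hours * 3600 + minutes * 60 + seconds;
    }
}
```

For the three attempts in the question this gives 19260, 9995, and 36688 seconds, so the slowest attempt ran roughly 3.7 times as long as the fastest before it was killed. A spread like that may point to uneven key distribution across reducers or to slow datastore lookups on some nodes, but the log alone does not confirm either.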
