Why does a Mesos slave keep being asked to kill a task for more than 2 hours after the slave is restarted?
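I am running a Mesos cluster, currently with three masters and four slaves, in a cloud environment.
Mesos version: 0.28
Marathon version: 0.15.2
I have found that if I reboot a slave that has Docker tasks running on it, then after the reboot those tasks remain in the staging state on that slave for more than 2 hours. Only after those 2 hours is Marathon able to launch the tasks on another slave.
Checking the slave log, I can see it repeating "Asked to kill task" and "Ignoring kill task" for about 2 hours.
Does anyone know why Mesos needs more than 2 hours to kill tasks that are already dead?
Logs right after the reboot: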
May 11 10:12:18 euca-10-254-234-236 mesos-slave[824]: I0511 10:12:18.199795 964 slave.cpp:1891] Asked to kill task project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000
May 11 10:12:18 euca-10-254-234-236 mesos-slave[824]: W0511 10:12:18.199831 964 slave.cpp:2018] Ignoring kill task project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b because the executor 'project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000 is terminating/terminated
May 11 10:12:18 euca-10-254-234-236 mesos-slave[824]: I0511 10:12:18.199872 964 slave.cpp:1891] Asked to kill task docker-registry.d1c20255-173f-11e6-b66e-d00dacb0c46b of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000
Logs 2 hours later:
I0511 12:15:48.200348 963 slave.cpp:1891] Asked to kill task project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000
W0511 12:15:48.200409 963 slave.cpp:2018] Ignoring kill task project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b because the executor 'project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000 is terminating/terminated
I0511 12:15:48.200429 963 slave.cpp:1891] Asked to kill task docker-registry.d1c20255-173f-11e6-b66e-d00dacb0c46b of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000
W0511 12:15:48.200438 963 slave.cpp:2018] Ignoring kill task docker-registry.d1c20255-173f-11e6-b66e-d00dacb0c46b because the executor 'docker-registry.d1c20255-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000 is terminating/terminated
I0511 12:15:51.485391 964 http.cpp:190] HTTP GET for /slave(1)/state from 10.145.150.124:59955 with User-Agent='Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36'
I0511 12:15:51.509351 965 http.cpp:190] HTTP GET for /slave(1)/state from 10.145.150.124:59955 with User-Agent='Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36'
W0511 12:15:51.656379 960 slave.cpp:4979] Failed to get resource statistics for executor 'project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000: Unknown container: b2f5b385-444b-4174-9a1c-8ccd2d3184dc
W0511 12:15:51.656409 960 slave.cpp:4979] Failed to get resource statistics for executor 'docker-registry.d1c20255-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000: Unknown container: 1ab25a1b-79fe-430b-9751-330586a1fbef
I0511 12:15:51.663321 965 http.cpp:190] HTTP GET for /slave(1)/state from 10.145.150.124:59560 with User-Agent='Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36'
I0511 12:15:51.671294 965 http.cpp:190] HTTP GET for /slave(1)/state from 10.145.150.124:59560 with User-Agent='Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36'
W0511 12:15:52.156903 962 slave.cpp:4979] Failed to get resource statistics for executor 'project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000: Unknown container: b2f5b385-444b-4174-9a1c-8ccd2d3184dc
W0511 12:15:52.156941 962 slave.cpp:4979] Failed to get resource statistics for executor 'docker-registry.d1c20255-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000: Unknown container: 1ab25a1b-79fe-430b-9751-330586a1fbef
E0511 12:15:52.247448 962 slave.cpp:3773] Container '1ab25a1b-79fe-430b-9751-330586a1fbef' for executor 'docker-registry.d1c20255-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000 failed to start: future discarded
E0511 12:15:52.247612 962 slave.cpp:3773] Container 'b2f5b385-444b-4174-9a1c-8ccd2d3184dc' for executor 'project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000 failed to start: future discarded
W0511 12:15:52.247642 962 composing.cpp:541] Container '1ab25a1b-79fe-430b-9751-330586a1fbef' is already destroyed
W0511 12:15:52.247660 962 composing.cpp:541] Container 'b2f5b385-444b-4174-9a1c-8ccd2d3184dc' is already destroyed
E0511 12:15:52.247704 962 slave.cpp:3870] Termination of executor 'docker-registry.d1c20255-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000 failed: Unknown container: 1ab25a1b-79fe-430b-9751-330586a1fbef
I0511 12:15:52.248374 962 slave.cpp:3002] Handling status update TASK_FAILED (UUID: b399e8ce-832c-4b06-a15f-3c155536b872) for task docker-registry.d1c20255-173f-11e6-b66e-d00dacb0c46b of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000 from @0.0.0.0:0
E0511 12:15:52.248458 962 slave.cpp:3870] Termination of executor 'project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000 failed: Unknown container: b2f5b385-444b-4174-9a1c-8ccd2d3184dc
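To put a number on the "about 2 hours" window, the repeated kill requests in the slave log can be measured per task. The following is only a minimal sketch, not part of Mesos or Marathon: the function name `kill_windows`, the `SAMPLE` list, and the `year` default are hypothetical. The year is an assumption because glog timestamps carry only month and day. The two sample lines are copied from the excerpts above.

```python
import re
from datetime import datetime

# Matches glog-style lines such as:
#   I0511 10:12:18.199795 964 slave.cpp:1891] Asked to kill task <id> of framework <id>
# An optional syslog prefix in front of the glog header is tolerated.
KILL_RE = re.compile(
    r"[IWEF](\d{4} \d{2}:\d{2}:\d{2}\.\d{6})\s+\d+\s+\S+\]\s+"
    r"Asked to kill task (\S+) of framework"
)


def kill_windows(lines, year=2016):
    """Return {task_id: (first_seen, last_seen)} for 'Asked to kill task' lines.

    `year` is an assumption: glog does not record it in the timestamp.
    """
    windows = {}
    for line in lines:
        m = KILL_RE.search(line)
        if not m:
            continue
        ts = datetime.strptime(f"{year}{m.group(1)}", "%Y%m%d %H:%M:%S.%f")
        task = m.group(2)
        # Keep the first timestamp seen, always advance the last one.
        first, _ = windows.get(task, (ts, ts))
        windows[task] = (first, ts)
    return windows


# Two lines taken from the log excerpts in this post.
SAMPLE = [
    "May 11 10:12:18 euca-10-254-234-236 mesos-slave[824]: I0511 10:12:18.199795 964 "
    "slave.cpp:1891] Asked to kill task "
    "project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b "
    "of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000",
    "I0511 12:15:48.200348 963 slave.cpp:1891] Asked to kill task "
    "project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b "
    "of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000",
]

if __name__ == "__main__":
    for task, (first, last) in kill_windows(SAMPLE).items():
        print(task, last - first)
```

Run against the full slave log (for example by passing an open file object as `lines`), this would show how long each task kept receiving kill requests before the `TASK_FAILED` status update was finally generated.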