Marathon application deployment stuck in "Waiting" state

Posted by nxagd54h on 2021-06-26 in Mesos

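I have a 3-node setup running Marathon, Mesos master, Mesos slave, and ZooKeeper with the HA configuration enabled. I tested a simple hello app deployment using mesos-execute, and it worked as expected.
Since everything looked fine, I connected to Marathon and deployed a simple application to test it: (echo "hello" >> /tmp/output.txt). However, the app gets stuck in the "Waiting" state.
What could be wrong with deploying using Mesos resources?
Logs from the Mesos master: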

I0904 11:23:27.064332 19769 master.cpp:2813] Received SUBSCRIBE call for framework 'marathon' at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324
I0904 11:23:27.064623 19769 master.cpp:2890] Subscribing framework marathon with checkpointing enabled and capabilities [ PARTITION_AWARE ]
I0904 11:23:27.064669 19769 master.cpp:6272] Updating info for framework cb16118a-2257-4020-a907-63aa6294e11b-0000
I0904 11:23:27.064697 19769 master.cpp:2994] Framework cb16118a-2257-4020-a907-63aa6294e11b-0000 (marathon) at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324 failed over
I0904 11:23:27.065032 19770 hierarchical.cpp:342] Activated framework cb16118a-2257-4020-a907-63aa6294e11b-0000
I0904 11:23:27.065465 19770 master.cpp:7305] Sending 3 offers to framework cb16118a-2257-4020-a907-63aa6294e11b-0000 (marathon) at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324
I0904 11:23:27.907865 19769 http.cpp:1115] HTTP GET for /files/read?_=1504517007920&jsonp=jQuery17109098185077823333_1504516979864&length=50000&offset=352538&path=%2Fmaster%2Flog from 192.168.40.1:53525 with User-Agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36'
I0904 11:23:28.916651 19768 http.cpp:1115] HTTP GET for /files/read?_=1504517008930&jsonp=jQuery17109098185077823333_1504516979865&length=50000&offset=353797&path=%2Fmaster%2Flog from 192.168.40.1:53525 with User-Agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36'
E0904 11:23:30.071293 19775 process.cpp:2450] Failed to shutdown socket with fd 39, address 192.168.40.159:58072: Transport endpoint is not connected
I0904 11:23:30.073277 19768 master.cpp:1430] Framework cb16118a-2257-4020-a907-63aa6294e11b-0000 (marathon) at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324 disconnected
I0904 11:23:30.073307 19768 master.cpp:3160] Deactivating framework cb16118a-2257-4020-a907-63aa6294e11b-0000 (marathon) at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324
I0904 11:23:30.073485 19768 master.cpp:3137] Disconnecting framework cb16118a-2257-4020-a907-63aa6294e11b-0000 (marathon) at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324
I0904 11:23:30.073496 19768 master.cpp:1445] Giving framework cb16118a-2257-4020-a907-63aa6294e11b-0000 (marathon) at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324 1weeks to failover
I0904 11:23:30.073519 19768 hierarchical.cpp:374] Deactivated framework cb16118a-2257-4020-a907-63aa6294e11b-0000

curl -X GET 'http://mesosphere2:8098/v2/queue?pretty' | jq .

{
  "queue": [
    {
      "count": 1,
      "delay": {
        "timeLeftSeconds": 0,
        "overdue": true
      },
      "since": "2017-09-04T13:12:42.024Z",
      "processedOffersSummary": {
        "processedOffersCount": 12,
        "unusedOffersCount": 12,
        "lastUnusedOfferAt": "2017-09-04T13:14:52.554Z",
        "rejectSummaryLastOffers": [
          {
            "reason": "UnfulfilledRole",
            "declined": 3,
            "processed": 3
          },
          {
            "reason": "UnfulfilledConstraint",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "NoCorrespondingReservationFound",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientCpus",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientMemory",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientDisk",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientGpus",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientPorts",
            "declined": 0,
            "processed": 0
          }
        ],
        "rejectSummaryLaunchAttempt": [
          {
            "reason": "UnfulfilledRole",
            "declined": 12,
            "processed": 12
          },
          {
            "reason": "UnfulfilledConstraint",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "NoCorrespondingReservationFound",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientCpus",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientMemory",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientDisk",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientGpus",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientPorts",
            "declined": 0,
            "processed": 0
          }
        ]
      },
      "app": {
        "id": "/test03",
        "acceptedResourceRoles": [
          "slave_public"
        ],
        "backoffFactor": 1.15,
        "backoffSeconds": 1,
        "container": {
          "type": "DOCKER",
          "docker": {
            "forcePullImage": false,
            "image": "laghao/hello-marathon",
            "network": "BRIDGE",
            "parameters": [],
            "portMappings": [
              {
                "containerPort": 80,
                "hostPort": 80,
                "labels": {},
                "protocol": "tcp",
                "servicePort": 10003
              }
            ],
            "privileged": false
          },
          "volumes": []
        },
        "cpus": 0.1,
        "disk": 0,
        "executor": "",
        "instances": 1,
        "labels": {},
        "maxLaunchDelaySeconds": 3600,
        "mem": 64,
        "gpus": 0,
        "portDefinitions": [
          {
            "port": 10003,
            "name": "default",
            "protocol": "tcp"
          }
        ],
        "requirePorts": false,
        "upgradeStrategy": {
          "maximumOverCapacity": 1,
          "minimumHealthCapacity": 1
        },
        "version": "2017-09-04T13:12:41.993Z",
        "versionInfo": {
          "lastScalingAt": "2017-09-04T13:12:41.993Z",
          "lastConfigChangeAt": "2017-09-04T13:12:41.993Z"
        },
        "killSelection": "YOUNGEST_FIRST",
        "unreachableStrategy": {
          "inactiveAfterSeconds": 300,
          "expungeAfterSeconds": 600
        }
      }
    }
  ]
}

Answer #1 by vom3gejh

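From the docs:
An app does not leave "Waiting": this means that Marathon does not receive "resource offers" from Mesos that allow it to start tasks of this application. The simplest failure is that there are not enough resources available in the cluster, or that another framework is hoarding all of them. You can check the Mesos UI for available resources. Note that the required resources (such as CPU, memory, and disk) must all be available on a single host.
If you do not find the solution yourself and you create a GitHub issue, please attach the output of the Mesos /state endpoint to the bug report so that we can inspect the available cluster resources.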
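The single-host requirement in the quoted documentation can be illustrated with a small sketch. The app requirements below come from the /test03 definition in the question; the agent names and their free resources are hypothetical, and the function is only a simplified model of the check, not Marathon's actual matching code.

```python
from typing import Dict, List


def hosts_that_fit(app: Dict[str, float], agents: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the hosts whose free resources cover every requirement of the app."""
    needed = {key: app.get(key, 0) for key in ("cpus", "mem", "disk")}
    return [
        host
        for host, free in agents.items()
        if all(free.get(key, 0) >= amount for key, amount in needed.items())
    ]


# Requirements taken from the /test03 app definition in the question.
app = {"cpus": 0.1, "mem": 64, "disk": 0}

# Hypothetical free resources per agent (not taken from the question).
agents = {
    "agent-a": {"cpus": 0.05, "mem": 1024, "disk": 500},  # enough memory, too little CPU
    "agent-b": {"cpus": 2.0, "mem": 32, "disk": 500},     # enough CPU, too little memory
    "agent-c": {"cpus": 1.0, "mem": 128, "disk": 500},    # covers everything on its own
}

print(hosts_that_fit(app, agents))
```

In this made-up cluster, agent-a and agent-b together have plenty of CPU and memory, but neither can run the task alone, so only agent-c qualifies.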
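In your case, the app's role requirements do not match the roles of the resources offered by the agents. You can see this from the UnfulfilledRole entries: the app sets acceptedResourceRoles to slave_public, and all 12 processed offers were declined for that reason.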
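A rough way to picture the role mismatch is sketched below. It assumes that the agents offer only unreserved resources (role "*"), which the question does not actually show, and the offer contents are invented for illustration. The filter is a simplified model of how acceptedResourceRoles narrows an offer, not Marathon's real implementation.

```python
from typing import Dict, Iterable, List

UNRESERVED = "*"  # Mesos role name for unreserved resources


def usable_resources(offer: List[Dict], accepted_roles: Iterable[str]) -> List[Dict]:
    """Keep only the offered resources whose role is in accepted_roles."""
    accepted = set(accepted_roles)
    return [res for res in offer if res.get("role", UNRESERVED) in accepted]


# Hypothetical offer made up of unreserved resources only (an assumption).
offer = [
    {"name": "cpus", "role": "*", "scalar": 4.0},
    {"name": "mem", "role": "*", "scalar": 2048.0},
]

# The app in the question accepts only the "slave_public" role.
print(usable_resources(offer, ["slave_public"]))

# An app that accepts unreserved resources could use both entries.
print(len(usable_resources(offer, ["*"])))
```

Under that assumption, either including "*" in acceptedResourceRoles or reserving agent resources for the slave_public role would let offers match. Which of the two is appropriate depends on how the cluster is meant to be partitioned.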
Marathon 1.4 introduced information about stalled deployments. You can query /v2/queue to get statistics on declined offers.
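As a minimal sketch of reading those statistics, the function below takes an already parsed /v2/queue response and lists, per queued app, the reasons that declined at least one offer. The field names are the ones visible in the response in the question; the function name and the trimmed sample are illustrative only.

```python
import json
from typing import Dict, List


def declined_reasons(queue_response: Dict) -> Dict[str, List[str]]:
    """Map each queued app id to the reasons that declined at least one offer."""
    result = {}
    for item in queue_response.get("queue", []):
        summary = item.get("processedOffersSummary", {})
        attempts = summary.get("rejectSummaryLaunchAttempt", [])
        result[item["app"]["id"]] = [
            f'{entry["reason"]} ({entry["declined"]}/{entry["processed"]})'
            for entry in attempts
            if entry["declined"] > 0
        ]
    return result


# Trimmed copy of the /v2/queue response shown in the question.
sample = json.loads("""
{
  "queue": [
    {
      "processedOffersSummary": {
        "rejectSummaryLaunchAttempt": [
          {"reason": "UnfulfilledRole", "declined": 12, "processed": 12},
          {"reason": "InsufficientCpus", "declined": 0, "processed": 0}
        ]
      },
      "app": {"id": "/test03", "acceptedResourceRoles": ["slave_public"]}
    }
  ]
}
""")

print(declined_reasons(sample))
```

To run it against a live cluster, the sample could be replaced by the JSON returned from the curl command in the question.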
