Locust workers set to "missing" immediately after the Kubernetes job starts

ippsafx7 · asked 2023-01-12 · Kubernetes

I'm running locust==2.8.6 on Python 3.10, on Kubernetes via AWS EKS. I'm running it distributed, trying to set up 1 master node and 5 worker nodes.
The master pod starts with:

command: ["locust"]
args: ["-f","$filename","--headless","--users=$clients","--spawn-rate=$hatch-rate","--run-time=$run-time","--only-summary","--master","--expect-workers=$num_slaves"]

The worker pods start with:

command: ["locust"]
args: ["-f","$filename","--worker","--master-host=locust-master$task_id"]

From a worker pod I can in fact run telnet locust-master1 5557 and confirm connectivity (in this case $task_id=1).
I see logs like this in the master pod:
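As an aside, a connectivity probe like the telnet check above can also be scripted with only the standard library; this is a small sketch (the host name and port are the ones from the question — Locust's master accepts worker connections on port 5557 by default):

```python
import socket

def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# From inside a worker pod, equivalent to the telnet check:
# can_connect("locust-master1", 5557)
```

This only proves the TCP handshake works, the same as telnet — it says nothing about whether the ZeroMQ traffic on top of it flows.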

[2022-04-27 22:53:16,969] locust-master1--1-z2lr8/INFO/root: Waiting for workers to be ready, 0 of 5 connected
[2022-04-27 22:53:17,109] locust-master1--1-z2lr8/INFO/locust.runners: Client 'locust-slave1-tt7n5_fec1320a406b42319f3088bd9a7c181c' reported as ready. Currently 1 clients ready to swarm.
[2022-04-27 22:53:17,147] locust-master1--1-z2lr8/INFO/locust.runners: Client 'locust-slave1-qv7kt_011dbeb9f15d452f935c5643fb463632' reported as ready. Currently 2 clients ready to swarm.
[2022-04-27 22:53:17,261] locust-master1--1-z2lr8/INFO/locust.runners: Client 'locust-slave1-ks5wb_356fcf54ac2644e4badc684e3846520c' reported as ready. Currently 3 clients ready to swarm.
[2022-04-27 22:53:17,354] locust-master1--1-z2lr8/INFO/locust.runners: Client 'locust-slave1-cbkbd_2c90cedde5224e1e9cf47bbb543b9097' reported as ready. Currently 4 clients ready to swarm.
[2022-04-27 22:53:17,364] locust-master1--1-z2lr8/INFO/locust.runners: Client 'locust-slave1-xfvsz_196bba3928c5491e896acd411798d48d' reported as ready. Currently 5 clients ready to swarm.
[2022-04-27 22:53:17,970] locust-master1--1-z2lr8/INFO/locust.main: Run time limit set to 5400 seconds
[2022-04-27 22:53:17,971] locust-master1--1-z2lr8/INFO/locust.main: Starting Locust 2.8.6
[2022-04-27 22:53:17,971] locust-master1--1-z2lr8/INFO/locust.runners: Sending spawn jobs of 50 users at 0.50 spawn rate to 5 ready clients
[2022-04-27 22:53:17,977] locust-master1--1-z2lr8/INFO/locust_submit_judgments: Locust Startup: job_id: 1434194
[2022-04-27 22:53:18,376] locust-master1--1-z2lr8/INFO/locust.runners: Worker locust-slave1-cbkbd_2c90cedde5224e1e9cf47bbb543b9097 failed to send heartbeat, setting state to missing.
[2022-04-27 22:53:20,384] locust-master1--1-z2lr8/INFO/locust.runners: Worker locust-slave1-qv7kt_011dbeb9f15d452f935c5643fb463632 failed to send heartbeat, setting state to missing.
[2022-04-27 22:53:20,385] locust-master1--1-z2lr8/INFO/locust.runners: Worker locust-slave1-ks5wb_356fcf54ac2644e4badc684e3846520c failed to send heartbeat, setting state to missing.
[2022-04-27 22:53:22,391] locust-master1--1-z2lr8/INFO/locust.runners: Worker locust-slave1-tt7n5_fec1320a406b42319f3088bd9a7c181c failed to send heartbeat, setting state to missing.
[2022-04-27 22:53:22,391] locust-master1--1-z2lr8/INFO/locust.runners: Worker locust-slave1-xfvsz_196bba3928c5491e896acd411798d48d failed to send heartbeat, setting state to missing.
[2022-04-27 22:53:22,392] locust-master1--1-z2lr8/INFO/locust.runners: The last worker went missing, stopping test.
[2022-04-27 22:53:22,392] locust-master1--1-z2lr8/INFO/locust_submit_judgments: Locust Teardown: sending query messages to Results DB

So I do see the workers register themselves, but as soon as the test starts, the master pod reports that the workers failed to send a heartbeat and sets them to missing. If I run the master pod without --headless, I can open the web UI and start the job manually; starting it by hand shows the same heartbeat messages.
On the worker pods I see the debug startup logs, but nothing that points to a problem.
I can't find a guide online for setting up distributed Locust (other than ones from back when it was called locustio, at version 0.x), and a lot has changed since then.
What needs to be configured here? I'm not sure what code to include without pasting many lines of setup. I'm trying to load-test Postgres, so I want to follow https://docs.locust.io/en/stable/testing-other-systems.html, but in all of those examples they wrap the calls, which differs from the code I inherited.
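For context on why "missing" appears so quickly, the master's bookkeeping works roughly like this (a simplified sketch, not Locust's actual code; the real constants live in locust.runners, where the heartbeat interval is 1 second and a worker is marked missing after 3 missed beats — which matches the log above, where workers go missing within a few seconds of the test starting):

```python
# Simplified sketch of the master's worker-liveness countdown.
HEARTBEAT_LIVENESS = 3  # missed beats tolerated before "missing"

class WorkerState:
    def __init__(self, client_id):
        self.client_id = client_id
        self.heartbeat = HEARTBEAT_LIVENESS
        self.state = "ready"

    def on_heartbeat(self):
        # A heartbeat message from the worker resets the countdown.
        self.heartbeat = HEARTBEAT_LIVENESS

    def on_master_tick(self):
        # Called once per heartbeat interval on the master.
        self.heartbeat -= 1
        if self.heartbeat < 0:
            self.state = "missing"

w = WorkerState("locust-worker-1")
for _ in range(4):      # worker sends nothing for 4 intervals
    w.on_master_tick()
print(w.state)          # -> missing
```

The practical upshot: a worker whose event loop is blocked for just a few seconds (busy CPU, a blocking database call) misses its beats and is declared missing almost immediately — exactly the failure mode both answers below describe.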

toiithl6 · Answer 1

Have you checked CPU utilization? We've run into a similar situation where the VM's CPU was pegged at 100% and the workers simply couldn't send their heartbeats.

tjvv9vkg · Answer 2

Depending on how your Postgres test is implemented, you may need to make sure you're using gevent correctly. See this note in the docs:
"It's important that any protocol libraries you use can be monkey-patched by gevent."
In my case I was using a custom test class for Snowflake and hit the same problem because the requests were blocking; adding the monkey patch fixed it.
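The fix this answer describes boils down to import order: gevent's monkey patching must run before the blocking protocol library is imported, otherwise that library's socket calls stall the worker's event loop and the greenlet that sends heartbeats never gets scheduled. A minimal sketch (the driver names are examples, not from the question):

```python
from gevent import monkey
monkey.patch_all()  # must run before importing any blocking protocol library

# Only now import the driver (e.g. psycopg2, the Snowflake connector, ...).
# Its socket calls will yield to gevent's event loop instead of blocking
# the worker greenlet that sends heartbeats to the master.
import socket
print(socket.socket.__module__)  # now a gevent module, not the stdlib's
```

Note that importing locust itself also applies the monkey patch, so the usual symptom is a driver imported at the top of a module before locust is — reordering the imports is often the whole fix.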
