Locust workers go "missing" as soon as the test starts

I am running locust==2.8.6 on Python 3.10, on Kubernetes via AWS EKS. I run it in distributed mode and am trying to set up 1 master and 5 workers.

Master pod start command:

command: ["locust"]
args: ["-f","$filename","--headless","--users=$clients","--spawn-rate=$hatch-rate","--run-time=$run-time","--only-summary","--master","--expect-workers=$num_slaves"]

And the worker start command:

command: ["locust"]
args: ["-f","$filename","--worker","--master-host=locust-master$task_id"]

From a worker pod, I can run telnet locust-master1 5557 and confirm that the connection works. (In this case, $task_id=1.)

I see the following logs in the master pod:

[2022-04-27 22:53:16,969] locust-master1--1-z2lr8/INFO/root: Waiting for workers to be ready, 0 of 5 connected
[2022-04-27 22:53:17,109] locust-master1--1-z2lr8/INFO/locust.runners: Client 'locust-slave1-tt7n5_fec1320a406b42319f3088bd9a7c181c' reported as ready. Currently 1 clients ready to swarm.
[2022-04-27 22:53:17,147] locust-master1--1-z2lr8/INFO/locust.runners: Client 'locust-slave1-qv7kt_011dbeb9f15d452f935c5643fb463632' reported as ready. Currently 2 clients ready to swarm.
[2022-04-27 22:53:17,261] locust-master1--1-z2lr8/INFO/locust.runners: Client 'locust-slave1-ks5wb_356fcf54ac2644e4badc684e3846520c' reported as ready. Currently 3 clients ready to swarm.
[2022-04-27 22:53:17,354] locust-master1--1-z2lr8/INFO/locust.runners: Client 'locust-slave1-cbkbd_2c90cedde5224e1e9cf47bbb543b9097' reported as ready. Currently 4 clients ready to swarm.
[2022-04-27 22:53:17,364] locust-master1--1-z2lr8/INFO/locust.runners: Client 'locust-slave1-xfvsz_196bba3928c5491e896acd411798d48d' reported as ready. Currently 5 clients ready to swarm.
[2022-04-27 22:53:17,970] locust-master1--1-z2lr8/INFO/locust.main: Run time limit set to 5400 seconds
[2022-04-27 22:53:17,971] locust-master1--1-z2lr8/INFO/locust.main: Starting Locust 2.8.6
[2022-04-27 22:53:17,971] locust-master1--1-z2lr8/INFO/locust.runners: Sending spawn jobs of 50 users at 0.50 spawn rate to 5 ready clients
[2022-04-27 22:53:17,977] locust-master1--1-z2lr8/INFO/locust_submit_judgments: Locust Startup: job_id: 1434194
[2022-04-27 22:53:18,376] locust-master1--1-z2lr8/INFO/locust.runners: Worker locust-slave1-cbkbd_2c90cedde5224e1e9cf47bbb543b9097 failed to send heartbeat, setting state to missing.
[2022-04-27 22:53:20,384] locust-master1--1-z2lr8/INFO/locust.runners: Worker locust-slave1-qv7kt_011dbeb9f15d452f935c5643fb463632 failed to send heartbeat, setting state to missing.
[2022-04-27 22:53:20,385] locust-master1--1-z2lr8/INFO/locust.runners: Worker locust-slave1-ks5wb_356fcf54ac2644e4badc684e3846520c failed to send heartbeat, setting state to missing.
[2022-04-27 22:53:22,391] locust-master1--1-z2lr8/INFO/locust.runners: Worker locust-slave1-tt7n5_fec1320a406b42319f3088bd9a7c181c failed to send heartbeat, setting state to missing.
[2022-04-27 22:53:22,391] locust-master1--1-z2lr8/INFO/locust.runners: Worker locust-slave1-xfvsz_196bba3928c5491e896acd411798d48d failed to send heartbeat, setting state to missing.
[2022-04-27 22:53:22,392] locust-master1--1-z2lr8/INFO/locust.runners: The last worker went missing, stopping test.
[2022-04-27 22:53:22,392] locust-master1--1-z2lr8/INFO/locust_submit_judgments: Locust Teardown: sending query messages to Results DB

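So I do see the workers register themselves, but as soon as the test starts, the master pod reports that the workers failed to send a heartbeat and sets them to missing. If I run the master pod without --headless, I can open the web UI and start the job manually. The same problem occurs: when I start the job manually, the same heartbeat messages appear.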

On the worker pods, I see my debug startup logs, with nothing that hints at a problem.

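I could not find a guide online for setting up distributed Locust (other than ones from when it was called locustio and was at version 0.x), and things have changed a lot since then.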

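What needs to be set up here? I am not sure which code to include without pasting many lines of setup code. I am trying to test against Postgres, so I am considering following https://docs.locust.io/en/stable/testing-other-systems.html, but in all the examples there they wrap attributes, which is a departure from the code I inherited.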

Have you checked CPU utilization? We had a similar situation: when the VM's CPU usage was at 100%, the workers had no chance to send a heartbeat at all.

Depending on how your Postgres test is implemented, you may need to make sure you are using gevent correctly. See the note in the documentation:

It is important that the protocol libraries you use can be monkey-patched by gevent.

In my case, I was using a custom Snowflake test class and ran into the same problem because the requests were blocking. Adding monkey patching fixed it.
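To illustrate why a blocking client call can make heartbeats stop, here is a minimal sketch. It is not Locust or gevent code: it uses asyncio from the standard library as a stand-in for gevent's cooperative scheduling, and all of the function names (heartbeat, blocking_task, cooperative_task, count_beats, run_demo) are made up for this example. The point is only that a call which never yields to the scheduler prevents every other task in the same process, including a periodic heartbeat, from running.

```python
import asyncio
import time


async def heartbeat(beats, interval=0.05):
    """Record a timestamp every `interval` seconds, like a periodic heartbeat."""
    while True:
        beats.append(time.monotonic())
        await asyncio.sleep(interval)


async def blocking_task(duration):
    # time.sleep never yields to the event loop, so nothing else can run.
    # This plays the role of a client library that is not cooperative.
    time.sleep(duration)


async def cooperative_task(duration):
    # asyncio.sleep yields to the event loop, so the heartbeat keeps running.
    # This plays the role of a client library that cooperates with the scheduler.
    await asyncio.sleep(duration)


async def count_beats(task, duration=0.5):
    """Run `task` alongside a heartbeat and return the number of beats recorded."""
    beats = []
    hb = asyncio.create_task(heartbeat(beats))
    await asyncio.sleep(0)  # let the heartbeat record its first beat
    await task(duration)
    hb.cancel()
    try:
        await hb
    except asyncio.CancelledError:
        pass  # expected: we cancelled the heartbeat ourselves
    return len(beats)


def run_demo():
    """Return (beats during blocking work, beats during cooperative work)."""
    blocked = asyncio.run(count_beats(blocking_task))
    cooperative = asyncio.run(count_beats(cooperative_task))
    return blocked, cooperative


if __name__ == "__main__":
    blocked, cooperative = run_demo()
    print(f"beats while blocked: {blocked}, beats while cooperative: {cooperative}")
```

In the blocking case, only the beat sent before the work started gets through, while the cooperative case keeps beating throughout. In a gevent-based worker, the analogous fix is to make the client library cooperative, for example via gevent's monkey patching as the quoted documentation note suggests. Whether that works depends on the library in question.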
