Supervisord sometimes starts Celery, sometimes not

I am deploying my Flask API on Kubernetes. The command executed when the container starts is the following:

supervisord -c /etc/supervisor/conf.d/celery.conf 
gunicorn wsgi:app --bind=0.0.0.0:5000 --workers 1 --threads 12 --log-level=warning --access-logfile /var/log/gunicorn-access.log --error-logfile /var/log/gunicorn-error.log

As you can see above, I first start Celery with supervisord and then run the gunicorn server. Contents of celery.conf:

[supervisord]
logfile = /tmp/supervisord.log
logfile_maxbytes = 50MB
logfile_backups=10
loglevel = info
pidfile = /tmp/supervisord.pid
nodaemon = false
minfds = 1024
minprocs = 200
umask = 022
identifier = supervisor
directory = /tmp
nocleanup = true
[program:celery]
directory = /mydir/app
command = celery -A celery_worker.celery worker --loglevel=debug

When I log into my pods, I can see that starting Celery sometimes works (example from pod 1):

> more /tmp/supervisord.log
2021-06-08 18:19:46,460 CRIT Supervisor running as root (no user in config file)
2021-06-08 18:19:46,462 INFO daemonizing the supervisord process
2021-06-08 18:19:46,462 INFO set current directory: '/tmp'
2021-06-08 18:19:46,463 INFO supervisord started with pid 9
2021-06-08 18:19:47,469 INFO spawned: 'celery' with pid 15
2021-06-08 18:19:48,470 INFO success: celery entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)

And sometimes it does not (in pod 2):

> more /tmp/supervisord.log
2021-06-08 18:19:42,979 CRIT Supervisor running as root (no user in config file)
2021-06-08 18:19:42,988 INFO daemonizing the supervisord process
2021-06-08 18:19:42,988 INFO set current directory: '/tmp'
2021-06-08 18:19:42,989 INFO supervisord started with pid 9
2021-06-08 18:19:43,992 INFO spawned: 'celery' with pid 11
2021-06-08 18:19:44,994 INFO success: celery entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
>>>> 2021-06-08 18:19:58,642 INFO exited: celery (exit status 2; expected) <<<<<HERE

In pod 1, the ps command shows the following:

> ps aux | grep celery
root          9  0.0  0.0  55308 16376 ?        Ss   18:45   0:00 /usr/bin/python /usr/bin/supervisord -c         /etc/supervisor/conf.d/celery.conf
root         23  2.2  0.8 2343684 352940 ?      S    18:45   0:05 /usr/bin/python3 /usr/local/bin/celery -A celery_worker.celery worker --loglevel=debug
root         37  0.0  0.5 2341860 208716 ?      S    18:46   0:00 /usr/bin/python3 /usr/local/bin/celery -A celery_worker.celery worker --loglevel=debug
root         38  0.0  0.5 2341864 208716 ?      S    18:46   0:00 /usr/bin/python3 /usr/local/bin/celery -A celery_worker.celery worker --loglevel=debug
root         39  0.0  0.5 2341868 208716 ?      S    18:46   0:00 /usr/bin/python3 /usr/local/bin/celery -A celery_worker.celery worker --loglevel=debug
root         40  0.0  0.5 2341872 208724 ?      S    18:46   0:00 /usr/bin/python3 /usr/local/bin/celery -A celery_worker.celery worker --loglevel=debug
root         41  0.0  0.5 2341876 208728 ?      S    18:46   0:00 /usr/bin/python3 /usr/local/bin/celery -A celery_worker.celery worker --loglevel=debug
root         42  0.0  0.5 2341880 208728 ?      S    18:46   0:00 /usr/bin/python3 /usr/local/bin/celery -A celery_worker.celery worker --loglevel=debug
root         43  0.0  0.5 2341884 208736 ?      S    18:46   0:00 /usr/bin/python3 /usr/local/bin/celery -A celery_worker.celery worker --loglevel=debug
root         44  0.0  0.5 2342836 211384 ?      S    18:46   0:00 /usr/bin/python3 /usr/local/bin/celery -A celery_worker.celery worker --loglevel=debug

In pod 2, I can see that the supervisord/celery process is still there, but I don't have all the /usr/local/bin/celery processes that I have in pod 1:

> ps aux | grep celery
root          9  0.0  0.0  55308 16296 ?        Ss   18:19   0:00 /usr/bin/python /usr/bin/supervisord -c /etc/supervisor/conf.d/celery.conf

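This behaviour is not always the same. Sometimes, when the pods restart, both succeed in launching Celery, and sometimes neither does. In that last scenario, if I make a request to my API that is supposed to launch a Celery task, I can see on my broker console (RabbitMQ) that a task is created, but there is no "activity" message, and nothing is written to my database table (the end result of my Celery task).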

If I start Celery manually in my pods:

celery -A celery_worker.celery worker --loglevel=debug

Everything works.

What could explain such behaviour?
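
One detail in the pod 2 log may be relevant: the line `exited: celery (exit status 2; expected)`. Under the default `autorestart = unexpected`, Supervisor restarts a program only when its exit status is not listed in `exitcodes`, and older Supervisor 3.x releases default to `exitcodes = 0,2`. The word "expected" in the log is consistent with that, which would explain why supervisord stays alive without respawning the worker. The sketch below models that rule in Python; `should_restart` is a hypothetical helper written for illustration, not part of Supervisor, and it says nothing about why Celery exited with status 2 in the first place.

```python
def should_restart(exit_status, autorestart="unexpected", exitcodes=(0, 2)):
    """Model of Supervisor's restart rule for a program that exits after reaching RUNNING.

    Assumption: exitcodes defaults to (0, 2), as in older Supervisor 3.x releases.
    autorestart takes the config values "true", "false" or "unexpected".
    """
    if autorestart == "true":
        return True           # always respawn
    if autorestart == "false":
        return False          # never respawn
    # "unexpected": respawn only if the exit status is not an expected one
    return exit_status not in exitcodes


if __name__ == "__main__":
    # Pod 2: Celery exited with status 2, which counts as "expected" here
    print(should_restart(2))
    # Same exit with autorestart = true
    print(should_restart(2, autorestart="true"))
```

If this is what happens here, setting `autorestart = true` under `[program:celery]` would make supervisord respawn the worker regardless of its exit status.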

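Following the comments above, the best solution is to have two containers: the first with gunicorn as its entrypoint, the other running the Celery worker. If the second uses the same image as the first, it works very well, and I can scale each container independently on Kubernetes. The only issue is that managing the source code is harder: every time I make a code change in the first, I have to apply the same change manually to the second. Perhaps there is a better way to solve this particular source-code problem.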
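
A minimal sketch of that layout, assuming "two containers" means two separate Deployments that reference the same image tag, so a code change is built once and picked up by both. The image name, the Deployment names, the replica counts and the `deployment` helper are all hypothetical; the commands and the `/mydir/app` working directory are taken from the question. The script only builds the manifests as JSON, which `kubectl apply -f -` accepts.

```python
import json

# Hypothetical image name and tag; both Deployments share it,
# so there is a single copy of the source code to maintain.
IMAGE = "registry.example.com/flask-api:1.0.0"


def deployment(name, command, replicas, working_dir=None):
    """Build a Deployment manifest with one container running `command`."""
    container = {"name": name, "image": IMAGE, "command": command}
    if working_dir is not None:
        container["workingDir"] = working_dir
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {"containers": [container]},
            },
        },
    }


# Web container: gunicorn is the only process, no supervisord involved.
api = deployment(
    "api",
    ["gunicorn", "wsgi:app", "--bind=0.0.0.0:5000", "--workers", "1", "--threads", "12"],
    replicas=2,
)

# Worker container: same image, different command, scaled on its own.
worker = deployment(
    "celery-worker",
    ["celery", "-A", "celery_worker.celery", "worker", "--loglevel=debug"],
    replicas=2,
    working_dir="/mydir/app",
)

if __name__ == "__main__":
    manifest = {"apiVersion": "v1", "kind": "List", "items": [api, worker]}
    print(json.dumps(manifest, indent=2))
```

With Celery as the main process of its own container, a worker that exits makes the container exit, and Kubernetes restarts it according to the pod's restart policy instead of relying on supervisord.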
