Supervisord sometimes starts Celery, sometimes not



I am deploying my Flask API on Kubernetes. The commands executed when the container starts are the following:

supervisord -c /etc/supervisor/conf.d/celery.conf 
gunicorn wsgi:app --bind=0.0.0.0:5000 --workers 1 --threads 12 --log-level=warning --access-logfile /var/log/gunicorn-access.log --error-logfile /var/log/gunicorn-error.log

As you can see above, I first start Celery with supervisord, then run the Gunicorn server. Contents of celery.conf:

[supervisord]
logfile = /tmp/supervisord.log
logfile_maxbytes = 50MB
logfile_backups=10
loglevel = info
pidfile = /tmp/supervisord.pid
nodaemon = false
minfds = 1024
minprocs = 200
umask = 022
identifier = supervisor
directory = /tmp
nocleanup = true
[program:celery]
directory = /mydir/app
command = celery -A celery_worker.celery worker --loglevel=debug
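Worth noting: this [program:celery] section sets none of autorestart, exitcodes, or startsecs, so supervisor's defaults apply. In supervisor 3.x the default exitcodes is 0,2, which is why the log further down reports "exit status 2; expected" and the dead worker is never respawned. A sketch of a stricter section (the specific values and the log path are suggestions, not from the original config):

```ini
[program:celery]
directory = /mydir/app
command = celery -A celery_worker.celery worker --loglevel=debug
; Treat only a clean exit (0) as expected; exit status 2 then becomes
; "unexpected" and triggers a restart under autorestart=unexpected.
exitcodes = 0
autorestart = unexpected
; Require the worker to stay up longer than the 1-second default
; before supervisord declares it successfully started.
startsecs = 10
; Capture the worker's own output for post-mortem debugging.
stdout_logfile = /var/log/celery.log
redirect_stderr = true
```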

When I log into my pods, I can see that sometimes the process starting Celery works (example in pod 1):

> more /tmp/supervisord.log
2021-06-08 18:19:46,460 CRIT Supervisor running as root (no user in config file)
2021-06-08 18:19:46,462 INFO daemonizing the supervisord process
2021-06-08 18:19:46,462 INFO set current directory: '/tmp'
2021-06-08 18:19:46,463 INFO supervisord started with pid 9
2021-06-08 18:19:47,469 INFO spawned: 'celery' with pid 15
2021-06-08 18:19:48,470 INFO success: celery entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)

and sometimes it does not (in pod 2):

> more /tmp/supervisord.log
2021-06-08 18:19:42,979 CRIT Supervisor running as root (no user in config file)
2021-06-08 18:19:42,988 INFO daemonizing the supervisord process
2021-06-08 18:19:42,988 INFO set current directory: '/tmp'
2021-06-08 18:19:42,989 INFO supervisord started with pid 9
2021-06-08 18:19:43,992 INFO spawned: 'celery' with pid 11
2021-06-08 18:19:44,994 INFO success: celery entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
>>>> 2021-06-08 18:19:58,642 INFO exited: celery (exit status 2; expected) <<<<<HERE

In pod 1, the ps command shows the following:

> ps aux | grep celery
root          9  0.0  0.0  55308 16376 ?        Ss   18:45   0:00 /usr/bin/python /usr/bin/supervisord -c         /etc/supervisor/conf.d/celery.conf
root         23  2.2  0.8 2343684 352940 ?      S    18:45   0:05 /usr/bin/python3 /usr/local/bin/celery -A celery_worker.celery worker --loglevel=debug
root         37  0.0  0.5 2341860 208716 ?      S    18:46   0:00 /usr/bin/python3 /usr/local/bin/celery -A celery_worker.celery worker --loglevel=debug
root         38  0.0  0.5 2341864 208716 ?      S    18:46   0:00 /usr/bin/python3 /usr/local/bin/celery -A celery_worker.celery worker --loglevel=debug
root         39  0.0  0.5 2341868 208716 ?      S    18:46   0:00 /usr/bin/python3 /usr/local/bin/celery -A celery_worker.celery worker --loglevel=debug
root         40  0.0  0.5 2341872 208724 ?      S    18:46   0:00 /usr/bin/python3 /usr/local/bin/celery -A celery_worker.celery worker --loglevel=debug
root         41  0.0  0.5 2341876 208728 ?      S    18:46   0:00 /usr/bin/python3 /usr/local/bin/celery -A celery_worker.celery worker --loglevel=debug
root         42  0.0  0.5 2341880 208728 ?      S    18:46   0:00 /usr/bin/python3 /usr/local/bin/celery -A celery_worker.celery worker --loglevel=debug
root         43  0.0  0.5 2341884 208736 ?      S    18:46   0:00 /usr/bin/python3 /usr/local/bin/celery -A celery_worker.celery worker --loglevel=debug
root         44  0.0  0.5 2342836 211384 ?      S    18:46   0:00 /usr/bin/python3 /usr/local/bin/celery -A celery_worker.celery worker --loglevel=debug    

In pod 2, I can see that the supervisord process is still there, but I do not have all the /usr/local/bin/celery worker processes that I have in pod 1:

> ps aux | grep celery
root          9  0.0  0.0  55308 16296 ?        Ss   18:19   0:00 /usr/bin/python /usr/bin/supervisord -c /etc/supervisor/conf.d/celery.conf

This behavior is not consistent. Sometimes, when the pods restart, both launch Celery successfully; sometimes neither does. In the latter scenario, if I make a request to my API that should trigger a Celery task, I can see a task being created in my broker console (RabbitMQ), but no message becomes "active" and nothing is written to my database table (the end result of my Celery task).

If I start Celery manually inside the pod:

celery -A celery_worker.celery worker --loglevel=debug

everything works fine.

What could explain such behavior?

Following the comments above, the best solution is to have two containers: the first with Gunicorn as its entrypoint, the other with the Celery worker. Using the same image for the second container as for the first works very well, and I can scale each container independently on Kubernetes. The only drawback concerns the source code: every time I make a code change in the first one, I have to manually apply the same change to the second. There may be a better way to handle this particular source-code issue.
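As a sketch of this split, the two workloads could be defined as two Deployments built from one image, so each can be scaled independently ("myregistry/flask-api:latest", the label names, and the replica counts are placeholders, not from the original post):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flask-api
spec:
  replicas: 2
  selector:
    matchLabels: {app: flask-api}
  template:
    metadata:
      labels: {app: flask-api}
    spec:
      containers:
      - name: api
        image: myregistry/flask-api:latest   # placeholder image name
        command: ["gunicorn", "wsgi:app", "--bind=0.0.0.0:5000"]
        ports:
        - containerPort: 5000
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: celery-worker
spec:
  replicas: 3
  selector:
    matchLabels: {app: celery-worker}
  template:
    metadata:
      labels: {app: celery-worker}
    spec:
      containers:
      - name: worker
        # Same image as the API container, different command.
        image: myregistry/flask-api:latest
        command: ["celery", "-A", "celery_worker.celery", "worker",
                  "--loglevel=info"]
```

If both Deployments reference the same image tag, a single image build and push propagates a code change to both workloads, which may ease the duplication concern mentioned above.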
