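I use cAdvisor and Prometheus to monitor Docker containers. I start the application with a docker-compose.yml file.
In the cAdvisor documentation, I read that the --enable_metrics and --disable_metrics flags can be used to select only a subset of metrics to monitor. However, as soon as I provide either of these flags, cAdvisor seems to monitor only itself.
The --ignore_containers=cadvisor flag does not work either, so I must be doing something wrong?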
Here is my docker-compose file:

    version: '3.2'
    services:
      prometheus:
        image: prom/prometheus:latest
        container_name: prometheus
        ports:
          - 9090:9090
        command:
          - --config.file=/etc/prometheus/prometheus.yml
        volumes:
          - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
        depends_on:
          - cadvisor
      cadvisor:
        image: gcr.io/cadvisor/cadvisor:latest
        container_name: advisor
        # also tried:
        # command: "--enable_metrics=memory"
        # command: --enable_metrics=memory
        # and more...
        command:
          - --disable_metrics=accelerator,advtcp,app,cpu,cpuLoad,cpu_topology,cpuset,disk,diskIO,hugetlb,memory_numa,network,oom_event,percpu,perf_event,process,referenced_memory,resctrl,sched,tcp,udp
          - --enable_metrics=cpuLoad,memory,network
          - --ignore_containers=cadvisor
        ports:
          - 8080:8080
        volumes:
          - /:/rootfs:ro
          - /var/run:/var/run:rw
          - /sys:/sys:ro
          - /var/lib/docker/:/var/lib/docker:ro
      ... containers ...
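
One way to confirm whether cAdvisor really reports only on itself is to read its Prometheus endpoint directly and group the metric names by container. The following is a minimal sketch, not a definitive tool. It assumes cAdvisor is reachable at http://localhost:8080/metrics (the port published in the compose file above) and that container metrics carry a `name` label. The function names `summarize` and `fetch_metrics` are hypothetical helpers introduced for illustration.

```python
import re
import urllib.request
from collections import defaultdict

# Assumption: cAdvisor is reachable on the port published in the compose file.
CADVISOR_METRICS_URL = "http://localhost:8080/metrics"

# Matches the metric name and the optional label set of one sample line, e.g.
# container_memory_usage_bytes{id="/docker/abc",name="prometheus"} 1024
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s')
# Matches only a label that is exactly `name`, not one that merely ends in "name".
NAME_LABEL_RE = re.compile(r'[{,]name="([^"]*)"')


def summarize(exposition_text):
    """Map each container name to the set of metric names reported for it."""
    by_container = defaultdict(set)
    for line in exposition_text.splitlines():
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        sample = SAMPLE_RE.match(line)
        if not sample:
            continue
        labels = sample.group(2) or ""
        name = NAME_LABEL_RE.search(labels)
        container = name.group(1) if name else "(no name label)"
        by_container[container].add(sample.group(1))
    return dict(by_container)


def fetch_metrics(url=CADVISOR_METRICS_URL):
    """Download the raw exposition text from the (assumed) cAdvisor endpoint."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")
```

Calling `summarize(fetch_metrics())` while the stack is running gives a dictionary keyed by container name. If the only key besides the unnamed entries is the cAdvisor container itself, the flags are probably not being interpreted the way the documentation describes for the image version in use.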
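Edit:
I tried what user e0031374 suggested and pinned the cAdvisor version to v0.47.1. However, I noticed that as soon as I provide any command, the container exits with an unhealthy status. Without the command, everything runs fine.
For example, when I add

    command:
      - "--version"

docker ps -a shows:

    5c4092450466 gcr.io/cadvisor/cadvisor "/usr/bin/cadvisor -…" 24 seconds ago Exited (0) 16 seconds ago

and docker container inspect 5c4092450466 shows:
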
    ...
    "Created": "2023-04-20T15:57:15.491743018Z",
    "Path": "/usr/bin/cadvisor",
    "Args": [
        "-logtostderr",
        "--version"
    ],
    "State": {
        "Status": "exited",
        "Running": false,
        "Paused": false,
        "Restarting": false,
        "OOMKilled": false,
        "Dead": false,
        "Pid": 0,
        "ExitCode": 0,
        "Error": "",
        "StartedAt": "2023-04-20T15:57:22.532921383Z",
        "FinishedAt": "2023-04-20T15:57:22.726984392Z",
        "Health": {
            "Status": "unhealthy",
            "FailingStreak": 0,
            "Log": []
        }
    },
    ...

What am I missing here? Thanks!
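
Try running cadvisor with the runtime option command: "--version" to check the version of the cadvisor image you are actually using.
The cadvisor image may be older than v0.41.0 even though latest is specified. If so, you may need to specify a newer version manually, such as image: gcr.io/cadvisor/cadvisor:v0.47.0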
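
The version check suggested above can be scripted as a small comparison against the v0.41.0 threshold. This is only a sketch. It assumes the text printed by --version contains a version of the form vMAJOR.MINOR.PATCH, and the names `parse_version` and `is_new_enough` are hypothetical helpers, not part of cAdvisor.

```python
import re

# Threshold taken from the answer above; older images may not honor the flags.
MIN_VERSION = (0, 41, 0)

# Assumption: the version appears as "v0.47.1" (the leading "v" is optional here).
VERSION_RE = re.compile(r"v?(\d+)\.(\d+)\.(\d+)")


def parse_version(text):
    """Extract the first MAJOR.MINOR.PATCH triple from text as a tuple of ints."""
    match = VERSION_RE.search(text)
    if match is None:
        raise ValueError(f"no version number found in: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_new_enough(text, minimum=MIN_VERSION):
    """Return True if the version found in text is at least `minimum`."""
    # Tuples compare element by element, so (0, 47, 1) >= (0, 41, 0) holds.
    return parse_version(text) >= minimum
```

For example, the output of `docker run --rm gcr.io/cadvisor/cadvisor:latest --version` could be passed to `is_new_enough`. Because the image's entrypoint is /usr/bin/cadvisor (as the inspect output above shows), the flag is appended to it, and the container is expected to exit right after printing the version.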