I have a default alert rule in the Prometheus Operator, shown below:
- alert: KubePodNotReady
  annotations:
    message: Pod {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.pod {{`}}`}} has been in a non-ready state for longer than 15 minutes.
    runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodnotready
  expr: |-
    sum by (namespace, pod) (
      max by(namespace, pod) (
        kube_pod_status_phase{job="kube-state-metrics", namespace=~".*", phase=~"Pending|Unknown"}
      ) * on(namespace, pod) group_left(owner_kind) topk by(namespace, pod) (
        1, max by(namespace, pod, owner_kind) (kube_pod_owner{owner_kind!="Job"})
      )
    ) > 0
  for: 15m
  labels:
    severity: warning
I want the alert to show the pod's "teamname" label.
I can get the pod labels with either of the following expressions:
kube_pod_info * on(namespace, pod) group_left kube_pod_labels{label_teamname="example"}
kube_pod_info * on(namespace, pod) group_left(label_teamname) kube_pod_labels
But I don't know how to update the alert rule so that it shows the label. I tried simply adding the label without editing the expression:
labels:
  severity: warning
  teamname: '{{ $labels.label_teamname }}'
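But that did not work.
Does the expression need to change in order to include teamname in the alert? If so, please suggest how to change the following expression.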
expr: |-
  sum by (namespace, pod) (
    max by(namespace, pod) (
      kube_pod_status_phase{job="kube-state-metrics", namespace=~".*", phase=~"Pending|Unknown"}
    ) * on(namespace, pod) group_left(owner_kind) topk by(namespace, pod) (
      1, max by(namespace, pod, owner_kind) (kube_pod_owner{owner_kind!="Job"})
    )
  ) > 0
This expression works for me:
(sum by (namespace, pod) (
  max by(namespace, pod) (
    kube_pod_status_phase{job="kube-state-metrics", namespace=~".*", phase=~"Pending|Unknown"}
  ) * on(namespace, pod) group_left(owner_kind) topk by(namespace, pod) (
    1, max by(namespace, pod, owner_kind) (kube_pod_owner{owner_kind!="Job"})
  )
) > 0) * on(namespace, pod) group_left(label_teamname) kube_pod_labels
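
The likely reason the label-only attempt produced nothing is that `sum by (namespace, pod)` keeps only `namespace` and `pod` on the result, so `$labels.label_teamname` has no value to expand. Joining against `kube_pod_labels` with `group_left(label_teamname)` copies that label onto each result series, which makes it available to the rule's templates. Below is a sketch of how the complete rule might look with the expression above. It assumes that `kube_pod_labels` actually exposes `label_teamname` in your setup (newer kube-state-metrics versions may need the pod label allow-listed), and it uses plain Prometheus templating; inside the Helm chart template the braces would need the same escaping as in the original `message` annotation.

```yaml
# Sketch only: the original rule plus the kube_pod_labels join and a teamname label.
# Assumes kube_pod_labels carries label_teamname for the pods in question.
- alert: KubePodNotReady
  annotations:
    message: Pod {{ $labels.namespace }}/{{ $labels.pod }} has been in a non-ready state for longer than 15 minutes.
    runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodnotready
  expr: |-
    (sum by (namespace, pod) (
      max by(namespace, pod) (
        kube_pod_status_phase{job="kube-state-metrics", namespace=~".*", phase=~"Pending|Unknown"}
      ) * on(namespace, pod) group_left(owner_kind) topk by(namespace, pod) (
        1, max by(namespace, pod, owner_kind) (kube_pod_owner{owner_kind!="Job"})
      )
    ) > 0) * on(namespace, pod) group_left(label_teamname) kube_pod_labels
  for: 15m
  labels:
    severity: warning
    # label_teamname is now present on the result series, so this expands.
    teamname: '{{ $labels.label_teamname }}'
```

Because `kube_pod_labels` has the value 1, the multiplication leaves the alert value unchanged. Since `label_teamname` is already attached to the alert by the join, the extra `teamname` label is only needed if you want it under that shorter name.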