When I create new EC2 instances, I use an Ansible dynamic inventory to create new CloudWatch metric alarms. So far so good:
- name: set AWS CloudWatch alarms
  hosts: tag_env_production
  vars:
    alarm_slack: 'arn:aws:sns:123:metrics-alarms-slack'
  tasks:
    - name: "CPU > 70%"
      ec2_metric_alarm:
        state: present
        name: "{{ ec2_tag_Name }}-CPU"
        region: "{{ ec2_region }}"
        dimensions:
          InstanceId: '{{ ec2_id }}'
        namespace: "AWS/EC2"
        metric: CPUUtilization
        statistic: Average
        comparison: ">="
        threshold: 70.0
        unit: Percent
        period: 300
        evaluation_periods: 1
        description: Triggered when CPU utilization is more than 70% for 5 minutes
        alarm_actions: ['{{ alarm_slack }}']
      when: ec2_tag_group == 'lazyservers'
I run it like this:
ansible-playbook -v ec2_alarms.yml -i inventories/ec2/ec2.py
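After creating the new instances, I delete the old ones (manually). The problem is that I also need to delete the existing metric alarms attached to the old instances.
Am I missing something, or is there no way to do this through the dynamic inventory?
My current idea is to delete the metric alarms for instances that are in the "terminating" state, but the downside is that if I run the playbook after those instances have been terminated, they will no longer be visible.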
Delete the alarms before you delete the instances. Try something like the following:
- name: delete alarm
  ec2_metric_alarm:
    state: absent
    region: ap-southeast-2
    name: "cpu-low"
    metric: "CPUUtilization"
    namespace: "AWS/EC2"
    statistic: Average
    comparison: "<="
    threshold: 5.0
    period: 300
    evaluation_periods: 3
    unit: "Percent"
    description: "This will alarm when a bamboo slave's cpu usage average is lower than 5% for 15 minutes"
    dimensions: {'InstanceId': '{{ instance_id }}'}
    alarm_actions: ["action1","action2"]
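Applied to the playbook in the question, this could look roughly like the sketch below. It mirrors the creation task with state: absent and reuses the same alarm name, since CloudWatch identifies an alarm by its name. The group tag_retire_true is an assumption: it presumes you tag the old instances with retire=true before terminating them, so that ec2.py exposes them as a group in the same way it exposes tag_env_production. The tag and the play name are made up for illustration.

```yaml
# Sketch only: remove the CPU alarms of instances that are about to be terminated.
# ASSUMPTION: old instances are tagged retire=true beforehand, so the dynamic
# inventory lists them in the group tag_retire_true (hypothetical tag).
- name: remove AWS CloudWatch alarms for retiring instances
  hosts: tag_retire_true
  vars:
    alarm_slack: 'arn:aws:sns:123:metrics-alarms-slack'
  tasks:
    - name: "delete CPU > 70% alarm"
      ec2_metric_alarm:
        state: absent
        # Must match the name used when the alarm was created.
        name: "{{ ec2_tag_Name }}-CPU"
        region: "{{ ec2_region }}"
        dimensions:
          InstanceId: '{{ ec2_id }}'
        namespace: "AWS/EC2"
        metric: CPUUtilization
        statistic: Average
        comparison: ">="
        threshold: 70.0
        unit: Percent
        period: 300
        evaluation_periods: 1
        alarm_actions: ['{{ alarm_slack }}']
      when: ec2_tag_group == 'lazyservers'
```

Run it with the same inventory script while the old instances are still running, then terminate them. Once an instance is gone from the inventory, its ec2_tag_Name and ec2_id variables are no longer available, which is why the order matters.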