I configured Apache Flink to send metrics to Prometheus through the conf/flink-conf.yaml file:
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.host: 192.168.56.1
metrics.reporter.prom.port: 9250-9260
Then I configured Prometheus in the file /etc/prometheus/prometheus.yml:
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node_exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9100']
  - job_name: 'flink'
    scrape_interval: 5s
    static_configs:
      - targets: ['jobmanager:9250', 'taskmanager1:9251', 'taskmanager2:9252']
The log of the first TaskManager shows that the Prometheus reporter is configured:
2019-03-29 17:04:57,347 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: metrics.reporter.prom.class, org.apache.flink.metrics.prometheus.PrometheusReporter
2019-03-29 17:04:57,348 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: metrics.reporter.prom.host, 192.168.56.1
2019-03-29 17:04:57,349 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: metrics.reporter.prom.port, 9250-9260
...
2019-03-29 17:04:59,463 INFO org.apache.flink.runtime.metrics.MetricRegistryImpl - Configuring prom with {port=9250-9260, host=192.168.56.1, class=org.apache.flink.metrics.prometheus.PrometheusReporter}.
2019-03-29 17:04:59,479 INFO org.apache.flink.metrics.prometheus.PrometheusReporter - Started PrometheusReporter HTTP server on port 9251.
2019-03-29 17:04:59,479 INFO org.apache.flink.runtime.metrics.MetricRegistryImpl - Reporting metrics for reporter prom of type org.apache.flink.metrics.prometheus.PrometheusReporter.
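Note that the reporter in this log started on port 9251, not 9250, even though the configured range begins at 9250. A likely explanation is that each Flink process on a host takes the first free port in the range, so a JobManager and a TaskManager sharing a machine end up on different ports. Below is a minimal sketch of that selection logic in plain Java. It is not Flink's actual implementation; the class name PortRangeSketch and its methods are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.net.ServerSocket;

public class PortRangeSketch {

    /** Parses a range such as "9250-9260" (or a single port) into {first, last}. */
    static int[] parseRange(String range) {
        String[] parts = range.trim().split("-");
        int first = Integer.parseInt(parts[0].trim());
        int last = parts.length > 1 ? Integer.parseInt(parts[1].trim()) : first;
        return new int[] {first, last};
    }

    /** Tries the ports in ascending order and returns a socket bound to the first free one. */
    static ServerSocket bindFirstFree(String range) throws IOException {
        int[] bounds = parseRange(range);
        for (int port = bounds[0]; port <= bounds[1]; port++) {
            try {
                return new ServerSocket(port);
            } catch (IOException portInUse) {
                // This port is already taken (or cannot be bound); try the next one.
            }
        }
        throw new IOException("No free port in range " + range);
    }

    public static void main(String[] args) throws IOException {
        // Two "processes" on the same host asking for the same range.
        try (ServerSocket first = bindFirstFree("9250-9260");
             ServerSocket second = bindFirstFree("9250-9260")) {
            System.out.println("first reporter port:  " + first.getLocalPort());
            System.out.println("second reporter port: " + second.getLocalPort());
        }
    }
}
```

The two sockets always end up on different ports, which is why the port each process actually chose has to be read from its log rather than assumed from the configuration.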
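I copied the jar file flink-metrics-prometheus_2.11-1.7.2.jar to the lib directory on both nodes of my Flink instance. I have a RichMapper class that exposes a counter and a meter. Why can't I see the metrics on the Prometheus dashboard?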
I deploy my application with the command ./bin/flink run -c org.sense.flink.App ../../../felipe/eclipse-workspace/explore-flink/target/explore-flink.jar 14 192.168.56.20 &
and I can see its output in one of the TaskManager logs. This is the mapper class:
public static class SensorTypeMapper
        extends RichMapFunction<MqttSensor, Tuple2<CompositeKeySensorType, MqttSensor>> {
    private static final long serialVersionUID = -4080196110995184486L;
    private transient Counter counter;
    private transient Meter meter;

    @Override
    public void open(Configuration config) {
        this.counter = getRuntimeContext().getMetricGroup().counter("counterSensorTypeMapper");
        com.codahale.metrics.Meter dropwizardMeter = new com.codahale.metrics.Meter();
        this.meter = getRuntimeContext().getMetricGroup().meter("meterSensorTypeMapper",
                new DropwizardMeterWrapper(dropwizardMeter));
    }

    @Override
    public Tuple2<CompositeKeySensorType, MqttSensor> map(MqttSensor value) throws Exception {
        this.meter.markEvent();
        this.counter.inc();
        // every sensor key: sensorId, sensorType, platformId, platformType, stationId
        // Integer sensorId = value.getKey().f0;
        String sensorType = value.getKey().f1;
        Integer platformId = value.getKey().f2;
        // String platformType = value.getKey().f3;
        Integer stationId = value.getKey().f4;
        CompositeKeySensorType compositeKey = new CompositeKeySensorType(stationId, platformId, sensorType);
        return Tuple2.of(compositeKey, value);
    }
}
I solved it. I only needed to configure the correct host names in the targets property of the file /etc/prometheus/prometheus.yml:
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node_exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9100']
  - job_name: 'flink'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9250', 'localhost:9251', '192.168.56.20:9250']
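Since the fix came down to which host:port pairs Prometheus can actually reach, it can help to probe each target before editing prometheus.yml. The following is a minimal sketch, assuming the default scrape settings (scheme http, path /metrics). The class name TargetCheck and its methods are hypothetical, and the target list simply mirrors the working configuration above.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class TargetCheck {

    /** Builds the URL Prometheus would scrape for a "host:port" target with default settings. */
    static String metricsUrl(String target) {
        return "http://" + target + "/metrics";
    }

    /** Returns true only if the target answers the metrics request with HTTP 200. */
    static boolean isReachable(String target, int timeoutMillis) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(metricsUrl(target)).toURL().openConnection();
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            conn.setRequestMethod("GET");
            int status = conn.getResponseCode();
            conn.disconnect();
            return status == 200;
        } catch (IOException | IllegalArgumentException e) {
            // Unresolvable host name, refused connection, timeout, or malformed target.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] targets = {"localhost:9250", "localhost:9251", "192.168.56.20:9250"};
        for (String target : targets) {
            String state = isReachable(target, 2000) ? "reachable" : "NOT reachable";
            System.out.println(metricsUrl(target) + " -> " + state);
        }
    }
}
```

A host name that does not resolve from the Prometheus machine, such as the jobmanager and taskmanager1 names in the first configuration, would be reported as not reachable by this check.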