Prometheus: Alerting Rules

Source: 爱go旅游网
Once the Prometheus service has been installed and is running normally, configure the alerting rules.

1. Edit the alerting rule file

Edit the file /alidata/prometheus/prometheus/rules/cpu_rule.yml. This file holds the CPU usage alerts and is used here to show how alerting rules are configured. Its contents are as follows:

[root@webserver2 tmpl]# cat /alidata/prometheus/prometheus/rules/cpu_rule.yml
groups:
- name: general.rules
  rules:
  - alert: GO Warning CPU usage
    # NOTE: this expression is cut off in the source after the job matcher
    expr: (100-(avg(irate(node_cpu_seconds_total{job="go_node_exporter
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "CPU usage too high"
      description: "Host: {{ $labels.name }} CPU usage is above 80% (current value: {{ $value }})"
  - alert: PHP Warning CPU usage
    # NOTE: this expression is cut off in the source after the job matcher
    expr: (100-(avg(irate(node_cpu_seconds_total{job="php_node_exporter
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "CPU usage too high"
      description: "Host: {{ $labels.name }} CPU usage is above 80% (current value: {{ $value }})"
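
The expr lines in the listing above are cut off in the source, so the author's full expressions cannot be recovered. For orientation only, here is a minimal sketch of what a complete CPU usage rule commonly looks like. The group name, the mode="idle" matcher, the 5m range, the "> 80" threshold placement, and the use of $labels.instance in the description are all assumptions, not the author's rule.

```yaml
groups:
  - name: cpu.example.rules            # hypothetical group name
    rules:
      - alert: GO Warning CPU usage
        # Assumed expression: per-instance idle fraction from the most recent
        # samples, converted to a usage percentage and compared with 80.
        expr: 100 - (avg by (instance) (irate(node_cpu_seconds_total{job="go_node_exporter", mode="idle"}[5m])) * 100) > 80
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "CPU usage too high"
          # "avg by (instance)" keeps only the instance label, so the
          # description refers to $labels.instance here.
          description: "Host: {{ $labels.instance }} CPU usage is above 80% (current value: {{ $value }})"
```

Only labels that survive the aggregation are available to the annotation templates, which is why this sketch uses instance rather than name.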

2. Configure the main Prometheus configuration file and add the alerting rule files to it

[root@webserver2 tmpl]# cat /alidata/prometheus/prometheus/prometheus.yml
global:
  scrape_interval: 15s     # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
alerting:
  alertmanagers:
  - static_configs:
    - targets: ['127.0.0.1:9093']
rule_files:
  - "/alidata/prometheus/prometheus/rules/disk_rule.yml"
  - "/alidata/prometheus/prometheus/rules/cpu_rule.yml"
  - "/alidata/prometheus/prometheus/rules/memory_rule.yml"
  - "/alidata/prometheus/prometheus/rules/node_up.yml"
  - "/alidata/prometheus/prometheus/rules/system_load.yml"
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
    - targets: ['localhost:9090']
  - job_name: 'pushgateway'
    scrape_interval: 30s
    honor_labels: true  # With this setting, labels in the data uploaded by exporter nodes are not overwritten by identically named labels from the pushgateway node
    static_configs:
    - targets: ['34.193.83.103:9091']
      labels:
        instance: pushgateway
............... the rest of the main configuration file is omitted here; it is explained in the article on Prometheus configuration ...............
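
The rule_files entries accept file globs, so the five explicit paths above could also be written as one pattern. This is a sketch under one assumption: the rules directory contains only rule files, because every matching .yml file would be loaded.

```yaml
rule_files:
  # Assumption: everything matching this pattern is a valid rule file.
  - "/alidata/prometheus/prometheus/rules/*.yml"
```

Listing files explicitly, as the original configuration does, keeps it obvious which rules are active. A glob saves an edit to prometheus.yml each time a rule file is added.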

3. Check and reload the configuration file

[root@webserver2 prometheus]# pwd
/alidata/prometheus/prometheus
[root@webserver2 prometheus]# ./promtool check config prometheus.yml

4. Start Prometheus and view the configuration in the web console

In the web UI, click Status -> Rules to view the rules, their state, and related information.

5. Alert states

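Inactive: nothing is happening; the threshold has not been crossed. Shown in green.

Pending: the threshold has been crossed, but the condition has not yet held for the alert duration (the for field in the rule). Shown in yellow.

Firing: the threshold has been crossed and the condition has held for the alert duration. The alert is sent to Alertmanager, which processes it and delivers it to the receivers. The purpose of the duration is to send an alert only after the check has failed several times in a row, which reduces the number of emails. Shown in red.

The normal state looks like this:

The pending state looks like this:

The firing state looks like this: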
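
To connect the three states to the for field, here is a small sketch of a rule with the state transitions noted as comments. The group name, alert name, and annotation texts are hypothetical. The job name is taken from the CPU rule, and the rule assumes the targets of that job are scraped directly by Prometheus so that the built-in up series exists for them.

```yaml
groups:
  - name: example.for.rules              # hypothetical group name
    rules:
      - alert: ExampleInstanceDown       # hypothetical alert name
        # "up" is 0 when the last scrape of a target failed.
        expr: up{job="go_node_exporter"} == 0
        # State transitions, evaluated once per evaluation_interval:
        #   expr returns no result                    -> Inactive
        #   expr starts returning a result            -> Pending
        #   expr keeps returning a result for 1m      -> Firing (sent to Alertmanager)
        #   expr stops returning a result at any time -> back to Inactive
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Instance down"
          description: "Host: {{ $labels.instance }} has been unreachable for more than 1 minute"
```

A longer for value trades slower notification for fewer alerts caused by short spikes.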

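Alert emails are sent by Alertmanager. For the detailed Alertmanager configuration, see the article "Prometheus-告警集成Alertmanager (一)" (Prometheus: Integrating Alerts with Alertmanager, Part 1).

Email content: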

In an alert email, the subject and the alert details are taken from the content of the alert description.

Recovery email notification:
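
The author's Alertmanager configuration is in the referenced article and is not reproduced here. As a rough sketch of how alert and recovery emails can be enabled, the fragment below uses placeholder hosts, addresses, credentials, and receiver names throughout; none of them come from the source.

```yaml
# alertmanager.yml (sketch; every host, address and credential is a placeholder)
global:
  smtp_smarthost: 'smtp.example.com:587'
  smtp_from: 'alertmanager@example.com'
  smtp_auth_username: 'alertmanager@example.com'
  smtp_auth_password: 'changeme'
route:
  receiver: 'ops-email'                  # hypothetical receiver name
  group_by: ['alertname']
receivers:
  - name: 'ops-email'
    email_configs:
      - to: 'ops@example.com'
        send_resolved: true              # also send an email when the alert recovers
```

For email receivers, send_resolved is off by default, so recovery emails are sent only when it is set to true.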
