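Our company's production Kubernetes environment runs kube-prometheus release-0.3 to monitor the state of the cluster. It ships with alerting rules preconfigured, but out of the box it does not send alert notifications anywhere. This article describes how I set up Alertmanager in our environment to send alerts through WeCom (企业微信, Enterprise WeChat). The logic of the implementation is shown in the figure below:
Implementation:
1. Check the deployed kube-prometheus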
[root@k8s-master-03 kube-prometheus-release-0.3]# kubectl get pod -n monitoring
NAME READY STATUS RESTARTS AGE
alertmanager-main-0 2/2 Running 0 6h20m
alertmanager-main-1 2/2 Running 0 6h20m
alertmanager-main-2 2/2 Running 0 6h2m
grafana-77978cbbdc-x9qpp 1/1 Running 0 5d22h
kube-state-metrics-7f6d7b46b4-nnjrs 3/3 Running 0 5d22h
node-exporter-42hpk 2/2 Running 0 5d22h
node-exporter-5d99p 2/2 Running 0 5d22h
node-exporter-5fcd8 2/2 Running 0 5d22h
node-exporter-66mxt 2/2 Running 0 5d22h
node-exporter-6tcg6 2/2 Running 0 5d22h
node-exporter-8dkc2 2/2 Running 0 5d22h
node-exporter-8wrq5 2/2 Running 0 5d22h
node-exporter-9z778 2/2 Running 0 5d22h
node-exporter-b2lpm 2/2 Running 0 5d22h
node-exporter-dvfmw 2/2 Running 0 5d22h
node-exporter-f794p 2/2 Running 0 5d22h
node-exporter-frfzm 2/2 Running 0 5d22h
node-exporter-hffpg 2/2 Running 0 5d22h
node-exporter-hkhkh 2/2 Running 0 5d22h
node-exporter-jjszd 2/2 Running 0 5d22h
node-exporter-lgslx 2/2 Running 0 5d22h
node-exporter-nxdtj 2/2 Running 0 5d22h
node-exporter-q458q 2/2 Running 0 5d22h
node-exporter-r6mff 2/2 Running 0 5d22h
node-exporter-s9jw2 2/2 Running 0 5d22h
node-exporter-vfp24 2/2 Running 0 5d22h
node-exporter-w2q6g 2/2 Running 0 5d22h
node-exporter-xgmn5 2/2 Running 0 5d22h
prometheus-adapter-68698bc948-xtnvm 1/1 Running 0 5d22h
prometheus-k8s-0 3/3 Running 1 5d1h
prometheus-k8s-1 3/3 Running 0 5d2h
prometheus-operator-6685db5c6-4zwcn 1/1 Running 0 5d22h
2. Create a robot in a WeCom group chat
3. Create a webhook service that forwards Alertmanager alert messages to the WeCom robot
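Before wiring up Alertmanager, it can help to confirm that the robot itself accepts messages. The following is a minimal sketch, assuming the robot webhook accepts the plain text message format (`msgtype` set to `text`). The key in the URL is a placeholder, and the function names `wecom_text_payload` and `wecom_send_text` are made up for this example.

```shell
# Placeholder: replace with the webhook URL shown when the robot was created.
WECOM_WEBHOOK_URL="${WECOM_WEBHOOK_URL:-https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=YOUR-ROBOT-KEY}"

# Build the JSON body for a plain text message.
# The content is inserted as-is, so avoid double quotes and backslashes in it.
wecom_text_payload() {
  printf '{"msgtype":"text","text":{"content":"%s"}}' "$1"
}

# POST the message to the robot and print the JSON response of the API.
wecom_send_text() {
  curl -sS -X POST -H 'Content-Type: application/json' \
    -d "$(wecom_text_payload "$1")" "$WECOM_WEBHOOK_URL"
}

# Usage (calls the real API, so it is left commented out):
# wecom_send_text "test message from the k8s cluster"
```

If the response contains `"errcode":0`, the robot accepted the message and it should show up in the group chat.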
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: prometheus-webhook-qywx
  name: prometheus-webhook-qywx
  namespace: monitoring
spec:
  selector:
    matchLabels:
      run: prometheus-webhook-qywx
  template:
    metadata:
      labels:
        run: prometheus-webhook-qywx
    spec:
      containers:
      - args:
        - --adapter=/app/prometheusalert/wx.js=/adapter/wx=https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=c3578c16-1a8e-ssssdddd8888888 # Change this address to the webhook URL of your own WeCom robot
        image: registry.cn-hangzhou.aliyuncs.com/guyongquan/webhook-adapter
        name: prometheus-webhook-dingtalk
        ports:
        - containerPort: 80
          protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  labels:
    run: prometheus-webhook-qywx
  name: prometheus-webhook-qywx
  namespace: monitoring
spec:
  ports:
  - port: 8060
    protocol: TCP
    targetPort: 80
  selector:
    run: prometheus-webhook-qywx
  type: ClusterIP
Note:
--adapter=/app/prometheusalert/wx.js=/adapter/wx=https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=c3578c16-1a8e-ssssdddd8888888 # Change this address to the webhook URL of your own WeCom robot
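Applying the manifest and checking the result could look like the sketch below. The file name `webhook-qywx.yaml` is an assumption (save the Deployment and Service above under any name you like), and the function names are only for illustration.

```shell
# Apply the manifest and wait until the Deployment has rolled out.
# $1: path to the manifest file containing the Deployment and the Service.
deploy_webhook() {
  kubectl apply -f "$1" &&
  kubectl -n monitoring rollout status deployment/prometheus-webhook-qywx
}

# List the Deployment, Service and Pods that carry the label used in the manifest.
show_webhook() {
  kubectl -n monitoring get deployment,service,pod -l run=prometheus-webhook-qywx
}

# Usage:
# deploy_webhook webhook-qywx.yaml
# show_webhook
```

The Service should be listed with port 8060, which is the port Alertmanager will call in the next step.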
4. Alerting method: change the Alertmanager configuration file
Modify the secret alertmanager-main in the monitoring namespace; this secret holds the Alertmanager configuration file.
After a default installation, the configuration is:
"global":
  "resolve_timeout": "5m"
"receivers":
- "name": "null"
"route":
  "group_by":
  - "job"
  "group_interval": "5m"
  "group_wait": "30s"
  "receiver": "null"
  "repeat_interval": "12h"
  "routes":
  - "match":
      "alertname": "Watchdog"
    "receiver": "null"
Change it according to your needs to:
"global":
  "resolve_timeout": "5m"
"receivers":
- name: 'web.hook'
  webhook_configs:
  - url: 'http://prometheus-webhook-qywx.monitoring.svc.cluster.local:8060/adapter/wx' # Address of the webhook service created above
    send_resolved: false
"route":
  "group_by":
  - "job"
  - "namespace"
  - "alertname"
  "group_interval": "5m"
  "group_wait": "30s"
  "receiver": "web.hook"
  "repeat_interval": "10m"
  "routes":
  - "match":
      "alertname": "Watchdog"
    "receiver": "web.hook"
http://prometheus-webhook-qywx.monitoring.svc.cluster.local:8060 is the Service of the webhook forwarding service created above.
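One possible way to inspect and replace the secret from the command line is sketched below. It assumes the configuration is stored under the key `alertmanager.yaml`, as in the kube-prometheus manifests, and that the edited configuration is saved in a local file. The function names are hypothetical.

```shell
# Print the Alertmanager configuration currently stored in the secret.
show_alertmanager_config() {
  kubectl -n monitoring get secret alertmanager-main \
    -o 'jsonpath={.data.alertmanager\.yaml}' | base64 --decode
}

# Regenerate the secret from a local file and apply it over the existing one.
# $1: path to the edited configuration; it is stored under the key alertmanager.yaml.
# On kubectl older than 1.18, use plain --dry-run instead of --dry-run=client.
update_alertmanager_config() {
  kubectl -n monitoring create secret generic alertmanager-main \
    --from-file=alertmanager.yaml="$1" --dry-run=client -o yaml |
  kubectl -n monitoring apply -f -
}

# Usage:
# show_alertmanager_config > alertmanager.yaml.bak   # keep a backup first
# update_alertmanager_config ./alertmanager.yaml
```

Generating the secret with a client-side dry run and piping it into `kubectl apply` avoids base64-encoding the file by hand.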
The webhook service must be installed before configuring WeCom robot alerts. Finally, restart alertmanager-main-0, alertmanager-main-1 and alertmanager-main-2.
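A simple way to restart the pods is to delete them and let the StatefulSet recreate them. This is only a sketch: the pod names are taken from the `kubectl get pod` output in step 1, and the function names are made up.

```shell
# Delete the three Alertmanager pods; the StatefulSet controller recreates them.
restart_alertmanager() {
  kubectl -n monitoring delete pod \
    alertmanager-main-0 alertmanager-main-1 alertmanager-main-2
}

# Show the state of the recreated pods.
show_alertmanager_pods() {
  kubectl -n monitoring get pod \
    alertmanager-main-0 alertmanager-main-1 alertmanager-main-2
}

# Usage:
# restart_alertmanager
# show_alertmanager_pods
```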
5. View the alert messages sent by the WeCom robot
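If no real alert is firing, a hand-made test alert can be pushed into Alertmanager to check the whole chain. The sketch below assumes your Alertmanager version exposes the v2 API at `/api/v2/alerts` and that the `alertmanager-main` Service is reachable through a port-forward on port 9093. The label values and the function names are invented for the test, and how the adapter renders such a minimal alert is not something I have verified.

```shell
# Build a payload with a single alert for the Alertmanager v2 API.
# $1: alertname, $2: summary text (plain text without double quotes).
test_alert_payload() {
  printf '[{"labels":{"alertname":"%s","job":"manual-test","severity":"warning"},"annotations":{"summary":"%s"}}]' "$1" "$2"
}

# POST the alert to an Alertmanager reachable at base URL $1.
send_test_alert() {
  curl -sS -X POST -H 'Content-Type: application/json' \
    -d "$(test_alert_payload "$2" "$3")" "$1/api/v2/alerts"
}

# Usage, in two terminals:
#   kubectl -n monitoring port-forward svc/alertmanager-main 9093:9093
#   send_test_alert http://localhost:9093 ManualTest "test alert, please ignore"
```

With the `group_wait` of 30s configured above, the message should reach the group chat roughly half a minute after the alert is posted.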
Article source: https://www.toymoban.com/news/detail-400379.html