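prometheus-alertmanager - Alertmanager intermittently fails with "unexpected status code 422"
Problem description
I deployed Prometheus from the community Helm chart (14.6.0). It runs Alertmanager, which intermittently logs the errors below. It looks like a templating problem, but the error message gives no further useful detail. I have rechecked the configuration with amtool and it reports no errors.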
level=error ts=2021-08-17T14:43:08.787Z caller=dispatch.go:309 component=dispatcher msg="Notify for alerts failed" num_alerts=2 err="opsgenie/opsgenie[0]: notify retry canceled due to unrecoverable error after 1 attempts: unexpected status code 422: {\"message\":\"Request body is not processable. Please check the errors.\",\"errors\":{\"message\":\"Message can not be empty.\"},\"took\":0.0,\"requestId\":\"38c37c18-5635-48bc-bb69-bda03e232cce\"}"
level=debug ts=2021-08-17T14:43:08.798Z caller=notify.go:685 component=dispatcher receiver=opsgenie integration=opsgenie[0] msg="Notify success" attempts=1
level=error ts=2021-08-17T14:43:08.804Z caller=dispatch.go:309 component=dispatcher msg="Notify for alerts failed" num_alerts=2 err="opsgenie/opsgenie[0]: notify retry canceled due to unrecoverable error after 1 attempts: unexpected status code 422: {\"message\":\"Request body is not processable. Please check the errors.\",\"errors\":{\"message\":\"Message can not be empty.\"},\"took\":0.001,\"requestId\":\"70d2ac84-3422-4fe6-9d8b-e601fdc37b25\"}"
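Monitoring works and alerts are delivered; I just want to understand how to interpret this error. Enabling debug logging gave no additional information. What could be wrong?
Alertmanager configuration: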
global: {}
receivers:
- name: opsgenie
  opsgenie_configs:
  - api_key: XXX
    api_url: https://api.eu.opsgenie.com/
    details:
      Prometheus alert: ' {{ .CommonLabels.alertname }}, {{ .CommonLabels.namespace }}, {{ .CommonLabels.pod }}, {{ .CommonLabels.dimension_CacheClusterId }}, {{ .CommonLabels.dimension_DBInstanceIdentifier }}, {{ .CommonLabels.dimension_DBClusterIdentifier }}'
    http_config: {}
    message: '{{ .CommonAnnotations.message }}'
    priority: '{{ if eq .CommonLabels.severity "critical" }}P2{{ else if eq .CommonLabels.severity "high" }}P3{{ else if eq .CommonLabels.severity "warning" }}P4{{ else }}P5{{ end }}'
    send_resolved: true
    tags: ' Prometheus, {{ .CommonLabels.namespace }}, {{ .CommonLabels.severity }}, {{ .CommonLabels.alertname }}, {{ .CommonLabels.pod }}, {{ .CommonLabels.kubernetes_node }}, {{ .CommonLabels.dimension_CacheClusterId }}, {{ .CommonLabels.dimension_DBInstanceIdentifier }}, {{ .CommonLabels.dimension_Cluster_Name }}, {{ .CommonLabels.dimension_DBClusterIdentifier }} '
- name: deadmansswitch
  webhook_configs:
  - http_config:
      basic_auth:
        password: XXX
    send_resolved: true
    url: https://api.eu.opsgenie.com/v2/heartbeats/prometheus-nonprod/ping
- name: blackhole
route:
  group_by:
  - alertname
  - namespace
  - kubernetes_node
  - dimension_CacheClusterId
  - dimension_DBInstanceIdentifier
  - dimension_Cluster_Name
  - dimension_DBClusterIdentifier
  - server_name
  group_interval: 5m
  group_wait: 10s
  receiver: opsgenie
  repeat_interval: 5m
  routes:
  - group_interval: 1m
    match:
      alertname: DeadMansSwitch
    receiver: deadmansswitch
    repeat_interval: 1m
  - match_re:
      namespace: XXX
  - match_re:
      alertname: HighMemoryUsage|HighCPULoad|CPUThrottlingHigh
  - match_re:
      namespace: .+
    receiver: blackhole
  - group_by:
    - instance
    match:
      alertname: PrometheusBlackboxEndpoints
  - match_re:
      alertname: .*
  - match_re:
      kubernetes_node: .*
  - match_re:
      dimension_CacheClusterId: .*
  - match_re:
      dimension_DBInstanceIdentifier: .*
  - match_re:
      dimension_Cluster_Name: .*
  - match_re:
Solution
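The 422 response is OpsGenie rejecting a request whose `message` field rendered to an empty string ("Message can not be empty"). In the configuration above, `message` is `{{ .CommonAnnotations.message }}`. `CommonAnnotations` holds only the annotations that every alert in the group shares with an identical value, so the field is most likely empty whenever a rule has no `message` annotation, or whenever the grouped alerts (`num_alerts=2` in the log) carry different `message` texts. `amtool check-config` validates the configuration file and does not render the templates against live alerts, which would explain why it reports no error. A minimal sketch of a fallback follows; the `summary` annotation and the wording of the last branch are assumptions, not part of the original configuration:

```yaml
receivers:
- name: opsgenie
  opsgenie_configs:
  - api_key: XXX
    api_url: https://api.eu.opsgenie.com/
    # Fall back step by step so the rendered message is never empty.
    # `summary` is an assumed annotation name; use whatever your rules define.
    # The final branch always contains literal text, even if alertname is not
    # a common label for the group.
    message: '{{ if .CommonAnnotations.message }}{{ .CommonAnnotations.message }}{{ else if .CommonAnnotations.summary }}{{ .CommonAnnotations.summary }}{{ else }}{{ .CommonLabels.alertname }}: {{ len .Alerts }} alert(s), no shared message annotation{{ end }}'
```

Alternatively, give every alerting rule a `message` annotation whose value does not vary between alerts that end up in the same group, or adjust `group_by` so that alerts with different messages are not grouped together.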