kubernetes - Need to apply the Prometheus rules from the Alertmanager repo in k8s
Problem description
There are 2 GitLab repositories: gitlab a and gitlab b.
gitlab a - contains the StatefulSets and pods for Prometheus and the Prometheus Pushgateway.
gitlab b - contains the Alertmanager service, the Alertmanager pod, and the Prometheus rules.
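All pods and containers are up and running. I am trying to apply the Prometheus rules to the Prometheus StatefulSet.
I need to apply the kind: PrometheusRule resource to the Prometheus StatefulSet. Can anyone help?
The rules YAML being applied: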
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: prometheus-k8s-rules
  namespace: cmp-monitoring
spec:
  groups:
  - name: node-exporter.rules
    rules:
    - expr: |
        count without (cpu) (
          count without (mode) (
            node_cpu_seconds_total{job="node-exporter"}
          )
        )
      record: instance:node_num_cpu:sum
    - expr: |
        1 - avg without (cpu, mode) (
          rate(node_cpu_seconds_total{job="node-exporter", mode="idle"}[1m])
        )
      record: instance:node_cpu_utilisation:rate1m
The Prometheus StatefulSet:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
  labels:
    app: prometheus
spec:
  selector:
    matchLabels:
      app: prometheus
  serviceName: prometheus
  replicas: 1
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: prometheus
        image: prom/prometheus
        imagePullPolicy: Always
        ports:
        - name: http
          containerPort: 9090
        volumeMounts:
        - name: prometheus-config
          mountPath: "/etc/prometheus/prometheus.yml"
          subPath: prometheus.yml
        - name: prometheus-data
          mountPath: "/prometheus"
        #- name: rules-general
        #  mountPath: "/etc/prometheus/prometheus.rules.yml"
        #  subPath: prometheus.rules.yml
        livenessProbe:
          httpGet:
            path: /-/healthy
            port: 9090
          initialDelaySeconds: 120
          periodSeconds: 40
          successThreshold: 1
          timeoutSeconds: 10
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /-/healthy
            port: 9090
          initialDelaySeconds: 120
          periodSeconds: 40
          successThreshold: 1
          timeoutSeconds: 10
          failureThreshold: 3
      securityContext:
        fsGroup: 1000
      volumes:
      - name: prometheus-config
        configMap:
          name: prometheus-server-conf
      #- name: rules-general
      #  configMap:
      #    name: prometheus-server-conf
  volumeClaimTemplates:
  - metadata:
      name: prometheus-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: rbd-default
      resources:
        requests:
          storage: 10Gi
Solution
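One possible approach, offered as a sketch and not as a confirmed fix: PrometheusRule is a custom resource that the Prometheus Operator turns into rule files. A Prometheus that runs from a plain StatefulSet, like the one in the question, never reads it, so applying the object has no effect by itself. Without the operator, the same two recording rules can be carried in an ordinary ConfigMap in Prometheus's native rule-file format. The ConfigMap name `prometheus-rules` is hypothetical, and the namespace is assumed to be the one the StatefulSet runs in.

```yaml
# Hypothetical ConfigMap; the name "prometheus-rules" is an assumption.
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-rules
  # Assumes the StatefulSet runs in this namespace: a pod can only mount
  # ConfigMaps from its own namespace.
  namespace: cmp-monitoring
data:
  # The key matches the subPath of the commented-out rules-general mount.
  prometheus.rules.yml: |
    groups:
    - name: node-exporter.rules
      rules:
      - expr: |
          count without (cpu) (
            count without (mode) (
              node_cpu_seconds_total{job="node-exporter"}
            )
          )
        record: instance:node_num_cpu:sum
      - expr: |
          1 - avg without (cpu, mode) (
            rate(node_cpu_seconds_total{job="node-exporter", mode="idle"}[1m])
          )
        record: instance:node_cpu_utilisation:rate1m
```

To use it, uncomment the `rules-general` volume and volume mount in the StatefulSet, point that volume's `configMap.name` at `prometheus-rules` instead of `prometheus-server-conf`, and list `/etc/prometheus/prometheus.rules.yml` under `rule_files` in the `prometheus.yml` held by `prometheus-server-conf`. Because the file is mounted through `subPath`, later changes to the ConfigMap do not reach the running container, so the pod has to be restarted to pick them up. The alternative is to run Prometheus through the Prometheus Operator, where a `Prometheus` resource with a `ruleSelector` matching the labels `prometheus: k8s` and `role: alert-rules` would pick up the PrometheusRule as written.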