kubernetes - Prometheus does not pull data from the Traefik service in a Kubernetes cluster
Problem description
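I am running Prometheus (quay.azk8s.cn/prometheus/prometheus:v2.15.2) in the Kubernetes monitoring namespace to monitor Traefik 2.1.6. Traefik now exposes metrics, and I can fetch them with curl from http://traefik-ip:8080/metrics, but Prometheus does not pull the data. I have already added the annotations to the Traefik service YAML in the Kubernetes kube-system namespace. This is the Prometheus configuration (its StatefulSet):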
{
"kind": "StatefulSet",
"apiVersion": "apps/v1beta2",
"metadata": {
"name": "prometheus-k8s",
"namespace": "monitoring",
"selfLink": "/apis/apps/v1beta2/namespaces/monitoring/statefulsets/prometheus-k8s",
"uid": "4190d704-aa3b-40da-ab99-bac3cb10f186",
"resourceVersion": "18281285",
"generation": 7,
"creationTimestamp": "2020-03-04T16:31:01Z",
"labels": {
"prometheus": "k8s"
},
"annotations": {
"prometheus-operator-input-hash": "4895445337133709592"
},
"ownerReferences": [
{
"apiVersion": "monitoring.coreos.com/v1",
"kind": "Prometheus",
"name": "k8s",
"uid": "ddf7e48d-f982-4881-9312-0d50466870a9",
"controller": true,
"blockOwnerDeletion": true
}
]
},
"spec": {
"replicas": 2,
"selector": {
"matchLabels": {
"app": "prometheus",
"prometheus": "k8s"
}
},
"template": {
"metadata": {
"creationTimestamp": null,
"labels": {
"app": "prometheus",
"prometheus": "k8s"
}
},
"spec": {
"volumes": [
{
"name": "config",
"secret": {
"secretName": "prometheus-k8s",
"defaultMode": 420
}
},
{
"name": "tls-assets",
"secret": {
"secretName": "prometheus-k8s-tls-assets",
"defaultMode": 420
}
},
{
"name": "config-out",
"emptyDir": {}
},
{
"name": "prometheus-k8s-rulefiles-0",
"configMap": {
"name": "prometheus-k8s-rulefiles-0",
"defaultMode": 420
}
},
{
"name": "prometheus-k8s-db",
"emptyDir": {}
}
],
"containers": [
{
"name": "prometheus",
"image": "quay.azk8s.cn/prometheus/prometheus:v2.15.2",
"args": [
"--web.console.templates=/etc/prometheus/consoles",
"--web.console.libraries=/etc/prometheus/console_libraries",
"--config.file=/etc/prometheus/config_out/prometheus.env.yaml",
"--storage.tsdb.path=/prometheus",
"--storage.tsdb.retention.time=24h",
"--web.enable-lifecycle",
"--storage.tsdb.no-lockfile",
"--web.route-prefix=/"
],
"ports": [
{
"name": "web",
"containerPort": 9090,
"protocol": "TCP"
}
],
"resources": {
"requests": {
"memory": "400Mi"
}
},
"volumeMounts": [
{
"name": "config-out",
"readOnly": true,
"mountPath": "/etc/prometheus/config_out"
},
{
"name": "tls-assets",
"readOnly": true,
"mountPath": "/etc/prometheus/certs"
},
{
"name": "prometheus-k8s-db",
"mountPath": "/prometheus"
},
{
"name": "prometheus-k8s-rulefiles-0",
"mountPath": "/etc/prometheus/rules/prometheus-k8s-rulefiles-0"
}
],
"livenessProbe": {
"httpGet": {
"path": "/-/healthy",
"port": "web",
"scheme": "HTTP"
},
"timeoutSeconds": 3,
"periodSeconds": 5,
"successThreshold": 1,
"failureThreshold": 6
},
"readinessProbe": {
"httpGet": {
"path": "/-/ready",
"port": "web",
"scheme": "HTTP"
},
"timeoutSeconds": 3,
"periodSeconds": 5,
"successThreshold": 1,
"failureThreshold": 120
},
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "FallbackToLogsOnError",
"imagePullPolicy": "IfNotPresent"
},
{
"name": "prometheus-config-reloader",
"image": "quay.azk8s.cn/coreos/prometheus-config-reloader:v0.37.0",
"command": [
"/bin/prometheus-config-reloader"
],
"args": [
"--log-format=logfmt",
"--reload-url=http://localhost:9090/-/reload",
"--config-file=/etc/prometheus/config/prometheus.yaml.gz",
"--config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml"
],
"env": [
{
"name": "POD_NAME",
"valueFrom": {
"fieldRef": {
"apiVersion": "v1",
"fieldPath": "metadata.name"
}
}
}
],
"resources": {
"limits": {
"cpu": "100m",
"memory": "25Mi"
},
"requests": {
"cpu": "100m",
"memory": "25Mi"
}
},
"volumeMounts": [
{
"name": "config",
"mountPath": "/etc/prometheus/config"
},
{
"name": "config-out",
"mountPath": "/etc/prometheus/config_out"
}
],
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "FallbackToLogsOnError",
"imagePullPolicy": "IfNotPresent"
},
{
"name": "rules-configmap-reloader",
"image": "jimmidyson/configmap-reload:v0.3.0",
"args": [
"--webhook-url=http://localhost:9090/-/reload",
"--volume-dir=/etc/prometheus/rules/prometheus-k8s-rulefiles-0"
],
"resources": {
"limits": {
"cpu": "100m",
"memory": "25Mi"
},
"requests": {
"cpu": "100m",
"memory": "25Mi"
}
},
"volumeMounts": [
{
"name": "prometheus-k8s-rulefiles-0",
"mountPath": "/etc/prometheus/rules/prometheus-k8s-rulefiles-0"
}
],
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "FallbackToLogsOnError",
"imagePullPolicy": "IfNotPresent"
}
],
"restartPolicy": "Always",
"terminationGracePeriodSeconds": 600,
"dnsPolicy": "ClusterFirst",
"nodeSelector": {
"kubernetes.io/os": "linux"
},
"serviceAccountName": "prometheus-k8s",
"serviceAccount": "prometheus-k8s",
"securityContext": {
"runAsUser": 1000,
"runAsNonRoot": true,
"fsGroup": 2000
},
"schedulerName": "default-scheduler"
}
},
"serviceName": "prometheus-operated",
"podManagementPolicy": "Parallel",
"updateStrategy": {
"type": "RollingUpdate"
},
"revisionHistoryLimit": 10
},
"status": {
"observedGeneration": 7,
"replicas": 2,
"readyReplicas": 2,
"currentReplicas": 2,
"updatedReplicas": 2,
"currentRevision": "prometheus-k8s-6f76f69569",
"updateRevision": "prometheus-k8s-6f76f69569",
"collisionCount": 0
}
}
This is the Traefik service configuration:
{
"kind": "Service",
"apiVersion": "v1",
"metadata": {
"name": "traefik",
"namespace": "kube-system",
"selfLink": "/api/v1/namespaces/kube-system/services/traefik",
"uid": "b2695279-2467-4480-aab5-a720a43951c1",
"resourceVersion": "18280221",
"creationTimestamp": "2020-01-29T10:26:34Z",
"annotations": {
"kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"v1\",\"kind\":\"Service\",\"metadata\":{\"annotations\":{\"prometheus.io/port\":\"8080\",\"prometheus.io/scrape\":\"true\"},\"name\":\"traefik\",\"namespace\":\"kube-system\"},\"spec\":{\"ports\":[{\"name\":\"web\",\"port\":80},{\"name\":\"websecure\",\"port\":443},{\"name\":\"metrics\",\"port\":8080}],\"selector\":{\"app\":\"traefik\"}}}\n",
"prometheus.io/port": "8080",
"prometheus.io/scrape": "true"
}
},
"spec": {
"ports": [
{
"name": "web",
"protocol": "TCP",
"port": 80,
"targetPort": 80
},
{
"name": "websecure",
"protocol": "TCP",
"port": 443,
"targetPort": 443
},
{
"name": "metrics",
"protocol": "TCP",
"port": 8080,
"targetPort": 8080
}
],
"selector": {
"app": "traefik"
},
"clusterIP": "10.254.169.66",
"type": "ClusterIP",
"sessionAffinity": "None"
},
"status": {
"loadBalancer": {}
}
}
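The prometheus.io/scrape and prometheus.io/port annotations are only a convention: they take effect only when the Prometheus scrape configuration contains relabel rules that read them. Independently of that, the annotated port has to match a port the Service actually declares. A minimal sketch of that consistency check, assuming a trimmed copy of the Service above is available as JSON (the helper name `scrape_port_name` is hypothetical):

```python
import json

# Trimmed copy of the Service shown above; only the fields the check needs.
SERVICE_JSON = """
{
  "kind": "Service",
  "metadata": {
    "name": "traefik",
    "namespace": "kube-system",
    "annotations": {
      "prometheus.io/port": "8080",
      "prometheus.io/scrape": "true"
    }
  },
  "spec": {
    "ports": [
      {"name": "web", "port": 80, "targetPort": 80},
      {"name": "websecure", "port": 443, "targetPort": 443},
      {"name": "metrics", "port": 8080, "targetPort": 8080}
    ]
  }
}
"""


def scrape_port_name(service):
    """Return the name of the declared Service port that the
    prometheus.io/port annotation points at, or None if scraping is not
    requested or no declared port matches."""
    annotations = service.get("metadata", {}).get("annotations", {})
    if annotations.get("prometheus.io/scrape") != "true":
        return None
    wanted = annotations.get("prometheus.io/port")
    for port in service.get("spec", {}).get("ports", []):
        # Annotation values are always strings, port numbers are integers.
        if str(port.get("port")) == wanted:
            return port.get("name")
    return None


service = json.loads(SERVICE_JSON)
print(scrape_port_name(service))
```

For the Service above the annotation resolves to the port named metrics, so the annotations themselves are consistent with the declared ports.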
I read some docs that suggested I should configure the pull job in a Kubernetes (v1.15.2) ConfigMap, like this:
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: kube-ops
data:
  prometheus.yml: |
    global:
      scrape_interval: 30s
      scrape_timeout: 30s
    scrape_configs:
    - job_name: 'prometheus'
      static_configs:
      - targets: ['localhost:9090']
    - job_name: 'traefik'
      static_configs:
      - targets: ['traefik-ingress-service.kube-system.svc.cluster.local:8080']
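I added this configuration to my Prometheus YAML. Am I missing something? I did the following steps:
- Expose the Traefik metrics URL (success)
- Add the annotations to my Traefik service (success)
But no metrics data is collected. I have been stuck on this problem for 2 days. What should I do to make it work? This is the service discovery page of my Prometheus dashboard (screenshot not included).
But when I query the data in Prometheus, I find nothing: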
http_requests_total{job="traefik"}
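An empty result can mean either that the target is not being scraped or that no series with that metric name exists. One way to tell the two apart is to query the built-in up series for the same job through the Prometheus HTTP API (/api/v1/query). The following is a minimal sketch, assuming Prometheus is reachable at a hypothetical http://localhost:9090 (for example through a port-forward); the helper names are illustrative only:

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, promql):
    """Build the instant-query URL for the Prometheus HTTP API."""
    query_string = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query_string}"


def count_series(base_url, promql, timeout=5.0):
    """Run an instant query and return how many series it matched.
    Performs a real HTTP request, so it needs a reachable Prometheus."""
    with urllib.request.urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        payload = json.load(resp)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return len(payload["data"]["result"])


# Example (not executed here; base URL is an assumption):
#   count_series("http://localhost:9090", 'up{job="traefik"}')
#   count_series("http://localhost:9090", 'http_requests_total{job="traefik"}')
print(build_query_url("http://localhost:9090", 'up{job="traefik"}'))
```

If the up query matches at least one series while the second query matches none, the target is being scraped and the problem lies in the metric name rather than in service discovery.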
Solution
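Note that the newer Traefik version (v2.1.6) exposes its request counter under a different metric name, so the query to check the pulled data is: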
traefik_entrypoint_requests_total{job="traefik"}
With this query you can see that Prometheus pulls the data successfully.
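To find out which metric names a target really exposes, the text returned by its /metrics endpoint can be listed by name before writing a query. The following is a minimal sketch, assuming the response body has been saved into a string; the sample below is a hypothetical excerpt with made-up label sets and values, not output captured from the cluster in the question:

```python
# Hypothetical excerpt of a /metrics response in the Prometheus text format.
SAMPLE_METRICS = """\
# TYPE traefik_entrypoint_requests_total counter
traefik_entrypoint_requests_total{code="200",entrypoint="web",method="GET",protocol="http"} 12
traefik_entrypoint_requests_total{code="404",entrypoint="web",method="GET",protocol="http"} 3
# TYPE traefik_entrypoint_open_connections gauge
traefik_entrypoint_open_connections{entrypoint="web",method="GET",protocol="http"} 1
# TYPE go_goroutines gauge
go_goroutines 42
"""


def metric_names(exposition_text):
    """Return the sorted, de-duplicated metric names found in a
    Prometheus text-format payload."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The name ends at the first "{" (labels) or whitespace (value).
        names.add(line.split("{", 1)[0].split()[0])
    return sorted(names)


names = metric_names(SAMPLE_METRICS)
print(names)
print("http_requests_total" in names)
```

A name that is missing from this list, such as http_requests_total in the sample, cannot return data no matter how the scrape job is configured.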