kubernetes-helm - Problem installing Prometheus Operator on Kubernetes Minikube with Helm 3
Problem Description
I have been trying to use Prometheus to monitor pod statistics such as http_request_rate and/or packet_per_second. For this I plan to use the Prometheus Adapter, which, from what I have read, requires the Prometheus Operator.
I am having trouble installing the Prometheus Operator from the Helm stable chart. When I run the install command `helm install prom stable/prometheus-operator`, the following warning message is shown six times:
manifest_sorter.go:192: info: skipping unknown hook: "crd-install"
The installation continues and the pods are deployed, but the prometheus-node-exporter pod ends up in the state CrashLoopBackOff.
I cannot see the detailed cause, because the only message when describing the pod is "Back-off restarting failed container".
I am running Minikube version 1.7.2 and Helm version 3.1.1.
>>> UPDATE <<<
Output of describing the problematic pod:
$ kubectl describe pod prom-oper-prometheus-node-exporter-2m6vm -n default

Name:           prom-oper-prometheus-node-exporter-2m6vm
Namespace:      default
Priority:       0
Node:           max-ubuntu/10.2.40.198
Start Time:     Wed, 04 Mar 2020 18:06:44 +0000
Labels:         app=prometheus-node-exporter
                chart=prometheus-node-exporter-1.8.2
                controller-revision-hash=68695df4c5
                heritage=Helm
                jobLabel=node-exporter
                pod-template-generation=1
                release=prom-oper
Annotations:    <none>
Status:         Running
IP:             10.2.40.198
IPs:
  IP:           10.2.40.198
Controlled By:  DaemonSet/prom-oper-prometheus-node-exporter
Containers:
  node-exporter:
    Container ID:  docker://50b2398f72a0269672c4ac73bbd1b67f49732362b4838e16cd10e3a5247fdbfe
    Image:         quay.io/prometheus/node-exporter:v0.18.1
    Image ID:      docker-pullable://quay.io/prometheus/node-exporter@sha256:a2f29256e53cc3e0b64d7a472512600b2e9410347d53cdc85b49f659c17e02ee
    Port:          9100/TCP
    Host Port:     9100/TCP
    Args:
      --path.procfs=/host/proc
      --path.sysfs=/host/sys
      --web.listen-address=0.0.0.0:9100
      --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
      --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 04 Mar 2020 18:10:10 +0000
      Finished:     Wed, 04 Mar 2020 18:10:10 +0000
    Ready:          False
    Restart Count:  5
    Liveness:       http-get http://:9100/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:9100/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /host/proc from proc (ro)
      /host/sys from sys (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from prom-oper-prometheus-node-exporter-token-n9dj9 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  proc:
    Type:          HostPath (bare host directory volume)
    Path:          /proc
    HostPathType:
  sys:
    Type:          HostPath (bare host directory volume)
    Path:          /sys
    HostPathType:
  prom-oper-prometheus-node-exporter-token-n9dj9:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  prom-oper-prometheus-node-exporter-token-n9dj9
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     :NoSchedule
                 node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/network-unavailable:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/pid-pressure:NoSchedule
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
Events:
  Type     Reason     Age                    From                 Message
  ----     ------     ----                   ----                 -------
  Normal   Scheduled  5m26s                  default-scheduler    Successfully assigned default/prom-oper-prometheus-node-exporter-2m6vm to max-ubuntu
  Normal   Started    4m28s (x4 over 5m22s)  kubelet, max-ubuntu  Started container node-exporter
  Normal   Pulled     3m35s (x5 over 5m24s)  kubelet, max-ubuntu  Container image "quay.io/prometheus/node-exporter:v0.18.1" already present on machine
  Normal   Created    3m35s (x5 over 5m24s)  kubelet, max-ubuntu  Created container node-exporter
  Warning  BackOff    13s (x30 over 5m18s)   kubelet, max-ubuntu  Back-off restarting failed container
Output of the logs of the problematic pod:
$ kubectl logs prom-oper-prometheus-node-exporter-2m6vm -n default

time="2020-03-04T18:18:02Z" level=info msg="Starting node_exporter (version=0.18.1, branch=HEAD, revision=3db77732e925c08f675d7404a8c46466b2ece83e)" source="node_exporter.go:156"
time="2020-03-04T18:18:02Z" level=info msg="Build context (go=go1.12.5, user=root@b50852a1acba, date=20190604-16:41:18)" source="node_exporter.go:157"
time="2020-03-04T18:18:02Z" level=info msg="Enabled collectors:" source="node_exporter.go:97"
time="2020-03-04T18:18:02Z" level=info msg=" - arp" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - bcache" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - bonding" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - conntrack" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - cpu" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - cpufreq" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - diskstats" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - edac" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - entropy" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - filefd" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - filesystem" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - hwmon" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - infiniband" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - ipvs" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - loadavg" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - mdadm" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - meminfo" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - netclass" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - netdev" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - netstat" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - nfs" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - nfsd" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - pressure" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - sockstat" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - stat" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - textfile" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - time" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - timex" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - uname" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - vmstat" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - xfs" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - zfs" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg="Listening on 0.0.0.0:9100" source="node_exporter.go:170"
time="2020-03-04T18:18:02Z" level=fatal msg="listen tcp 0.0.0.0:9100: bind: address already in use" source="node_exporter.go:172"
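The fatal line at the end is the real cause: node-exporter cannot bind port 9100 because something on the host is already listening there. A quick way to check from the node itself (a minimal sketch using bash's built-in /dev/tcp, so it needs no extra tools; run it on the node, e.g. via minikube ssh):

```shell
#!/bin/bash
# Returns 0 if something is already listening on host:port, 1 otherwise.
# The subshell opens (and implicitly closes) a TCP connection on fd 3.
port_in_use() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

if port_in_use 127.0.0.1 9100; then
  echo "port 9100 is in use"
else
  echo "port 9100 is free"
fi
```

A leftover node-exporter (or another exporter) from a previous install is a common holder of 9100; freeing the port, or overriding the exporter's port in the chart values, lets the DaemonSet pod start.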
Solution
This is an issue related to Helm 3. It has affected many charts, such as argo or ambassador. You can find the information in the Helm documentation that the crd-install hook has been removed:

> Note that the crd-install hook has been removed in favor of the crds/ directory in Helm 3.
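For context, the Helm 3 replacement is simply a crds/ directory at the top level of a chart: the manifests in it are installed before the templates are rendered, are not templated, and are not deleted on uninstall. A sketch of the layout (names illustrative):

```
mychart/
  Chart.yaml
  crds/                    # applied first; plain YAML, no templating
    servicemonitors.yaml
  templates/
    deployment.yaml
  values.yaml
```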
I have deployed this chart myself and also got the Helm messages about skipping the unknown hook, but I had no problems with the pods.
An alternative approach is to create the CRDs before installing the chart. The steps to do this can be found here. In the first step you have the commands to create the CRDs:
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.36/example/prometheus-operator-crd/monitoring.coreos.com_alertmanagers.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.36/example/prometheus-operator-crd/monitoring.coreos.com_podmonitors.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.36/example/prometheus-operator-crd/monitoring.coreos.com_prometheuses.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.36/example/prometheus-operator-crd/monitoring.coreos.com_prometheusrules.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.36/example/prometheus-operator-crd/monitoring.coreos.com_servicemonitors.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.36/example/prometheus-operator-crd/monitoring.coreos.com_thanosrulers.yaml
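After applying the manifests, you can verify that all six CRDs are registered before running the install (a minimal sketch; it assumes kubectl is pointed at your Minikube cluster):

```shell
# The six CRDs the prometheus-operator chart (release-0.36) expects.
crds="alertmanagers podmonitors prometheuses prometheusrules servicemonitors thanosrulers"

for c in $crds; do
  if kubectl get crd "${c}.monitoring.coreos.com" >/dev/null 2>&1; then
    echo "ok:      ${c}.monitoring.coreos.com"
  else
    echo "missing: ${c}.monitoring.coreos.com"
  fi
done
```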
The last step is to execute helm install:
helm install --name my-release stable/prometheus-operator --set prometheusOperator.createCustomResource=false
But Helm 3 does not recognize the --name flag:
Error: unknown flag: --name
You have to remove this flag. It should look like this:
$ helm install prom-oper stable/prometheus-operator --set prometheusOperator.createCustomResource=false
NAME: prom-oper
LAST DEPLOYED: Wed Mar 4 14:12:35 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
The Prometheus Operator has been installed. Check its status by running:
kubectl --namespace default get pods -l "release=prom-oper"
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
alertmanager-prom-oper-prometheus-opera-alertmanager-0 2/2 Running 0 9m46s
...
prom-oper-prometheus-node-exporter-25b27 1/1 Running 0 9m56s
If you have problems with the repo, you only need to execute:
helm repo add stable https://kubernetes-charts.storage.googleapis.com
helm repo update
If this alternative approach does not help, please add to your question the output of:
kubectl describe pod <pod-name> -n <pod-namespace>
and kubectl logs <pod-name> -n <pod-namespace>