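kubernetes - Scale GKE pods based on the number of active connections per pod
Problem description
I have a running GKE cluster with an HPA that uses a target CPU utilization metric. This works, but CPU utilization is not the best scaling metric for us. Analysis suggests that the number of active connections is a good indicator of general platform load, so we would like to use it as our primary scaling metric.
To that end, I have enabled custom metrics for the NGINX ingress that we use. From these we can see the active connection count, request rate, and so on.
Here is the HPA specification using the NGINX custom metric: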
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-uat-active-connections
  namespace: default
spec:
  minReplicas: 3
  maxReplicas: 6
  metrics:
    - type: Pods
      pods:
        metricName: custom.googleapis.com|nginx-ingress-controller|nginx_ingress_controller_nginx_process_connections
        selector:
          matchLabels:
            metric.labels.state: active
            resource.labels.cluster_name: "[redacted]"
        targetAverageValue: 5
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: "[redacted]"
However, although this specification deploys fine, I always get this output from the HPA:
NAME                         REFERENCE               TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
hpa-uat-active-connections   Deployment/[redacted]   <unknown>/5   3         6         3          31s
In short, the target value is "unknown", and so far I have been unable to understand or resolve why. The custom metric does exist:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/custom.googleapis.com|nginx-ingress-controller|nginx_ingress_controller_nginx_process_connections?labelSelector=metric.labels.state%3Dactive,resource.labels.cluster_name%3D[redacted]" | jq
This returns:
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/custom.googleapis.com%7Cnginx-ingress-controller%7Cnginx_ingress_controller_nginx_process_connections"
  },
  "items": [
    {
      "metricName": "custom.googleapis.com|nginx-ingress-controller|nginx_ingress_controller_nginx_process_connections",
      "metricLabels": {
        "metric.labels.controller_class": "nginx",
        "metric.labels.controller_namespace": "ingress-nginx",
        "metric.labels.controller_pod": "nginx-ingress-controller-54f84b8dff-sml6l",
        "metric.labels.state": "active",
        "resource.labels.cluster_name": "[redacted]",
        "resource.labels.container_name": "",
        "resource.labels.instance_id": "[redacted]-eac4b327-stqn",
        "resource.labels.namespace_id": "ingress-nginx",
        "resource.labels.pod_id": "nginx-ingress-controller-54f84b8dff-sml6l",
        "resource.labels.project_id": "[redacted]",
        "resource.labels.zone": "[redacted]",
        "resource.type": "gke_container"
      },
      "timestamp": "2019-12-30T14:11:01Z",
      "value": "1"
    }
  ]
}
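So I have two questions, really:
- (The main one): what am I doing wrong here that prevents the HPA from reading the metric?
- Is this the right way to try to scale on the average active connection load across a number of pods?
Many thanks, Ben
Edit 1
kubectl get all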
NAME                                             READY   STATUS    RESTARTS   AGE
pod/[redacted]-deployment-7f5fbc9ddf-l9tqk       1/1     Running   0          34h
pod/[redacted]-uat-deployment-7f5fbc9ddf-pbcns   1/1     Running   0          34h
pod/[redacted]-uat-deployment-7f5fbc9ddf-tjfrm   1/1     Running   0          34h

NAME                                TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)      AGE
service/[redacted]-webapp-service   NodePort    [redacted]   <none>        [redacted]   57d
service/kubernetes                  ClusterIP   [redacted]   <none>        [redacted]   57d

NAME                                        READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/[redacted]-uat-deployment   3/3     3            3           57d

NAME                                                   DESIRED   CURRENT   READY   AGE
replicaset.apps/[redacted]-uat-deployment-54b6bd5f9c   0         0         0       12d
replicaset.apps/[redacted]-uat-deployment-574c778cc9   0         0         0       35h
replicaset.apps/[redacted]-uat-deployment-66546bf76b   0         0         0       11d
replicaset.apps/[redacted]-uat-deployment-698dfbb6c4   0         0         0       4d
replicaset.apps/[redacted]-uat-deployment-69b5c79d54   0         0         0       6d17h
replicaset.apps/[redacted]-uat-deployment-6f67ff6599   0         0         0       10d
replicaset.apps/[redacted]-uat-deployment-777bfdbb9d   0         0         0       3d23h
replicaset.apps/[redacted]-uat-deployment-7f5fbc9ddf   3         3         3       34h
replicaset.apps/[redacted]-uat-deployment-9585454ff    0         0         0       6d21h
replicaset.apps/[redacted]-uat-deployment-97cbcfc6     0         0         0       17d
replicaset.apps/[redacted]-uat-deployment-c776f648d    0         0         0       10d

NAME                                                            REFERENCE                              TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/[redacted]-uat-deployment   Deployment/[redacted]-uat-deployment   4%/80%    3         6         3          9h
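Solution
OK, I managed to solve this by looking up the schema for the HPA (https://docs.okd.io/latest/rest_api/apis-autoscaling/v2beta1.HorizontalPodAutoscaler.html).
In short, I was using the wrong metric type: as you can see above, I was using "Pods", but I should have been using "External".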
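For context, my understanding is that the two metric types are read from different aggregated APIs: a Pods metric is looked up through custom.metrics.k8s.io against the pods of the scale target, while an External metric is looked up through external.metrics.k8s.io, which is the endpoint the kubectl query in the question hit successfully. A minimal Python sketch of the two request paths follows. The helper names and the example cluster name are hypothetical, and the path layout is my reading of the metrics APIs rather than something stated in the original post.

```python
from urllib.parse import quote

METRIC = ("custom.googleapis.com|nginx-ingress-controller|"
          "nginx_ingress_controller_nginx_process_connections")


def label_selector(labels):
    """Encode a dict of labels as a URL-safe labelSelector value."""
    raw = ",".join(f"{key}={value}" for key, value in labels.items())
    # Keep commas readable; '=' becomes %3D, as in the kubectl query above.
    return quote(raw, safe=",")


def external_metric_path(namespace, metric, labels):
    """Path served by the external metrics API (HPA metric type: External)."""
    return (f"/apis/external.metrics.k8s.io/v1beta1/namespaces/{namespace}/"
            f"{metric}?labelSelector={label_selector(labels)}")


def pods_metric_path(namespace, metric):
    """Path served by the custom metrics API (HPA metric type: Pods)."""
    return (f"/apis/custom.metrics.k8s.io/v1beta1/namespaces/{namespace}/"
            f"pods/*/{metric}")


if __name__ == "__main__":
    labels = {
        "metric.labels.state": "active",
        "resource.labels.cluster_name": "my-cluster",  # hypothetical name
    }
    print(external_metric_path("default", METRIC, labels))
    print(pods_metric_path("default", METRIC))
```

Either printed path can be passed to kubectl get --raw to check which API actually serves the metric before choosing the metric type in the HPA.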
The correct HPA spec is:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-uat-active-connections
  namespace: default
spec:
  minReplicas: 3
  maxReplicas: 6
  metrics:
    - type: External
      external:
        metricName: custom.googleapis.com|nginx-ingress-controller|nginx_ingress_controller_nginx_process_connections
        metricSelector:
          matchLabels:
            metric.labels.state: active
            resource.labels.cluster_name: "[redacted]"
        targetAverageValue: 5
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: "[redacted]"
Once I did this, things worked right away:
NAME                         REFERENCE                                 TARGETS        MINPODS   MAXPODS   REPLICAS   AGE
hpa-uat-active-connections   Deployment/bustle-webapp-uat-deployment   334m/5 (avg)   3         6         3          30s
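The 334m/5 (avg) reading is consistent with how, as far as I understand the HPA controller, an External metric with targetAverageValue is handled: the metric values are summed, the sum is divided by the current replica count for display (rounded up in milli-units), and the desired replica count is the sum divided by the target, rounded up. The Python sketch below reproduces that arithmetic for the single value of 1 from the question and 3 replicas. The function names are hypothetical, the 0.1 tolerance is the controller's default, and this approximates the controller's behaviour rather than reproducing its actual code.

```python
import math


def average_milli(total_milli, replicas):
    """Per-pod average in milli-units, rounded up as in the TARGETS column."""
    return math.ceil(total_milli / replicas)


def desired_replicas(total_milli, target_avg_milli, current, min_r, max_r,
                     tolerance=0.1):
    """Approximate replica count for an External metric with targetAverageValue."""
    usage_ratio = total_milli / (target_avg_milli * current)
    proposed = current
    # Only rescale when the ratio falls outside the tolerance band around 1.0.
    if abs(1.0 - usage_ratio) > tolerance:
        proposed = math.ceil(total_milli / target_avg_milli)
    # The HPA never goes below minReplicas or above maxReplicas.
    return max(min_r, min(max_r, proposed))


if __name__ == "__main__":
    total = 1 * 1000   # one active connection reported, in milli-units
    target = 5 * 1000  # targetAverageValue: 5
    print(f"{average_milli(total, 3)}m")                  # per-pod average
    print(desired_replicas(total, target, 3, 3, 6))       # held at minReplicas
    print(desired_replicas(40 * 1000, target, 3, 3, 6))   # capped at maxReplicas
```

On the second question, note that under this reading the average is the total connection count reported by the ingress controller divided by the replica count of the target Deployment, not a figure measured on each pod individually.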