kubernetes - GKE's cluster autoscaler is stuck in Initializing status
Problem description
I was recently optimizing the utilization of a GKE cluster, and two days ago I noticed that my nodes were not scaling up or down. The cluster autoscaler status config map is stuck in the Initializing state:
kubectl describe -n kube-system configmap cluster-autoscaler-status
Name: cluster-autoscaler-status
Namespace: kube-system
Labels: <none>
Annotations: cluster-autoscaler.kubernetes.io/last-updated: 2020-04-29 14:44:54.363091383 +0000 UTC
Data
====
status:
----
Cluster-autoscaler status at 2020-04-29 14:44:54.363091383 +0000 UTC:
Initializing
Events: <none>
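To check for this state without reading the full describe output, the status text can be pulled straight from the config map. This is a minimal sketch, assuming the text lives under the `status` data key as the output above suggests; the helper name `autoscaler_state` is hypothetical.

```shell
# Hypothetical helper: reads the autoscaler status text on stdin and prints
# the first line that follows the "Cluster-autoscaler status at ..." header.
autoscaler_state() {
  awk '/^Cluster-autoscaler status at/ {
         if ((getline line) > 0) print line
         exit
       }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get configmap cluster-autoscaler-status -n kube-system \
#     -o jsonpath='{.data.status}' | autoscaler_state
```

On the cluster described here this would be expected to print Initializing; on a cluster that reports normally it prints whatever line follows the header instead.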
The other clusters show proper autoscaling events. I suspect I may have overloaded this cluster with the number of pods: it runs ~100 pods per node.
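The pods-per-node figure can be verified by counting scheduled pods per node. The sketch below is one way to do that; the helper name `pods_per_node` is hypothetical, and it only counts pods that already have a node assigned. For comparison, the kubelet's default limit is 110 pods per node, though nothing in the question confirms that this limit is related to the problem.

```shell
# Hypothetical helper: reads one node name per line on stdin and prints
# "<node> <pod count>", sorted by node name. Empty lines (pods that are
# not scheduled yet) are skipped.
pods_per_node() {
  awk 'NF { count[$1]++ } END { for (n in count) print n, count[n] }' | sort
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get pods --all-namespaces \
#     -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}' | pods_per_node
```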
Update 1:
- What GKE version is running on the master?: 1.14.10-gke.27. I thought the upgrade to 1.15.11-gke.9 would help (and would somehow affect the master), but it didn't. We have other clusters with the same versions and pools.
- Does it happen to all node pools, or only to a specific one?: The autoscaling config map is at a kind of "global level", so all node pools are affected.
- Could you provide the pool sizes, gke-versions and autoscaling settings?
Name                      Status  Version         Nodes               Machine type    Image type                    Autoscaling
default                   OK      1.14.10-gke.27  4 (2 per zone)      custom-8-45056  Container-Optimized OS (cos)  0 - 8 nodes per zone
preemptible8-2            OK      1.14.10-gke.27  10 (5 per zone)     n1-standard-8   Container-Optimized OS (cos)  0 - 20 nodes per zone
scalability-stable-2-cpu  OK      1.14.10-gke.27  1 (0 - 1 per zone)  n1-standard-2   Container-Optimized OS (cos)  0 - 4 nodes per zone
Additional information:
- When I turned autoscaling off and back on in every node pool, the output of
kubectl describe -n kube-system configmap cluster-autoscaler-status
changed.
- I suspect it happened while I was changing the settings of the scalability-stable-2-cpu pool.
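For reference, turning autoscaling off and back on per node pool can also be done from the command line. The sketch below only prints the `gcloud` commands rather than running them. The cluster name and location are placeholders (assumptions, not from the question), the function name is hypothetical, and the question does not say whether the console or the CLI was used.

```shell
# Placeholders (assumptions, not from the question): cluster name and location.
CLUSTER="my-cluster"
LOCATION="--region=europe-west1"

# Prints the gcloud commands that turn autoscaling off and back on for one
# node pool. Pipe the output to `sh` to actually run them.
toggle_autoscaling_cmds() {
  pool="$1"; min="$2"; max="$3"
  echo "gcloud container clusters update $CLUSTER $LOCATION --node-pool=$pool --no-enable-autoscaling"
  echo "gcloud container clusters update $CLUSTER $LOCATION --node-pool=$pool --enable-autoscaling --min-nodes=$min --max-nodes=$max"
}

# Per-zone limits taken from the pool list in the question.
toggle_autoscaling_cmds default 0 8
toggle_autoscaling_cmds preemptible8-2 0 20
toggle_autoscaling_cmds scalability-stable-2-cpu 0 4
```

Printing the commands first makes it easy to review the min/max values before anything is changed on the cluster.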
Solution
It returned to normal after 3 days.