GKE: How to handle deployments with CPU intensive initialization?

Problem description

I have a GKE cluster (n1-standard-1, master version 1.13.6-gke.13) with 3 nodes on which I have 7 deployments, each running a Spring Boot application. A default Horizontal Pod Autoscaler was created for each deployment, with target CPU 80% and min 1 / max 5 replicas.

During normal operation, there is typically 1 pod per deployment and CPU usage is at 1-5%. But when the application starts, e.g. after a rolling update, CPU usage spikes and the HPA scales up to the maximum number of replicas, reporting CPU usage at 500% or more.

When multiple deployments are started at the same time, e.g. after a cluster upgrade, various pods often become unschedulable because the cluster is out of CPU, and some pods end up in the "Preempting" state.

I have changed the HPAs to max 2 replicas since currently that's enough. But I will be adding more deployments in the future and it would be nice to know how to handle this correctly. I'm quite new to Kubernetes and GCP so I'm not sure how to approach this.

Here is the CPU chart for one of the containers after a cluster upgrade earlier today:

[Image: CPU usage chart]

Everything runs in the default namespace and I haven't touched the default LimitRange with 100m default CPU request. Should I modify this and set limits? Given that the initialization is resource demanding, what would the proper limits be? Or do I need to upgrade the machine type with more CPU?

Tags: spring-boot, kubernetes, google-kubernetes-engine, autoscaling

Solution


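The HPA only takes ready pods into account. Since your pods only experience a CPU spike during the early stages of startup, your best bet is to configure a readiness probe that either reports ready only once CPU usage has come down, or has initialDelaySeconds set longer than the startup period. Either way, the pod is not counted as ready during the spike, so the HPA does not take that CPU usage into account.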
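As a rough illustration of the initialDelaySeconds variant, here is a minimal sketch of a Deployment with a readiness probe. Everything specific in it is an assumption rather than something taken from the question: the name my-spring-app, the image, port 8080, the /actuator/health path (which only exists if Spring Boot Actuator is on the classpath), and the 120 second delay, which you would replace with a value measured from your own startup time.

```yaml
# Sketch only: name, image, port, probe path and timings are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-spring-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-spring-app
  template:
    metadata:
      labels:
        app: my-spring-app
    spec:
      containers:
        - name: app
          image: gcr.io/my-project/my-spring-app:1.0.0
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              # Assumes Spring Boot Actuator exposes its health endpoint here.
              path: /actuator/health
              port: 8080
            # Assumed to be longer than the CPU-heavy startup phase.
            initialDelaySeconds: 120
            periodSeconds: 10
            failureThreshold: 3
```

With a probe like this, the pod stays unready for at least the first 120 seconds, so it also receives no Service traffic during that window. Note that the HPA computes CPU utilization as a percentage of the container's CPU request, which is why a startup burst against the 100m default request can show up as 500% or more.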

