Preventing Docker Container CPU Resets?

Problem description

This is a tricky one and is a little hard to explain but I will give it a shot to see if anyone out there has had a similar issue + fix.

Quick background:
Running a large Java Spring app on Tomcat in a Docker container. The other containers are simple: one for a JMS queue and the other for MySQL. I run on Windows and have given Docker as much CPU as I have (and memory too). I have set JAVA_OPTS for Catalina to max out memory as well as memory limits in my docker-compose, but the issue seems to be CPU related.

When the app is idling it normally is sitting around 103% CPU (8 Cores, 800% max). There is a process we use which (using a Thread Pool) runs some workers to go out and run some code. On my local host (no docker in between) it runs very fast and flies, spitting out logs at a good clip.

Problem: When running in Docker and watching docker stats -a, I can see the CPU start to ramp up when this process begins. Meanwhile in the logs, everything is flying by as expected while the CPU grows and grows. It seems to get close to 700% and then it kind of dies, but it doesn't. When it hits this threshold I see the CPU drop drastically down to < 5%, where it stays for a little while. At this time logs stop printing, so I assume nothing is happening. Eventually it will kick back in, go back to ~120% and continue its process like nothing happened, sometimes respiking to ~400%.

What I am trying

I have played around with the memory settings to no success but it seems more like a CPU issue. I know Java in Docker is a bit wonky but I have given it all the room I can on my beefy dev box where locally this process runs without a hitch. I find it odd the CPU spikes then dies, but the container itself doesn't die or reset. Has anyone seen a similar issue or know some ways to further attack this CPU issue with Docker?

Thanks.

Tags: java, spring, docker, tomcat

Answer


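There is a resource allocation problem with JVMs in containers, because the JVM looks at the metrics of the whole system rather than those of the container. In Java 7 and 8, JVM ergonomics apply the system's (instance's) metrics, such as the number of cores and the amount of memory, instead of the resources (cores and memory) allocated by docker. As a result, the JVM initializes a number of parameters based on the core count and memory, as listed below.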

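JVM memory footprint

- Perm/Metaspace

- JIT bytecode

- Heap size (JVM ergonomics: ¼ of instance memory)

CPU

- Number of JIT compiler threads

- Number of garbage collection threads

- Number of threads in the common fork-join pool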
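To see which numbers a given JVM is actually working from, a small report like the one below can be run both on the host and inside the container. This is only a sketch: the class name JvmErgonomicsReport and the helper toMiB are made up for illustration, and the values printed depend on the JVM version and flags in use.

```java
import java.util.concurrent.ForkJoinPool;

// Hypothetical helper class: prints the values the JVM bases its default sizing on.
class JvmErgonomicsReport {

    // Formats a byte count as whole mebibytes.
    static String toMiB(long bytes) {
        return (bytes / (1024L * 1024L)) + " MiB";
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Core count as seen by the JVM (not necessarily what docker allocated).
        System.out.println("availableProcessors    = " + rt.availableProcessors());
        // Maximum heap the JVM will attempt to use.
        System.out.println("maxMemory (max heap)   = " + toMiB(rt.maxMemory()));
        // Parallelism of the common fork-join pool, derived from the core count by default.
        System.out.println("commonPool parallelism = " + ForkJoinPool.commonPool().getParallelism());
    }
}
```

If the container prints the same core count and heap size as the host despite its limits, the JVM is sizing itself from the instance rather than from the container.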

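As a result, containers tend to become unresponsive because of high CPU usage, or they get terminated by the OOM killer. The reason is that the JVM ignores the container's cgroups and namespaces, which are what limit memory and CPU cycles. The JVM therefore tends to grab more of the instance's resources instead of staying within what docker allocated to the container.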

Example

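Assume two containers are running on a 4-core instance with 8 GB of memory. At docker initialization, assume each container is given 1 GB of memory and 2048 CPU cycles as hard limits. Each container still sees all 4 cores, and each JVM sizes its memory, JIT compiler threads and GC threads separately based on what it sees. In other words, each JVM sees the total number of cores on the instance (4) and uses that value to initialize the default thread counts listed earlier. The JVM metrics for the two containers therefore end up as follows.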

- 4 * 2 JIT compiler threads

- 4 * 2 garbage collection threads

- 2 GB heap size * 2 (¼ of the instance's full memory, not of the memory allocated by docker)

In terms of memory

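In the example above, the JVM will gradually increase its heap usage, because it sees a maximum heap size of 2 GB, which is one quarter of the instance memory (8 GB). Once the container's memory usage reaches the 1 GB hard limit, the container is terminated by the OOM killer.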
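The arithmetic behind this can be written out as a short sketch. It simply applies the ¼ rule described in this answer to the example's numbers (8 GB instance, 1 GB container limit); the class and method names are invented for illustration, and a real JVM's default may differ depending on version and flags.

```java
// Hypothetical illustration of the 1/4-of-visible-memory rule from the example.
class HeapVsLimit {

    static final long MIB = 1024L * 1024L;

    // Default max heap under the rule in this answer: one quarter of the memory the JVM sees.
    static long defaultMaxHeap(long visibleMemoryBytes) {
        return visibleMemoryBytes / 4;
    }

    // True when the default max heap alone is larger than the container's hard limit.
    static boolean canExceedLimit(long visibleMemoryBytes, long containerLimitBytes) {
        return defaultMaxHeap(visibleMemoryBytes) > containerLimitBytes;
    }

    public static void main(String[] args) {
        long instanceMemory = 8192 * MIB;   // 8 GB instance from the example
        long containerLimit = 1024 * MIB;   // 1 GB docker hard limit from the example

        System.out.println("JVM sees the instance:  max heap = "
                + defaultMaxHeap(instanceMemory) / MIB + " MiB, can exceed limit = "
                + canExceedLimit(instanceMemory, containerLimit));
        System.out.println("JVM sees the container: max heap = "
                + defaultMaxHeap(containerLimit) / MIB + " MiB, can exceed limit = "
                + canExceedLimit(containerLimit, containerLimit));
    }
}
```

Setting an explicit -Xmx below the container limit has the same effect as the second case: the heap can no longer outgrow what docker allows.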

In terms of CPU

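In the example above, each JVM has been initialized with 4 garbage collection threads and 4 JIT compiler threads. However, docker has allocated only 2048 CPU cycles. This leads to high CPU usage and more context switching, the container becomes unresponsive, and eventually the container is terminated because of the high CPU usage.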
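The same reasoning applies to application thread pools such as the worker pool in the question: if the pool is sized from the core count the JVM reports, it can end up larger than the CPU the container is allowed to use. One possible approach, sketched below, is to make the pool size explicit. The environment variable WORKER_THREADS and the class BoundedWorkerPool are assumptions made for this example, not something from the original setup.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: size a worker pool explicitly instead of trusting the visible core count.
class BoundedWorkerPool {

    // Uses the configured value when it is a positive integer,
    // otherwise falls back to the CPU count reported by the JVM (at least 1).
    static int resolvePoolSize(String configured, int visibleCpus) {
        if (configured != null) {
            try {
                int n = Integer.parseInt(configured.trim());
                if (n > 0) {
                    return n;
                }
            } catch (NumberFormatException ignored) {
                // Not a number: fall through to the default below.
            }
        }
        return Math.max(1, visibleCpus);
    }

    public static void main(String[] args) throws InterruptedException {
        // WORKER_THREADS is an assumed, user-defined environment variable.
        int size = resolvePoolSize(System.getenv("WORKER_THREADS"),
                Runtime.getRuntime().availableProcessors());
        System.out.println("worker pool size = " + size);

        ExecutorService pool = Executors.newFixedThreadPool(size);
        for (int i = 0; i < 4; i++) {
            final int id = i;
            pool.submit(() -> System.out.println(
                    "task " + id + " on " + Thread.currentThread().getName()));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

In a docker-compose setup the variable could be set per container so that the pool matches the CPU budget given to that container.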

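Solution

Basically, there are two mechanisms, cgroups and namespaces, that handle this situation at the operating system level. Java 7 and 8 do not honor cgroups and namespaces, but releases after jdk_1.8.131 can enable the cgroup limit through JVM parameters (-XX:+UseCGroupMemoryLimitForHeap, -XX:+UnlockExperimentalVMOptions). However, this only solves the memory problem and does not address the CPU set problem.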
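Whether the JVM is respecting the container's memory limit can be checked from inside the container by comparing the cgroup limit with the JVM's maximum heap. The sketch below assumes a Linux container where the limit is exposed at /sys/fs/cgroup/memory/memory.limit_in_bytes (cgroup v1) or /sys/fs/cgroup/memory.max (cgroup v2); the class name is made up, and on other systems it simply reports that no limit was found.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.OptionalLong;

// Hypothetical sketch: compare the cgroup memory limit with the JVM's max heap.
class CgroupMemoryCheck {

    // Parses the content of a cgroup limit file; "max" (cgroup v2) means no limit.
    static OptionalLong parseLimit(String raw) {
        String s = raw.trim();
        if (s.isEmpty() || s.equals("max")) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(s));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    // Tries the cgroup v1 path first, then the cgroup v2 path.
    static OptionalLong readLimit() {
        String[] candidates = {
            "/sys/fs/cgroup/memory/memory.limit_in_bytes",
            "/sys/fs/cgroup/memory.max"
        };
        for (String candidate : candidates) {
            Path path = Paths.get(candidate);
            if (Files.isReadable(path)) {
                try {
                    return parseLimit(new String(Files.readAllBytes(path), StandardCharsets.UTF_8));
                } catch (IOException e) {
                    // Could not read this file: try the next candidate.
                }
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory();
        OptionalLong limit = readLimit();
        System.out.println("JVM max heap = " + maxHeap + " bytes");
        if (limit.isPresent()) {
            System.out.println("cgroup limit = " + limit.getAsLong() + " bytes");
            System.out.println("heap can outgrow limit = " + (maxHeap > limit.getAsLong()));
        } else {
            System.out.println("no cgroup memory limit found");
        }
    }
}
```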

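With OpenJDK 9, the JVM detects CPU sets automatically. Especially under orchestration, the default thread counts can also be overridden manually to match the container's CPU cycle count by using JVM flags (-XX:ParallelGCThreads, -XX:ConcGCThreads).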
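To confirm which GC thread counts a running JVM ended up with, the flag values can be read back at runtime. The sketch below assumes a HotSpot-based JVM, where com.sun.management.HotSpotDiagnosticMXBean is available; on other JVMs it reports "unavailable". The class name GcThreadFlags is made up for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical sketch: read back GC thread flags on a HotSpot-based JVM.
class GcThreadFlags {

    // Returns the current value of a VM option, or "unavailable" if it cannot be read.
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // Unknown option, or not a HotSpot JVM.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println("ParallelGCThreads   = " + flagValue("ParallelGCThreads"));
        System.out.println("ConcGCThreads       = " + flagValue("ConcGCThreads"));
        System.out.println("availableProcessors = " + Runtime.getRuntime().availableProcessors());
    }
}
```

After compiling, it can be started with explicit overrides, for example java -XX:ParallelGCThreads=2 -XX:ConcGCThreads=1 GcThreadFlags, to check that the overrides took effect inside the container.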

