Spring Boot on Kubernetes does not restart on java.lang.OutOfMemoryError: Java heap space

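Question

I have been running a Spring Boot application on Kubernetes using a JDK 11 image. My expectation is that when the JVM hits an out-of-memory error, the pod should be killed so that Kubernetes can start a bigger pod. I can confirm that this is not what happens. I am not sure whether I am missing some JVM parameters that must be set, or perhaps some Kubernetes configuration that is needed to detect this condition.

I am using the following JVM parameters: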

-XX:InitialRAMPercentage=20.0 -XX:MinRAMPercentage=50.0 -XX:MaxRAMPercentage=80.0 -XX:+HeapDumpOnOutOfMemoryError -XX:+ExitOnOutOfMemoryError
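
Since the question is partly about whether the JVM parameters are being picked up at all, one way to check is to print the arguments the running JVM was actually started with. The following is a minimal sketch; the class JvmFlagCheck and the helper isEnabled are hypothetical names used for illustration and are not part of the application in the question. HotSpot option names are case-sensitive, so the helper matches the flag exactly.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical diagnostic helper, not part of the original application.
class JvmFlagCheck {

    // Returns true if the argument list enables the given boolean -XX flag.
    // The comparison is exact and case-sensitive.
    static boolean isEnabled(List<String> jvmArgs, String flagName) {
        return jvmArgs.contains("-XX:+" + flagName);
    }

    public static void main(String[] args) {
        // Arguments the running JVM reports it was started with.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        jvmArgs.forEach(System.out::println);
        System.out.println("ExitOnOutOfMemoryError enabled: "
                + isEnabled(jvmArgs, "ExitOnOutOfMemoryError"));
    }
}
```

Running the same check inside the pod (for example, by logging the argument list once at startup) would show whether the options in the container's command line or environment reach the JVM unchanged.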

The exceptions thrown:

{
  "message": "Stopping container due to an Error",
  "logger_name": "org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer",
  "thread_name": "KafkaConsumerDestination{consumerDestinationName='message-submitted', partitions=21, dlqName='dlq'}.container-0-C-1",
  "level": "ERROR",
  "stack_trace": "java.lang.OutOfMemoryError: Java heap space\n\tat java.base/java.util.Arrays.copyOf(Arrays.java:3745)\n\tat java.base/java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:120)\n\tat java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:95)\n\tat java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:156)\n\tat software.amazon.awssdk.utils.IoUtils.toByteArray(IoUtils.java:49)\n\tat software.amazon.awssdk.core.sync.ResponseTransformer.lambda$toBytes$3(ResponseTransformer.java:175)\n\tat software.amazon.awssdk.core.sync.ResponseTransformer$$Lambda$1517/0x0000000101087040.transform(Unknown Source)\n\tat software.amazon.awssdk.core.client.handler.BaseSyncClientHandler$HttpResponseHandlerAdapter.transformResponse(BaseSyncClientHandler.java:154)\n\tat software.amazon.awssdk.core.client.handler.BaseSyncClientHandler$HttpResponseHandlerAdapter.handle(BaseSyncClientHandler.java:142)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.handleSuccessResponse(HandleResponseStage.java:89)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.handleResponse(HandleResponseStage.java:70)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:58)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:41)\n\tat software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:64)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:36)\n\tat 
software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:77)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:39)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage$RetryExecutor.doExecute(RetryableStage.java:113)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage$RetryExecutor.execute(RetryableStage.java:86)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:62)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:42)\n\tat software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)\n\tat software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:57)\n\tat software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:37)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)\n\tat software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)\n\tat software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)\n\tat 
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)\n"
}


{
  "message": "Error while stopping the container: ",
  "logger_name": "org.springframework.kafka.listener.KafkaMessageListenerContainer",
  "thread_name": "KafkaConsumerDestination{consumerDestinationName='message-submitted', partitions=21, dlqName='dlq'}.container-0-C-1",
  "level": "ERROR",
  "stack_trace": "java.lang.OutOfMemoryError: Java heap space\n\tat java.base/java.util.Arrays.copyOf(Arrays.java:3745)\n\tat java.base/java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:120)\n\tat java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:95)\n\tat java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:156)\n\tat software.amazon.awssdk.utils.IoUtils.toByteArray(IoUtils.java:49)\n\tat software.amazon.awssdk.core.sync.ResponseTransformer.lambda$toBytes$3(ResponseTransformer.java:175)\n\tat software.amazon.awssdk.core.sync.ResponseTransformer$$Lambda$1517/0x0000000101087040.transform(Unknown Source)\n\tat software.amazon.awssdk.core.client.handler.BaseSyncClientHandler$HttpResponseHandlerAdapter.transformResponse(BaseSyncClientHandler.java:154)\n\tat software.amazon.awssdk.core.client.handler.BaseSyncClientHandler$HttpResponseHandlerAdapter.handle(BaseSyncClientHandler.java:142)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.handleSuccessResponse(HandleResponseStage.java:89)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.handleResponse(HandleResponseStage.java:70)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:58)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:41)\n\tat software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:64)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:36)\n\tat 
software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:77)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:39)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage$RetryExecutor.doExecute(RetryableStage.java:113)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage$RetryExecutor.execute(RetryableStage.java:86)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:62)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:42)\n\tat software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)\n\tat software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:57)\n\tat software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:37)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)\n\tat software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)\n\tat software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)\n\tat 
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)\n"
}

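I think what is happening is that the OOM exception causes the pod to shut down, and then the same exception is thrown again while it tries to shut down. So I tried to kill the pod by adding -XX:OnOutOfMemoryError="kill -9 %p", but it did not help.

On a slightly different note, the pod memory limit is 2Gi. However, the pod hits the OOM exception at around 700Mi, so I do not think it is actually running out of memory; the pod simply throws the exception before it even tries to grow its memory: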

    resources:
      limits:
        cpu: "1"
        memory: 2Gi
      requests:
        cpu: 10m
        memory: 128Mi
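
To compare the 700Mi figure against what the JVM believes its ceiling is, the expected maximum heap can be estimated from the container limit and -XX:MaxRAMPercentage, then compared with the value the JVM reports. This is a rough sketch, assuming container support is active (the default on Linux for JDK 11) so that the 2Gi limit is what the JVM sees as available memory. The class HeapCheck and the method expectedMaxHeapBytes are hypothetical names, and the value reported by Runtime.maxMemory() may differ somewhat from the estimate because of alignment and collector-specific accounting.

```java
// Hypothetical diagnostic helper, not part of the original application.
class HeapCheck {

    // Approximate max heap the JVM would choose for a given memory limit
    // and -XX:MaxRAMPercentage value.
    static long expectedMaxHeapBytes(long containerLimitBytes, double maxRamPercentage) {
        return (long) (containerLimitBytes * maxRamPercentage / 100.0);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long limit = 2L * 1024L * mib; // 2Gi, the pod limit from the question
        long expected = expectedMaxHeapBytes(limit, 80.0);

        // Max heap the running JVM reports for itself.
        long actual = Runtime.getRuntime().maxMemory();

        System.out.printf("expected max heap: about %d MiB%n", expected / mib);
        System.out.printf("reported max heap: %d MiB%n", actual / mib);
    }
}
```

With a 2Gi limit and 80 percent, the estimate is about 1.6Gi. If the value reported inside the pod is far lower, the percentage flags or the memory limit are probably not being applied the way the configuration suggests.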

I also tested -XX:+CrashOnOutOfMemoryError, but it did not help in my case, and the pod keeps throwing OOM while trying to stop the container.

Tags: java, kubernetes, jvm, out-of-memory

Answer


Minikube may not have enough memory.

You can check the memory setting with minikube start -h

Then run minikube stop && minikube start --cpus 4 --memory 2048

Update: try setting the heap size when starting the Java application: -Xms1g -Xmx2g

