Kubernetes Pod does not terminate immediately; it waits until the grace period expires

Problem description

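I have a Helm chart that contains a Deployment/pod and a Service. I set the Deployment's terminationGracePeriodSeconds to 300s. I don't have any pod lifecycle hooks, so if I terminate the pod, it should terminate immediately. However, the pod now stays in Terminating until my grace period is over!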

Below is the manifest of my pod:

$ kubectl get pod hpa-poc---jcc-7dbbd66d86-xtfc5 -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/psp: eks.privileged
  creationTimestamp: "2021-02-01T18:12:34Z"
  generateName: hpa-poc-jcc-7dbbd66d86-
  labels:
    app.kubernetes.io/instance: hpa-poc
    app.kubernetes.io/name: -
    pod-template-hash: 7dbbd66d86
  name: hpa-poc-jcc-7dbbd66d86-xtfc5
  namespace: default
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: hpa-poc-jcc-7dbbd66d86
    uid: 66db29d8-9e2d-4097-94fc-b0b827466e10
  resourceVersion: "127938945"
  selfLink: /api/v1/namespaces/default/pods/hpa-poc-jcc-7dbbd66d86-xtfc5
  uid: 82ed4134-95de-4093-843b-438e94e408dd
spec:
  containers:
  - env:
    - name: _CONFIG_LINK
      value: xxx
    - name: _USERNAME
      valueFrom:
        secretKeyRef:
          key: username
          name: hpa-jcc-poc
    - name: _PASSWORD
      valueFrom:
        secretKeyRef:
          key: password
          name: hpa-jcc-poc
    image: xxx
    imagePullPolicy: IfNotPresent
    name: -
    resources:
      limits:
        cpu: "2"
        memory: 8Gi
      requests:
        cpu: 500m
        memory: 2Gi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-hzmwh
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: xxx
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 300
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: default-token-hzmwh
    secret:
      defaultMode: 420
      secretName: default-token-hzmwh
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2021-02-01T18:12:34Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2021-02-01T18:12:36Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2021-02-01T18:12:36Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2021-02-01T18:12:34Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://c4c969ec149f43ff4494339930c8f0640d897b461060dd810c63a5d1f17fdc47
    image: xxx
    imageID: xxx
    lastState: {}
    name: -
    ready: true
    restartCount: 0
    state:
      running:
        startedAt: "2021-02-01T18:12:35Z"
  hostIP: 10.0.35.137
  phase: Running
  podIP: 10.0.21.35
  qosClass: Burstable
  startTime: "2021-02-01T18:12:34Z"

When I try to terminate the pod (I used the helm delete command), you can see that it only terminates after 5 minutes, which is the grace period.

$ helm delete hpa-poc
release "hpa-poc" uninstalled
$ kubectl get pod -w | grep hpa
hpa-poc-jcc-7dbbd66d86-xtfc5         1/1     Terminating   0          3h10m
hpa-poc-jcc-7dbbd66d86-xtfc5         0/1     Terminating   0          3h15m
hpa-poc-jcc-7dbbd66d86-xtfc5         0/1     Terminating   0          3h15m

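So I suspect this is a problem with my pod/container configuration, because I have tried deploying another simple Java app, and its pod terminates immediately once I delete it.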

By the way, I am using an AWS EKS cluster. I am not sure whether this is AWS-specific.

Any suggestions?

Tags: kubernetes, kubernetes-helm, kubernetes-pod, amazon-eks

Solution


I found the problem. When I exec'd into the container, I noticed that a process was still running: a log-tailing process.

So I needed to kill that process, and I added the kill command to a preStop hook. After that, my container shuts down immediately.
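
For context: when a pod is deleted, the kubelet first runs the container's preStop hook (if any), then sends SIGTERM to the container's main process, and only sends SIGKILL once terminationGracePeriodSeconds has elapsed. A container whose processes never exit on SIGTERM therefore sits in Terminating for the full 300 seconds. The fragment below is a minimal sketch of such a hook, not the exact configuration used here. It assumes the lingering process is named tail, that /bin/sh and pkill are available in the image, and that stopping tail lets the container's main process exit. The container name app is hypothetical.

```yaml
# Pod template fragment (under spec.template.spec of the Deployment).
spec:
  terminationGracePeriodSeconds: 300
  containers:
  - name: app                # hypothetical container name
    image: xxx
    lifecycle:
      preStop:
        exec:
          # Assumption: the process keeping the container alive is `tail`
          # and `pkill` exists in the image. `|| true` keeps the hook from
          # failing when no such process is running.
          command: ["/bin/sh", "-c", "pkill -TERM tail || true"]
```

If the hook does what is intended, kubectl get pod -w should show the pod leaving Terminating within a few seconds rather than after the full grace period.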

