Kafka pod fails to start on GKE

Question

I followed this tutorial, but when I try to run it on GKE, the kafka pod will not start.

It keeps going into CrashLoopBackOff, and I don't know how to display the pod's error logs.

This is the output of kubectl describe pod my-pod-xxx:

Name:           kafka-broker1-54cb95fb44-hlj5b
Namespace:      default
Node:           gke-xxx-default-pool-f9e313ed-zgcx/10.146.0.4
Start Time:     Thu, 25 Oct 2018 11:40:21 +0900
Labels:         app=kafka
                id=1
                pod-template-hash=1076519600
Annotations:    kubernetes.io/limit-ranger=LimitRanger plugin set: cpu request for container kafka
Status:         Running
IP:             10.48.8.10
Controlled By:  ReplicaSet/kafka-broker1-54cb95fb44
Containers:
  kafka:
    Container ID:   docker://88ee6a1df4157732fc32b7bd8a81e329dbdxxxx9cbe614689e775d183dbcd61
    Image:          wurstmeister/kafka
    Image ID:       docker-pullable://wurstmeister/kafka@sha256:4f600a95fa1288f7b1xxxxxa32ca00b4fb13b83b31533fa6b40499bd9bdf192f
    Port:           9092/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Thu, 25 Oct 2018 14:35:32 +0900
      Finished:     Thu, 25 Oct 2018 14:35:51 +0900
    Ready:          False
    Restart Count:  37
    Requests:
      cpu:  100m
    Environment:
      KAFKA_ADVERTISED_PORT:       9092
      KAFKA_ADVERTISED_HOST_NAME:  35.194.100.32
      KAFKA_ZOOKEEPER_CONNECT:     zoo1:2181
      KAFKA_BROKER_ID:             1
      KAFKA_CREATE_TOPICS:         topic1:3:3
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-w6s7n (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  default-token-w6s7n:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-w6s7n
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason   Age                From                                                   Message
  ----     ------   ----               ----                                                   -------
  Warning  BackOff  5m (x716 over 2h)  kubelet, gke-xxx-default-pool-f9e313ed-zgcx  Back-off restarting failed container
  Normal   Pulling  36s (x38 over 2h)  kubelet, gke-xxxdefault-pool-f9e313ed-zgcx  pulling image "wurstmeister/kafka"

I noticed that the pod runs fine on the first start, but after that the Node changes its status to NotReady and the kafka pod goes into the CrashLoopBackOff state.

These are the events logged before it went down:

  Type    Reason                  Age  From                                         Message
  ----    ------                  ---- ----                                         -------
  Normal  Scheduled               5m   default-scheduler                            Successfully assigned kafka-broker1-54cb95fb44-wwf2h to gke-xxx-default-pool-f9e313ed-8mr6
  Normal  SuccessfulMountVolume   5m   kubelet, gke-xxx-default-pool-f9e313ed-8mr6  MountVolume.SetUp succeeded for volume "default-token-w6s7n"
  Normal  Pulling                 5m   kubelet, gke-xxx-default-pool-f9e313ed-8mr6  pulling image "wurstmeister/kafka"
  Normal  Pulled                  5m   kubelet, gke-xxx-default-pool-f9e313ed-8mr6  Successfully pulled image "wurstmeister/kafka"
  Normal  Created                 5m   kubelet, gke-xxx-default-pool-f9e313ed-8mr6  Created container
  Normal  Started                 5m   kubelet, gke-xxx-default-pool-f9e313ed-8mr6  Started container
  Normal  NodeControllerEviction  38s  node-controller                              Marking for deletion Pod kafka-broker1-54cb95fb44-wwf2h from Node gke-dev-centurion-default-pool-f9e313ed-8mr6

Can anyone tell me what is wrong with my pod, and how I can capture the error that makes it fail?

Tags: kubernetes, apache-kafka, google-kubernetes-engine

Solution


I just found out that the nodes in my cluster did not have enough resources. After I created a new cluster with more memory, it worked.
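
For anyone hitting the same symptom, here is a rough sketch of how the failure could be inspected. The helper names exit_code_signal and inspect_crash are made up for illustration and are not part of kubectl. The kubectl subcommands themselves (logs --previous, describe node, get events) are standard. Exit codes above 128 conventionally mean 128 plus a signal number, so the Exit Code 137 in the describe output corresponds to signal 9 (SIGKILL). That is the signal the kernel OOM killer sends, which fits a node that is short on memory, but SIGKILL can come from other sources too, so treat it as a hint rather than proof.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers, not kubectl features.

# Print the signal number implied by a container exit code.
# By shell convention, an exit code above 128 means "killed by signal (code - 128)".
exit_code_signal() {
    code=$1
    if [ "$code" -gt 128 ]; then
        echo $((code - 128))
    else
        echo 0   # 0 here means "not terminated by a signal"
    fi
}

# Collect the usual evidence for a crash-looping pod.
# Usage: inspect_crash <pod-name> <node-name>
inspect_crash() {
    pod=$1
    node=$2
    # Logs of the previous (crashed) container instance, not the restarted one.
    kubectl logs "$pod" --previous
    # Node conditions (Ready, MemoryPressure) and the resources already allocated.
    kubectl describe node "$node"
    # Recent events in chronological order.
    kubectl get events --sort-by=.metadata.creationTimestamp
}

# Exit Code 137 from the describe output in the question:
exit_code_signal 137   # prints 9, i.e. SIGKILL
```

With the names from the question, the call would look like inspect_crash kafka-broker1-54cb95fb44-hlj5b gke-xxx-default-pool-f9e313ed-zgcx. The describe output also shows that the container only has a CPU request and no memory request, so the scheduler had no way of knowing how much memory Kafka would need on that node.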

