Consul server cannot join the Consul cluster

Problem description

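Before I explain my problem, I should point out that I don't have much experience with Consul, so please bear with me :D I need your help figuring out what is wrong with the Consul deployment in my Azure AKS cluster. The infrastructure I have looks like this: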

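Everything was running fine, but suddenly the pods started dying one after another. I redeployed the Consul running in AKS, and now only two of my three Consul servers are running. The third server stays in the Running state for about 30 seconds, then gets OOM-killed and goes into the CrashLoopBackOff state. When I run the command consul members, all servers and clients are listed; the problematic pod shows as "left", while the other pods show as "alive". I also tried running the command consul join {ip address}, but that gave me the following error message: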

/ # consul join 10.0.0.135
Error joining address '10.0.0.135': Unexpected response code: 500 (1 error occurred:
    * Failed to join 10.0.0.135: dial tcp 10.0.0.135:8301: connect: connection refused

)
Failed to join any nodes.
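
A "connection refused" on port 8301 usually means the host was reachable but nothing was accepting TCP connections on that port at that moment, which would be consistent with the agent on that node being down between restarts. As a minimal sketch (not part of the original post), the same check can be made outside of Consul. The function name tcp_port_open is hypothetical; the address and port are the ones from the error above.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (nothing listening on the port),
        # timeouts, and unreachable hosts.
        return False
```

Calling tcp_port_open("10.0.0.135", 8301) from another node or pod in the same network would show whether the Serf LAN port is accepting connections. Note that this only tests TCP; Serf gossip also uses UDP on the same port, which this sketch does not cover.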

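I have attached the YAML file of my Consul StatefulSet and the error log from the problematic pod.

I should point out that I have had this infrastructure for about 2 months, and everything looked fine, with all pods running healthy. I have been dealing with this issue for the past 3 days, searching the internet for a way to fix it, without success.

Can you help me figure out why this suddenly started happening, and ultimately help me fix it?

Thanks in advance for your time,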

Mike


YAML file

kind: StatefulSet
apiVersion: apps/v1
metadata:
  name: consul-consul-server
  namespace: consul
  selfLink: /apis/apps/v1/namespaces/consul/statefulsets/consul-consul-server
  uid: ddfb4383-8545-457d-8c3a-5dc7ec04f9f2
  resourceVersion: '19294440'
  generation: 11
  creationTimestamp: '2020-10-26T10:19:25Z'
  labels:
    app: consul
    app.kubernetes.io/managed-by: Helm
    chart: consul-helm
    component: server
    heritage: Helm
    release: consul
  annotations:
    meta.helm.sh/release-name: consul
    meta.helm.sh/release-namespace: consul
spec:
  replicas: 3
  selector:
    matchLabels:
      app: consul
      chart: consul-helm
      component: server
      hasDNS: 'true'
      release: consul
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: consul
        chart: consul-helm
        component: server
        hasDNS: 'true'
        release: consul
      annotations:
        consul.hashicorp.com/config-checksum: ca3d163bab055381827226140568f3bef7eaac187cebd76878e0b63e9e442356
        consul.hashicorp.com/connect-inject: 'false'
    spec:
      volumes:
        - name: config
          configMap:
            name: consul-consul-server-config
            defaultMode: 420
      containers:
        - name: consul
          image: 'consul:1.8.4'
          command:
            - /bin/sh
            - '-ec'
            - |
              CONSUL_FULLNAME="consul-consul"

              exec /bin/consul agent \
                -advertise="${HOST_IP}" \
                -bind=0.0.0.0 \
                -bootstrap-expect=3 \
                -client=0.0.0.0 \
                -config-dir=/consul/config \
                -datacenter=dc1 \
                -data-dir=/consul/data \
                -domain=consul \
                -hcl="connect { enabled = true }" \
                -ui \
                -retry-join=${CONSUL_FULLNAME}-server-0.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
                -retry-join=${CONSUL_FULLNAME}-server-1.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
                -retry-join=${CONSUL_FULLNAME}-server-2.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
                -server
          ports:
            - name: http
              hostPort: 8500
              containerPort: 8500
              protocol: TCP
            - name: serflan
              hostPort: 8301
              containerPort: 8301
              protocol: TCP
            - name: serfwan
              hostPort: 8302
              containerPort: 8302
              protocol: TCP
            - name: server
              hostPort: 8300
              containerPort: 8300
              protocol: TCP
            - name: dns-tcp
              hostPort: 8600
              containerPort: 8600
              protocol: TCP
            - name: dns-udp
              hostPort: 8600
              containerPort: 8600
              protocol: UDP
          env:
            - name: POD_IP
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: status.podIP
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: status.hostIP
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
          resources:
            limits:
              cpu: 800m
              memory: 800Mi
            requests:
              cpu: 800m
              memory: 800Mi
          volumeMounts:
            - name: data-consul
              mountPath: /consul/data
            - name: config
              mountPath: /consul/config
          readinessProbe:
            exec:
              command:
                - /bin/sh
                - '-ec'
                - |
                  curl http://127.0.0.1:8500/v1/status/leader \
                  2>/dev/null | grep -E '".+"'
            initialDelaySeconds: 5
            timeoutSeconds: 5
            periodSeconds: 3
            successThreshold: 1
            failureThreshold: 2
          lifecycle:
            preStop:
              exec:
                command:
                  - /bin/sh
                  - '-c'
                  - consul leave
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      serviceAccountName: consul-consul-server
      serviceAccount: consul-consul-server
      hostNetwork: true
      securityContext:
        fsGroup: 1000
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: consul
                  component: server
                  release: consul
              topologyKey: kubernetes.io/hostname
      schedulerName: default-scheduler
  volumeClaimTemplates:
    - kind: PersistentVolumeClaim
      apiVersion: v1
      metadata:
        name: data-consul
        creationTimestamp: null
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
        volumeMode: Filesystem
      status:
        phase: Pending
  serviceName: consul-consul-server
  podManagementPolicy: Parallel
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 0
  revisionHistoryLimit: 10
status:
  observedGeneration: 11
  replicas: 3
  readyReplicas: 2
  currentReplicas: 3
  updatedReplicas: 3
  currentRevision: consul-consul-server-5f84b7b657
  updateRevision: consul-consul-server-5f84b7b657
  collisionCount: 0
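
One detail in this manifest may be worth checking: the pod template sets hostNetwork: true together with dnsPolicy: ClusterFirst. According to the Kubernetes documentation, a pod running with hostNetwork and ClusterFirst falls back to the behavior of the Default policy (the node's resolver), and ClusterFirstWithHostNet has to be set explicitly to use cluster DNS. That might explain why the agent log shows the *.svc names being looked up on 168.63.129.16:53, but it is only a possible factor and not a confirmed cause of the OOM kills. The following sketch flags the combination; the function name and the dict layout are assumptions made for illustration.

```python
from typing import Optional


def host_network_dns_warning(pod_spec: dict) -> Optional[str]:
    """Return a warning if a hostNetwork pod uses dnsPolicy ClusterFirst."""
    uses_host_network = pod_spec.get("hostNetwork", False)
    # Kubernetes defaults dnsPolicy to ClusterFirst when the field is absent.
    dns_policy = pod_spec.get("dnsPolicy", "ClusterFirst")
    if uses_host_network and dns_policy == "ClusterFirst":
        return (
            "hostNetwork is true with dnsPolicy ClusterFirst: the pod falls "
            "back to the node's DNS; consider ClusterFirstWithHostNet"
        )
    return None


# The two relevant fields, copied from spec.template.spec in the manifest above.
spec_from_manifest = {"hostNetwork": True, "dnsPolicy": "ClusterFirst"}
print(host_network_dns_warning(spec_from_manifest))
```

With the values from the manifest the function returns the warning text; with dnsPolicy set to ClusterFirstWithHostNet it returns None.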

Error log

==> Starting Consul agent...
           Version: '1.8.4'
           Node ID: '017e63cc-bedf-c4c3-2a3c-0dfc0c05594a'
         Node name: 'aks-nodepool1-12257257-vmss000002'
        Datacenter: 'dc1' (Segment: '<all>')
            Server: true (Bootstrap: false)
       Client Addr: [0.0.0.0] (HTTP: 8500, HTTPS: -1, gRPC: -1, DNS: 8600)
      Cluster Addr: 10.0.0.153 (LAN: 8301, WAN: 8302)
           Encrypt: Gossip: false, TLS-Outgoing: false, TLS-Incoming: false, Auto-Encrypt-TLS: false

==> Log data will now stream in as it occurs:

    2020-12-19T09:27:59.021Z [WARN]  agent: bootstrap_expect > 0: expecting 3 servers
    2020-12-19T09:27:59.044Z [WARN]  agent.auto_config: bootstrap_expect > 0: expecting 3 servers
    2020-12-19T09:27:59.075Z [WARN]  agent.server.snapshot: found temporary snapshot: name=1164-491656-1605996371085.tmp
    2020-12-19T09:27:59.075Z [WARN]  agent.server.snapshot: found temporary snapshot: name=1796-1455900-1608237651961.tmp
    2020-12-19T09:27:59.084Z [WARN]  agent.server.snapshot: found temporary snapshot: name=7621-1555326-1608364051682.tmp
    2020-12-19T09:28:05.099Z [INFO]  agent.server.raft: restored from snapshot: id=7621-1538935-1608350048113
    2020-12-19T09:28:33.735Z [INFO]  agent.server.raft: initial configuration: index=1560707 servers="[{Suffrage:Voter ID:8bdce7bb-464f-19e6-7a36-c165917790a4 Address:10.0.0.173:8300} {Suffrage:Voter ID:804735ae-e812-a843-96a1-7140a17909b6 Address:10.0.0.143:8300}]"
    2020-12-19T09:28:33.735Z [INFO]  agent.server.raft: entering follower state: follower="Node at 10.0.0.153:8300 [Follower]" leader=
    2020-12-19T09:28:33.749Z [INFO]  agent.server.serf.wan: serf: EventMemberJoin: aks-nodepool1-12257257-vmss000002.dc1 10.0.0.153
    2020-12-19T09:28:33.749Z [INFO]  agent.server.serf.wan: serf: Attempting re-join to previously known node: aks-nodepool1-12257257-vmss000000.dc1: 10.0.0.173:8302
    2020-12-19T09:28:33.752Z [INFO]  agent.server.serf.wan: serf: EventMemberJoin: aks-nodepool1-12257257-vmss000000.dc1 10.0.0.173
    2020-12-19T09:28:33.752Z [INFO]  agent.server.serf.wan: serf: EventMemberJoin: aks-nodepool1-12257257-vmss000001.dc1 10.0.0.143
    2020-12-19T09:28:33.752Z [INFO]  agent.server.serf.wan: serf: Re-joined to previously known node: aks-nodepool1-12257257-vmss000000.dc1: 10.0.0.173:8302
    2020-12-19T09:28:33.764Z [INFO]  agent.server.serf.lan: serf: EventMemberJoin: aks-nodepool1-12257257-vmss000002 10.0.0.153
    2020-12-19T09:28:33.764Z [INFO]  agent.router: Initializing LAN area manager
    2020-12-19T09:28:33.764Z [INFO]  agent.server.serf.lan: serf: Attempting re-join to previously known node: mapserver-failover: 10.0.0.54:8301
    2020-12-19T09:28:33.764Z [INFO]  agent.server: Handled event for server in area: event=member-join server=aks-nodepool1-12257257-vmss000002.dc1 area=wan
    2020-12-19T09:28:33.764Z [INFO]  agent.server: Handled event for server in area: event=member-join server=aks-nodepool1-12257257-vmss000000.dc1 area=wan
    2020-12-19T09:28:33.764Z [INFO]  agent.server: Handled event for server in area: event=member-join server=aks-nodepool1-12257257-vmss000001.dc1 area=wan
    2020-12-19T09:28:33.764Z [INFO]  agent.server: Adding LAN server: server="aks-nodepool1-12257257-vmss000002 (Addr: tcp/10.0.0.153:8300) (DC: dc1)"
    2020-12-19T09:28:33.764Z [INFO]  agent.server: Raft data found, disabling bootstrap mode
    2020-12-19T09:28:33.769Z [INFO]  agent.server.serf.lan: serf: EventMemberJoin: image-failover 10.0.0.57
    2020-12-19T09:28:33.769Z [INFO]  agent.server.serf.lan: serf: EventMemberJoin: aks-nodepool1-12257257-vmss000000 10.0.0.173
    2020-12-19T09:28:33.769Z [INFO]  agent.server.serf.lan: serf: EventMemberJoin: mapserver 10.0.0.53
    2020-12-19T09:28:33.769Z [INFO]  agent.server: Adding LAN server: server="aks-nodepool1-12257257-vmss000000 (Addr: tcp/10.0.0.173:8300) (DC: dc1)"
    2020-12-19T09:28:33.769Z [INFO]  agent.server.serf.lan: serf: EventMemberJoin: image 10.0.0.56
    2020-12-19T09:28:33.770Z [INFO]  agent.server.serf.lan: serf: EventMemberJoin: mapserver-failover 10.0.0.54
    2020-12-19T09:28:33.770Z [INFO]  agent.server.serf.lan: serf: EventMemberJoin: aks-nodepool1-12257257-vmss000001 10.0.0.143
    2020-12-19T09:28:33.770Z [INFO]  agent.server.serf.lan: serf: EventMemberJoin: web-server-01 10.0.0.36
    2020-12-19T09:28:33.770Z [INFO]  agent.server.serf.lan: serf: EventMemberJoin: web-server-02 10.0.0.37
    2020-12-19T09:28:33.770Z [INFO]  agent.server: Adding LAN server: server="aks-nodepool1-12257257-vmss000001 (Addr: tcp/10.0.0.143:8300) (DC: dc1)"
    2020-12-19T09:28:33.770Z [INFO]  agent.server.serf.lan: serf: Re-joined to previously known node: mapserver-failover: 10.0.0.54:8301
    2020-12-19T09:28:33.778Z [INFO]  agent: Started DNS server: address=0.0.0.0:8600 network=tcp
    2020-12-19T09:28:33.778Z [INFO]  agent: Started DNS server: address=0.0.0.0:8600 network=udp
    2020-12-19T09:28:33.778Z [INFO]  agent: Started HTTP server: address=[::]:8500 network=tcp
    2020-12-19T09:28:33.778Z [INFO]  agent: started state syncer
==> Consul agent running!
    2020-12-19T09:28:33.779Z [INFO]  agent: Retry join is supported for the following discovery methods: cluster=LAN discovery_methods="aliyun aws azure digitalocean gce k8s linode mdns os packet scaleway softlayer tencentcloud triton vsphere"
    2020-12-19T09:28:33.779Z [INFO]  agent: Joining cluster...: cluster=LAN
    2020-12-19T09:28:33.779Z [INFO]  agent: (LAN) joining: lan_addresses=[consul-consul-server-0.consul-consul-server.consul.svc, consul-consul-server-1.consul-consul-server.consul.svc, consul-consul-server-2.consul-consul-server.consul.svc]
    2020-12-19T09:28:33.932Z [WARN]  agent.server.memberlist.lan: memberlist: Failed to resolve consul-consul-server-0.consul-consul-server.consul.svc: lookup consul-consul-server-0.consul-consul-server.consul.svc on 168.63.129.16:53: no such host
    2020-12-19T09:28:33.972Z [WARN]  agent.server.raft: failed to get previous log: previous-index=1561367 last-index=1561020 error="log not found"
    2020-12-19T09:28:34.101Z [INFO]  agent: Synced node info
    2020-12-19T09:28:34.114Z [INFO]  agent: Synced service: service=app_webserver_1
    2020-12-19T09:28:34.126Z [INFO]  agent: Synced service: service=administration_webserver_1
    2020-12-19T09:28:34.184Z [WARN]  agent.server.memberlist.lan: memberlist: Failed to resolve consul-consul-server-1.consul-consul-server.consul.svc: lookup consul-consul-server-1.consul-consul-server.consul.svc on 168.63.129.16:53: no such host
    2020-12-19T09:28:34.184Z [INFO]  agent: Synced service: service=app_webserver_2
    2020-12-19T09:28:34.378Z [WARN]  agent.server.memberlist.lan: memberlist: Failed to resolve consul-consul-server-2.consul-consul-server.consul.svc: lookup consul-consul-server-2.consul-consul-server.consul.svc on 168.63.129.16:53: no such host
    2020-12-19T09:28:34.378Z [WARN]  agent: (LAN) couldn't join: number_of_nodes=0 error="3 errors occurred:
    * Failed to resolve consul-consul-server-0.consul-consul-server.consul.svc: lookup consul-consul-server-0.consul-consul-server.consul.svc on 168.63.129.16:53: no such host
    * Failed to resolve consul-consul-server-1.consul-consul-server.consul.svc: lookup consul-consul-server-1.consul-consul-server.consul.svc on 168.63.129.16:53: no such host
    * Failed to resolve consul-consul-server-2.consul-consul-server.consul.svc: lookup consul-consul-server-2.consul-consul-server.consul.svc on 168.63.129.16:53: no such host

"
     
    2020-12-19T09:30:04.891Z [WARN]  agent.server.memberlist.lan: memberlist: Refuting a suspect message (from: aks-nodepool1-12257257-vmss000002)
    2020-12-19T09:30:05.287Z [WARN]  agent: Join cluster failed, will retry: cluster=LAN retry_interval=30s error=<nil>
    2020-12-19T09:30:05.927Z [INFO]  agent.server.memberlist.lan: memberlist: Suspect web-server-02 has failed, no acks received
    2020-12-19T09:30:06.786Z [INFO]  agent.server.serf.lan: serf: EventMemberJoin: web-server-01 10.0.0.36
    2020-12-19T09:30:07.390Z [WARN]  agent: Check is now critical: check=service:administration_webserver_1
    2020-12-19T09:30:08.039Z [WARN]  agent: Check is now critical: check=service:app_webserver_2
    2020-12-19T09:30:08.427Z [WARN]  agent: Check is now critical: check=service:app_webserver_1
    2020-12-19T09:30:14.487Z [WARN]  agent: Check is now critical: check=service:administration_webserver_1
    2020-12-19T09:30:15.975Z [WARN]  agent: Check is now critical: check=service:app_webserver_1
    2020-12-19T09:30:15.976Z [WARN]  agent: Check is now critical: check=service:app_webserver_2
    2020-12-19T09:30:16.134Z [WARN]  agent.server.memberlist.lan: memberlist: Was able to connect to aks-nodepool1-12257257-vmss000000 but other probes failed, network may be misconfigured
    2020-12-19T09:30:21.533Z [WARN]  agent: Check is now critical: check=service:administration_webserver_1
    2020-12-19T09:30:23.290Z [WARN]  agent: Check is now critical: check=service:app_webserver_1
    2020-12-19T09:30:23.426Z [INFO]  agent.server.memberlist.lan: memberlist: Suspect image-failover has failed, no acks received
    2020-12-19T09:30:23.581Z [WARN]  agent: Check is now critical: check=service:app_webserver_2
    2020-12-19T09:30:27.987Z [WARN]  agent: Check is now critical: check=service:administration_webserver_1
    2020-12-19T09:30:29.988Z [WARN]  agent: Check is now critical: check=service:app_webserver_1
    2020-12-19T09:30:30.094Z [WARN]  agent: Check is now critical: check=service:app_webserver_2
    2020-12-19T09:30:32.472Z [WARN]  agent.server.memberlist.lan: memberlist: Was able to connect to image but other probes failed, network may be misconfigured
    2020-12-19T09:30:32.534Z [INFO]  agent.server.memberlist.lan: memberlist: Marking web-server-02 as failed, suspect timeout reached (0 peer confirmations)
    2020-12-19T09:30:32.675Z [INFO]  agent.server.serf.lan: serf: EventMemberFailed: web-server-02 10.0.0.37
    2020-12-19T09:30:34.832Z [WARN]  agent: Check is now critical: check=service:administration_webserver_1
    2020-12-19T09:30:35.542Z [INFO]  agent: (LAN) joining: lan_addresses=[consul-consul-server-0.consul-consul-server.consul.svc, consul-consul-server-1.consul-consul-server.consul.svc, consul-consul-server-2.consul-consul-server.consul.svc]
    2020-12-19T09:30:36.929Z [WARN]  agent: Check is now critical: check=service:app_webserver_2
    2020-12-19T09:30:37.836Z [WARN]  agent: Check is now critical: check=service:app_webserver_1
    2020-12-19T09:30:40.096Z [WARN]  agent.server.memberlist.lan: memberlist: Was able to connect to mapserver-failover but other probes failed, network may be misconfigured
    2020-12-19T09:30:43.635Z [WARN]  agent: Check is now critical: check=service:app_webserver_2
    2020-12-19T09:30:44.534Z [INFO]  agent.server.serf.lan: serf: EventMemberJoin: web-server-02 10.0.0.37
    2020-12-19T09:30:45.086Z [INFO]  agent.server.memberlist.wan: memberlist: Suspect aks-nodepool1-12257257-vmss000000.dc1 has failed, no acks received
    2020-12-19T09:30:45.836Z [WARN]  agent: Check is now critical: check=service:app_webserver_1
    2020-12-19T09:30:47.386Z [WARN]  agent.server.memberlist.lan: memberlist: Refuting a suspect message (from: aks-nodepool1-12257257-vmss000002)
    2020-12-19T09:30:49.724Z [WARN]  agent.server.memberlist.lan: memberlist: Failed to resolve consul-consul-server-0.consul-consul-server.consul.svc: lookup consul-consul-server-0.consul-consul-server.consul.svc on 168.63.129.16:53: no such host
    2020-12-19T09:30:50.539Z [WARN]  agent: Check is now critical: check=service:administration_webserver_1
    2020-12-19T09:30:51.040Z [WARN]  agent: Check is now critical: check=service:app_webserver_2
    2020-12-19T09:30:51.334Z [INFO]  agent.server.memberlist.lan: memberlist: Suspect image-failover has failed, no acks received
    2020-12-19T09:30:53.929Z [WARN]  agent: Check is now critical: check=service:app_webserver_1
    2020-12-19T09:30:54.933Z [WARN]  agent.server.memberlist.wan: memberlist: Refuting a suspect message (from: aks-nodepool1-12257257-vmss000000.dc1)
    2020-12-19T09:30:55.723Z [WARN]  agent.server.memberlist.lan: memberlist: Failed to resolve consul-consul-server-1.consul-consul-server.consul.svc: lookup consul-consul-server-1.consul-consul-server.consul.svc on 168.63.129.16:53: no such host
    2020-12-19T09:30:58.039Z [WARN]  agent: Check is now critical: check=service:administration_webserver_1
    2020-12-19T09:30:58.631Z [WARN]  agent: Check is now critical: check=service:app_webserver_2
    2020-12-19T09:31:01.087Z [WARN]  agent: Check is now critical: check=service:app_webserver_1
    2020-12-19T09:31:02.088Z [WARN]  agent.server.memberlist.lan: memberlist: Failed to resolve consul-consul-server-2.consul-consul-server.consul.svc: lookup consul-consul-server-2.consul-consul-server.consul.svc on 168.63.129.16:53: no such host
    2020-12-19T09:31:02.534Z [WARN]  agent: (LAN) couldn't join: number_of_nodes=0 error="3 errors occurred:
    * Failed to resolve consul-consul-server-0.consul-consul-server.consul.svc: lookup consul-consul-server-0.consul-consul-server.consul.svc on 168.63.129.16:53: no such host
    * Failed to resolve consul-consul-server-1.consul-consul-server.consul.svc: lookup consul-consul-server-1.consul-consul-server.consul.svc on 168.63.129.16:53: no such host
    * Failed to resolve consul-consul-server-2.consul-consul-server.consul.svc: lookup consul-consul-server-2.consul-consul-server.consul.svc on 168.63.129.16:53: no such host
    2020-12-19T09:31:05.834Z [INFO]  agent.server.memberlist.lan: memberlist: Suspect mapserver-failover has failed, no acks received
    2020-12-19T09:31:05.924Z [WARN]  agent: Check is now critical: check=service:app_webserver_2
    2020-12-19T09:31:07.886Z [WARN]  agent: Check is now critical: check=service:app_webserver_1
    2020-12-19T09:31:11.930Z [WARN]  agent: Check is now critical: check=service:administration_webserver_1
    2020-12-19T09:31:12.535Z [WARN]  agent.server.memberlist.wan: memberlist: Was able to connect to aks-nodepool1-12257257-vmss000000.dc1 but other probes failed, network may be misconfigured
    2020-12-19T09:31:12.544Z [WARN]  agent: Check is now critical: check=service:app_webserver_2
    2020-12-19T09:31:14.679Z [WARN]  agent: Check is now critical: check=service:app_webserver_1
    2020-12-19T09:31:15.384Z [WARN]  agent.server.memberlist.lan: memberlist: Was able to connect to web-server-01 but other probes failed, network may be misconfigured
    2020-12-19T09:31:18.586Z [WARN]  agent: Check is now critical: check=service:administration_webserver_1
    2020-12-19T09:31:19.085Z [WARN]  agent: Check is now critical: check=service:app_webserver_2
    2020-12-19T09:31:21.229Z [WARN]  agent: Check is now critical: check=service:app_webserver_1
    2020-12-19T09:31:22.583Z [WARN]  agent.server.memberlist.lan: memberlist: Was able to connect to aks-nodepool1-12257257-vmss000000 but other probes failed, network may be misconfigured
    2020-12-19T09:31:22.723Z [INFO]  agent: Synced check: check=service:administration_webserver_1
    2020-12-19T09:31:23.137Z [INFO]  agent: Synced check: check=service:app_webserver_1
    2020-12-19T09:31:23.285Z [WARN]  agent.server.memberlist.lan: memberlist: Refuting a suspect message (from: mapserver-failover)
    2020-12-19T09:31:23.729Z [INFO]  agent: Synced check: check=service:app_webserver_2
    2020-12-19T09:31:25.229Z [WARN]  agent: Check is now critical: check=service:administration_webserver_1
    2020-12-19T09:31:25.579Z [WARN]  agent: Check is now critical: check=service:app_webserver_2
    2020-12-19T09:31:27.485Z [WARN]  agent: Check is now critical: check=service:app_webserver_1
    2020-12-19T09:31:28.675Z [WARN]  agent.server.memberlist.lan: memberlist: Refuting a suspect message (from: web-server-01)
    2020-12-19T09:31:31.137Z [INFO]  agent: Synced check: check=service:administration_webserver_1
    2020-12-19T09:31:31.532Z [INFO]  agent: Synced check: check=service:app_webserver_2
    2020-12-19T09:31:31.698Z [INFO]  agent.server.fsm: snapshot created: duration=5.743467ms
    2020-12-19T09:31:32.922Z [ERROR] agent.server.raft: failed to flush response: error="write tcp 10.0.0.153:8300->10.0.0.143:35822: write: broken pipe"
    2020-12-19T09:31:32.927Z [WARN]  agent.server.raft: skipping application of old log: index=1561084
    2020-12-19T09:31:33.038Z [ERROR] agent.server.raft: failed to flush response: error="write tcp 10.0.0.153:8300->10.0.0.143:35824: write: broken pipe"
    2020-12-19T09:31:33.091Z [WARN]  agent.server.raft: skipping application of old log: index=1561084
    2020-12-19T09:31:33.171Z [ERROR] agent.server.raft: failed to flush response: error="write tcp 10.0.0.153:8300->10.0.0.143:35948: write: broken pipe"
    2020-12-19T09:31:33.232Z [ERROR] agent.server.raft: failed to take snapshot: error="cannot take snapshot now, wait until the configuration entry at 1560707 has been applied (have applied 1547895)"
    2020-12-19T09:31:33.292Z [WARN]  agent.server.raft: skipping application of old log: index=1561084
    2020-12-19T09:31:33.377Z [WARN]  agent.server.raft: failed to get previous log: previous-index=1561406 last-index=1561084 error="log not found"
    2020-12-19T09:31:33.378Z [ERROR] agent.server.raft: failed to flush response: error="write tcp 10.0.0.153:8300->10.0.0.143:36164: write: broken pipe"
    2020-12-19T09:31:33.626Z [ERROR] agent.server.raft: failed to flush response: error="write tcp 10.0.0.153:8300->10.0.0.143:36064: write: broken pipe"
    2020-12-19T09:31:33.627Z [INFO]  agent: Synced check: check=service:app_webserver_1
    2020-12-19T09:31:33.677Z [WARN]  agent.server.raft: skipping application of old log: index=156108
    2020-12-19T09:31:33.725Z [INFO]  agent: (LAN) joining: lan_addresses=[consul-consul-server-0.consul-consul-server.consul.svc, consul-consul-server-1.consul-consul-server.consul.svc, consul-consul-server-2.consul-consul-server.consul.svc]
    2020-12-19T09:31:33.831Z [ERROR] agent.server.raft: failed to flush response: error="write tcp 10.0.0.153:8300->10.0.0.143:36234: write: broken pipe"
    2020-12-19T09:31:33.833Z [ERROR] agent.server.raft: failed to flush response: error="write tcp 10.0.0.153:8300->10.0.0.143:36340: write: broken pipe"
    2020-12-19T09:31:33.833Z [ERROR] agent.server.raft: failed to flush response: error="write tcp 10.0.0.153:8300->10.0.0.143:36444: write: broken pipe"
    2020-12-19T09:31:33.833Z [ERROR] agent.server.raft: failed to flush response: error="write tcp 10.0.0.153:8300->10.0.0.143:36656: write: broken pipe"
    2020-12-19T09:31:33.833Z [ERROR] agent.server.raft: failed to flush response: error="write tcp 10.0.0.153:8300->10.0.0.143:36440: write: broken pipe"
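
A log of this length is easier to read once it is grouped. The sketch below (an illustration, not something from the original post) counts entries per log level and subsystem and collects the failed DNS lookups together with the resolver that answered them. The regular expressions are assumptions based on the line format visible above; the names summarize, LOG_LINE, and FAILED_LOOKUP are hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
#   2020-12-19T09:28:33.932Z [WARN]  agent.server.memberlist.lan: memberlist: ...
LOG_LINE = re.compile(r"^\s*(\S+Z)\s+\[(\w+)\]\s+([\w.]+):\s+(.*)$")
# Matches "lookup <name> on <resolver>: no such host" inside a message.
FAILED_LOOKUP = re.compile(r"lookup (\S+) on (\S+): no such host")


def summarize(log_text: str):
    """Count (level, subsystem) pairs and failed (name, resolver) lookups."""
    levels = Counter()
    failed_lookups = Counter()
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if not match:
            # Skip banner lines and the "* Failed to resolve" summary lines,
            # so each failed lookup is counted once.
            continue
        _, level, subsystem, message = match.groups()
        levels[(level, subsystem)] += 1
        lookup = FAILED_LOOKUP.search(message)
        if lookup:
            failed_lookups[(lookup.group(1), lookup.group(2))] += 1
    return levels, failed_lookups
```

Run against the log above, the failed lookups would all be keyed to the resolver 168.63.129.16:53, which is the Azure platform DNS address rather than the cluster DNS service, and the ERROR entries would cluster under agent.server.raft.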

Tags: kubernetes, consul
