Filebeat fails to initialize with a 10.96.0.1:443 i/o timeout error

Problem description

In my k8s cluster, filebeat fails to connect after a node restart. The other k8s nodes work normally.

Logs from the filebeat pod:

2020-08-30T03:18:58.770Z    ERROR   kubernetes/util.go:90   kubernetes: Querying for pod failed with error: performing request: Get https://10.96.0.1:443/api/v1/namespaces/monitoring/pods/filebeat-gfg5l: dial tcp 10.96.0.1:443: i/o timeout
2020-08-30T03:18:58.770Z    INFO    kubernetes/watcher.go:180   kubernetes: Performing a resource sync for *v1.PodList
2020-08-30T03:19:28.771Z    ERROR   kubernetes/watcher.go:183   kubernetes: Performing a resource sync err performing request: Get https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout for *v1.PodList
2020-08-30T03:19:28.771Z    INFO    instance/beat.go:357    filebeat stopped.
2020-08-30T03:19:28.771Z    ERROR   instance/beat.go:800    Exiting: error initializing publisher: error initializing processors: performing request: Get https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
Exiting: error initializing publisher: error initializing processors: performing request: Get https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

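The error occurs and the pod restarts repeatedly. I also restarted the node, but that did not help.

The filebeat version is 6.5.2 and it is deployed as a DaemonSet. Is this a known issue?

All pods other than filebeat run on that node without problems.

Update: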

apiVersion: v1
data:
  filebeat.yml: |-
    filebeat.inputs:
    - type: docker
      multiline.pattern: '^[[:space:]]+'
      multiline.negate: false
      multiline.match: after
      symlinks: true
      cri.parse_flags: true
      containers:
        ids: [""]
        path: "/var/log/containers"
    processors:
    - decode_json_fields:
        fields: ["message"]
        process_array: false
        max_depth: 1
        target: message_json
        overwrite_keys: false
        when:
          contains:
            source: "/var/log/containers/app"
    - add_kubernetes_metadata:
        in_cluster: true
        default_matchers.enabled: false
        matchers:
        - logs_path:
            logs_path: /var/log/containers/
    output:
      logstash:
        hosts:
        - logstash:5044
kind: ConfigMap
metadata:
  creationTimestamp: "2020-01-06T09:31:31Z"
  labels:
    k8s-app: filebeat
  name: filebeat-config
  namespace: monitoring
  resourceVersion: "6797684985"
  selfLink: /api/v1/namespaces/monitoring/configmaps/filebeat-config
  uid: 52d86bbb-3067-11ea-89c6-246e96da5c9c

Tags: elasticsearch, kubernetes, logstash, filebeat

Solution


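The add_kubernetes_metadata processor is failing when it queries https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0. As the discussion above concluded, this was resolved by restarting the Beat, which cleared a temporary network interface problem.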
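
Before restarting, it can help to confirm that the failure really is at the network level. The following is a minimal sketch, not part of the original answer: `can_reach` is a hypothetical helper name, and it assumes a Python interpreter is available in some pod scheduled on the affected node. It only tests whether a TCP connection opens; it does not perform TLS or authenticate against the API server.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection raises OSError on timeout, refusal, or no route.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `can_reach("10.96.0.1", 443)` from a pod on the affected node, using the address from the filebeat log above, would be expected to return False while the problem persists. If it returns True and filebeat still times out, the cause probably lies elsewhere.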

