Troubleshooting connection refused for the fluentd td-agent process

Problem description

On my server I am using fluentd, and I want to send data (stored in multiple .txt files) to another host, host1.

Here is the configuration file, /etc/td-agent/conf.d/dump.conf:

<source>
  @type tail
  path /data/dump/**/*.txt
  pos_file /data/dump/spool/dump.log
  tag data.dump
  format none
</source>

<server>
  name host1
  host 10.yyy.xx.186
  port 24224
  weight 60
</server>

After restarting the td-agent daemon everything starts normally, but no data is sent, and I get a "Connection refused" error in the log:

tail -n 1000 fluentd.log

2021-03-10 11:11:58 +0100 [info]: #0 following tail of /data/dump/dump.agent.dump.logs.apache.error.log/2020/12/08/dump/logs/apache/error.log/202012080101_0.txt
2021-03-10 11:11:58 +0100 [info]: #0 fluentd worker is now running worker=0
2021-03-10 11:11:59 +0100 [warn]: #0 failed to flush the buffer. retry_time=1 next_retry_seconds=2021-03-10 11:12:00 +0100 chunk="5bd28222fe5d47044eab3615df091bd6" error_class=Errno::ECONNREFUSED error="Connection refused - connect(2) for \"10.yyy.xx.186\" port 24224"

This surprises me, because I can ping host1 from my server without any problem:

ping 10.yyy.xx.186
PING 10.yyy.xx.186 (10.yyy.xx.186) 56(84) bytes of data.
64 bytes from 10.yyy.xx.186: icmp_seq=1 ttl=64 time=0.153 ms
64 bytes from 10.yyy.xx.186: icmp_seq=2 ttl=64 time=0.179 ms
64 bytes from 10.yyy.xx.186: icmp_seq=3 ttl=64 time=0.153 ms
^C
--- 10.yyy.xx.186 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2015ms
rtt min/avg/max/mdev = 0.153/0.161/0.179/0.019 ms

My question is: how can I reproduce the "Connection refused" error that appears in the td-agent log file?

Edit

Following @Azeem's comment, I added a debug log_level to the fluentd conf file:

<source>
  @type tail
  path /data/dump/**/*.txt
  pos_file /data/dump/spool/dump.log
  tag data.dump
  format none
  @log_level debug
</source>

<server>
  name host1
  host 10.yyy.xx.186
  port 24224
  weight 60
</server>

Here is the log from the initialization phase up to the point where the error occurs.

  2021-03-11 09:34:49 +0100 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/plugin_helper/socket.rb:59:in `initialize'
  2021-03-11 09:34:49 +0100 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/plugin_helper/socket.rb:59:in `new'
  2021-03-11 09:34:49 +0100 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/plugin_helper/socket.rb:59:in `socket_create_tcp'
  2021-03-11 09:34:49 +0100 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/plugin/out_forward.rb:366:in `create_transfer_socket'
  2021-03-11 09:34:49 +0100 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/plugin/out_forward.rb:678:in `send_data'
  2021-03-11 09:34:49 +0100 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/plugin/out_forward.rb:295:in `block in write'
  2021-03-11 09:34:49 +0100 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/plugin/out_forward.rb:330:in `block in select_a_healthy_node'
  2021-03-11 09:34:49 +0100 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/plugin/out_forward.rb:324:in `times'
  2021-03-11 09:34:49 +0100 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/plugin/out_forward.rb:324:in `select_a_healthy_node'
  2021-03-11 09:34:49 +0100 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/plugin/out_forward.rb:295:in `write'
  2021-03-11 09:34:49 +0100 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/plugin/output.rb:1125:in `try_flush'
  2021-03-11 09:34:49 +0100 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/plugin/output.rb:1425:in `flush_thread_run'
  2021-03-11 09:34:49 +0100 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/plugin/output.rb:454:in `block (2 levels) in start'
  2021-03-11 09:34:49 +0100 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2021-03-11 09:34:49 +0100 [info]: #0 following tail of /data/dump/webmin/miniserv.error
2021-03-11 09:34:49 +0100 [info]: #0 following tail of /data/dump/apache/error.log
2021-03-11 09:34:49 +0100 [info]: #0 following tail of /data/dump/apache/error.2021-03-11.log
2021-03-11 09:34:49 +0100 [info]: #0 following tail of /data/dump/apache/access.log
2021-03-11 09:34:49 +0100 [info]: #0 following tail of /data/dump/apache/access.2021-03-11.log
2021-03-11 09:34:49 +0100 [info]: #0 fluentd worker is now running worker=0
2021-03-11 09:34:50 +0100 [warn]: #0 failed to flush the buffer. retry_time=1 next_retry_seconds=2021-03-11 09:34:51 +0100 chunk="5bd28222fe5d47044eab3615df091bd6" error_class=Errno::ECONNREFUSED error="Connection refused - connect(2) for \"10.yyy.xx.186\" port 24224"
  2021-03-11 09:34:50 +0100 [warn]: #0 suppressed same stacktrace
2021-03-11 09:34:51 +0100 [warn]: #0 failed to flush the buffer. retry_time=2 next_retry_seconds=2021-03-11 09:34:53 +0100 chunk="5bd28222fe5d47044eab3615df091bd6" error_class=Errno::ECONNREFUSED error="Connection refused - connect(2) for \"10.yyy.xx.186\" port 24224"
  2021-03-11 09:34:51 +0100 [warn]: #0 suppressed same stacktrace
2021-03-11 09:34:53 +0100 [warn]: #0 failed to flush the buffer. retry_time=3 next_retry_seconds=2021-03-11 09:34:56 +0100 chunk="5bd28222fe5d47044eab3615df091bd6" error_class=Errno::ECONNREFUSED error="Connection refused - connect(2) for \"10.yyy.xx.186\" port 24224"
  2021-03-11 09:34:53 +0100 [warn]: #0 suppressed same stacktrace
2021-03-11 09:34:56 +0100 [warn]: #0 failed to flush the buffer. retry_time=4 next_retry_seconds=2021-03-11 09:35:04 +0100 chunk="5bd28222fe5d47044eab3615df091bd6" error_class=Errno::ECONNREFUSED error="Connection refused - connect(2) for \"10.yyy.xx.186\" port 24224"
  2021-03-11 09:34:56 +0100 [warn]: #0 suppressed same stacktrace

Tags: ssh, fluentd

Solution
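
A successful ping only shows that host1 answers ICMP; it says nothing about TCP port 24224. Errno::ECONNREFUSED from connect(2) usually means the host was reached but nothing accepted the connection on that port (for example, no process listening there, or a firewall actively rejecting it). One way to reproduce the error outside td-agent is to attempt the same plain TCP connect by hand. The following is a minimal sketch, assuming Python 3 is available on the sending server; `probe_tcp` is a hypothetical helper name chosen for illustration and is not part of fluentd.

```python
import errno
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a plain TCP connect and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <errno name>".
    """
    try:
        # Performs a TCP connect to (host, port), much like the
        # connect(2) call reported in the td-agent log.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer (or a firewall on the path) rejected the connection.
        return "refused"
    except socket.timeout:
        # No answer at all, e.g. packets silently dropped.
        return "timeout"
    except OSError as exc:
        # Anything else: unreachable network, name resolution failure, ...
        return f"error: {errno.errorcode.get(exc.errno, exc.errno)}"
```

Calling `probe_tcp` with the address of host1 and port 24224 from the sending server should return "refused" while the problem persists. If it does, the next thing to check is probably on host1 itself: whether any process is listening on port 24224 and whether a firewall rule rejects the connection.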

