Limiting the number of identical logs with Fluentd

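Problem description

Use case: set a maximum number of messages to be sent to the destination service within a given time window.

Example: we collect logs from service X, which emits logs like these: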

{"@timestamp":"2020-10-30T13:00:00.310Z","level":"INFO","message":"This is some event"}
{"@timestamp":"2020-10-30T13:00:00.315Z","level":"WARN","message":"This is warn abc123"}
{"@timestamp":"2020-10-30T13:00:00.325Z","level":"WARN","message":"This is warn abc123"}
{"@timestamp":"2020-10-30T13:00:00.327Z","level":"WARN","message":"This is warn abc123"}
{"@timestamp":"2020-10-30T13:00:00.335Z","level":"WARN","message":"This is warn xyz123"}

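As you can see, the service logged the same warning (abc123) three times within 12 milliseconds. I want to send only one of them.

So Fluentd should forward these to the destination service: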

{"@timestamp":"2020-10-30T13:00:00.310Z","level":"INFO","message":"This is some event"}
{"@timestamp":"2020-10-30T13:00:00.315Z","level":"WARN","message":"This is warn abc123"}
{"@timestamp":"2020-10-30T13:00:00.335Z","level":"WARN","message":"This is warn xyz123"}

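It doesn't matter to me which of the timestamps is used, or whether a counter is kept.

Is there a filter or plugin for this use case? Something where I can set a regex rule for the message (to decide whether several messages should be treated as equal), a time window, and so on?

Tags: fluentd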

Solution


In Fluentd you can try the throttle plugin, https://github.com/rubrikinc/fluent-plugin-throttle, using the message key as the group_key (I'm not sure how well it performs in that case).
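To make the idea of throttling per group key concrete, here is a minimal Python sketch of a fixed-window counter keyed on the message field. It is not the plugin's actual implementation, which may use a different algorithm; the names `throttle`, `period_s` and `limit` are hypothetical and chosen only for illustration.

```python
import json
from datetime import datetime, timezone

# Sample records taken from the question.
RECORDS = [json.loads(line) for line in [
    '{"@timestamp":"2020-10-30T13:00:00.310Z","level":"INFO","message":"This is some event"}',
    '{"@timestamp":"2020-10-30T13:00:00.315Z","level":"WARN","message":"This is warn abc123"}',
    '{"@timestamp":"2020-10-30T13:00:00.325Z","level":"WARN","message":"This is warn abc123"}',
    '{"@timestamp":"2020-10-30T13:00:00.327Z","level":"WARN","message":"This is warn abc123"}',
    '{"@timestamp":"2020-10-30T13:00:00.335Z","level":"WARN","message":"This is warn xyz123"}',
]]


def parse_ts(value):
    """Parse an ISO-8601 UTC timestamp with fractional seconds into epoch seconds."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc).timestamp()


def throttle(records, group_key="message", period_s=60.0, limit=1):
    """Keep at most `limit` records per distinct group value per period.

    Simplified assumption: each group gets a fixed window that starts at the
    first record seen for that group and restarts once `period_s` has passed.
    """
    buckets = {}  # group value -> [window_start, count]
    kept = []
    for rec in records:
        ts = parse_ts(rec["@timestamp"])
        group = rec.get(group_key)
        bucket = buckets.get(group)
        if bucket is None or ts - bucket[0] >= period_s:
            bucket = [ts, 0]  # open a fresh window for this group
            buckets[group] = bucket
        if bucket[1] < limit:
            bucket[1] += 1
            kept.append(rec)  # under the limit: forward the record
        # otherwise the record is dropped
    return kept


if __name__ == "__main__":
    for rec in throttle(RECORDS, group_key="message", period_s=60, limit=1):
        print(json.dumps(rec))
```

With a limit of 1 per period, the sample input reduces to the INFO event, the first abc123 warning and the xyz123 warning. Note that every distinct message value gets its own bucket, so memory use grows with the number of distinct messages, which is likely the performance concern mentioned above.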

In Fluent Bit, you can use the built-in SQL stream processor and write a SELECT statement with WINDOW and GROUP BY: https://docs.fluentbit.io/stream-processing/getting_started/fluent_bit_sql#select-statement.
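As a rough illustration of what a windowed GROUP BY does to the sample logs, the Python sketch below emulates a tumbling window grouped on the message field and emits one row per group with a count. It is only an emulation under stated assumptions: the function name `tumbling_group_by` and its parameters are hypothetical, and windows are derived from the record's own `@timestamp`, which may differ from how the stream processor assigns records to windows.

```python
import json
from datetime import datetime, timezone

# Sample records taken from the question.
RECORDS = [json.loads(line) for line in [
    '{"@timestamp":"2020-10-30T13:00:00.310Z","level":"INFO","message":"This is some event"}',
    '{"@timestamp":"2020-10-30T13:00:00.315Z","level":"WARN","message":"This is warn abc123"}',
    '{"@timestamp":"2020-10-30T13:00:00.325Z","level":"WARN","message":"This is warn abc123"}',
    '{"@timestamp":"2020-10-30T13:00:00.327Z","level":"WARN","message":"This is warn abc123"}',
    '{"@timestamp":"2020-10-30T13:00:00.335Z","level":"WARN","message":"This is warn xyz123"}',
]]


def parse_ts(value):
    """Parse an ISO-8601 UTC timestamp with fractional seconds into epoch seconds."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc).timestamp()


def tumbling_group_by(records, window_s=5, key=lambda rec: rec["message"]):
    """Group records into non-overlapping windows of `window_s` seconds.

    Returns one row per (window, key) pair, carrying the first record's
    timestamp and level plus the number of records in the group.
    """
    groups = {}  # (window index, key value) -> output row; dict keeps insertion order
    for rec in records:
        window_index = int(parse_ts(rec["@timestamp"]) // window_s)
        group_id = (window_index, key(rec))
        row = groups.get(group_id)
        if row is None:
            groups[group_id] = {
                "@timestamp": rec["@timestamp"],
                "level": rec["level"],
                "message": rec["message"],
                "count": 1,
            }
        else:
            row["count"] += 1
    return list(groups.values())


if __name__ == "__main__":
    for row in tumbling_group_by(RECORDS, window_s=5):
        print(json.dumps(row))
```

The `key` argument is where a regex-based rule could be plugged in, for example a function that normalizes the message before grouping so that similar messages count as equal. Unlike throttling, this approach emits the aggregated row only once the window is known, so it trades latency for a count per group.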

