Elapsed time for multiple End events in Logstash

Problem description

I am using Elasticsearch to parse data. Below is my sample log file, and I am using the Ruby filter below to find the time elapsed between Start and End for a given ID (End is either Saved or Modified).

Sample log file

       TIMESTAMP            EVENT    ID
Apr 28, 2020 @ 15:17:22.337 Start    1
Apr 28, 2020 @ 15:17:23.215 Saved    1
Apr 28, 2020 @ 15:17:24.440 Start    2
Apr 28, 2020 @ 15:17:24.964 Saved    2
Apr 28, 2020 @ 15:17:25.359 Modified 2
Apr 28, 2020 @ 16:18:29.587 Start    3
Apr 28, 2020 @ 16:18:31.562 Saved    3
Apr 28, 2020 @ 16:18:31.914 Modified 3
Apr 28, 2020 @ 20:07:52.946 Start    4
Apr 28, 2020 @ 20:07:53.304 Saved    4

Ruby filter

ruby {
        code => "
            if event.get('Event') == 'Start'
                @save_the_timestamp = event.get('@timestamp')
                @save_the_ID = event.get('ID')
            elsif event.get('LOGLEVEL2') == 'Saved' && event.get('ID') == @save_the_ID
                event.set('elapsed_time', event.get('@timestamp') - @save_the_timestamp)
            elsif event.get('LOGLEVEL2') == 'Modified' && event.get('ID') == @save_the_ID
                event.set('elapsed_time', event.get('@timestamp') - @save_the_timestamp)
            end
        "
    }

Current output using the Ruby filter

       TIMESTAMP            EVENT       ID   elapsed_time
Apr 28, 2020 @ 15:17:22.337 Start       1       -
Apr 28, 2020 @ 15:17:23.215 Saved       1     0.878
Apr 28, 2020 @ 15:17:24.440 Start       2       -
Apr 28, 2020 @ 15:17:24.964 Saved       2     0.524
Apr 28, 2020 @ 15:17:25.359 Modified    2     0.919
Apr 28, 2020 @ 16:18:29.587 Start       3       -       
Apr 28, 2020 @ 16:18:31.562 Saved       3     1.975 
Apr 28, 2020 @ 16:18:31.914 Modified    3     2.327 
Apr 28, 2020 @ 20:07:52.946 Start       4       -
Apr 28, 2020 @ 20:07:53.304 Saved       4     0.358

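However, I want a single elapsed_time for a given ID. That would make it easy to build visualizations in Kibana. The aggregate filter does this, but it pushes the result as a separate event. Is there any way to put the value on the last event for a given ID, as in the desired output below?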

Any help is appreciated. Thanks in advance.

Desired output

       TIMESTAMP            EVENT       ID   elapsed_time
Apr 28, 2020 @ 15:17:22.337 Start       1       -
Apr 28, 2020 @ 15:17:23.215 Saved       1     0.878
Apr 28, 2020 @ 15:17:24.440 Start       2       -
Apr 28, 2020 @ 15:17:24.964 Saved       2       -
Apr 28, 2020 @ 15:17:25.359 Modified    2     0.919
Apr 28, 2020 @ 16:18:29.587 Start       3       -       
Apr 28, 2020 @ 16:18:31.562 Saved       3       -   
Apr 28, 2020 @ 16:18:31.914 Modified    3     2.327 
Apr 28, 2020 @ 20:07:52.946 Start       4       -
Apr 28, 2020 @ 20:07:53.304 Saved       4     0.358

Tags: logstash, logstash-configuration

Solution


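I just stumbled upon this question. Without more context, here are my thoughts. Looking at your conditional statements, you set the output on every Saved or Modified event.

Based on your "current output" versus your "desired output" and your conditional logic, you can achieve your goal by storing a single "highest value" of the Saved or Modified values for each ID.

That is, every time you see a "Saved" or "Modified", add the value to a data structure instead of setting the output right away, and then run your log output. Below is some poorly written pseudocode.

Depending on how often you run this logging, a local Redis instance or a cloud solution like Firestore would work well for persisting that hash.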

# Create an empty hash that we'll use to store our IDs and their values.
some_hash = {}

# Observe a Saved or Modified event and get its ID and value.
# The two values below are illustrative placeholders, not real readings.
unique_id = 2
current_observed_value = 0.919

# See if the ID is already in the hash.
if some_hash.include?(unique_id)
  # It is, so update the stored value only if the new one is higher.
  if current_observed_value > some_hash[unique_id]
    some_hash[unique_id] = current_observed_value
  end
else
  some_hash[unique_id] = current_observed_value # first time saving the value
end

# You could be a bit more clever and say something like:
if !some_hash.key?(unique_id) || current_observed_value > some_hash[unique_id]
  some_hash[unique_id] = current_observed_value
end
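
To make the idea concrete, here is a minimal standalone Ruby sketch that applies the "highest value per ID" bookkeeping to the sample log from the question. It is not a Logstash filter. The helper name highest_elapsed_by_id is hypothetical, and the timestamps are rewritten into a format that Time.parse accepts, with UTC assumed; the values themselves are the ones from the sample log.

```ruby
require 'time'

# Sample rows from the question as [timestamp, event, id].
# Timestamps are reformatted for Time.parse and assumed to be UTC.
rows = [
  ['2020-04-28 15:17:22.337 UTC', 'Start',    1],
  ['2020-04-28 15:17:23.215 UTC', 'Saved',    1],
  ['2020-04-28 15:17:24.440 UTC', 'Start',    2],
  ['2020-04-28 15:17:24.964 UTC', 'Saved',    2],
  ['2020-04-28 15:17:25.359 UTC', 'Modified', 2],
  ['2020-04-28 16:18:29.587 UTC', 'Start',    3],
  ['2020-04-28 16:18:31.562 UTC', 'Saved',    3],
  ['2020-04-28 16:18:31.914 UTC', 'Modified', 3],
  ['2020-04-28 20:07:52.946 UTC', 'Start',    4],
  ['2020-04-28 20:07:53.304 UTC', 'Saved',    4]
]

# Hypothetical helper: returns a hash of ID => largest elapsed time in seconds.
def highest_elapsed_by_id(rows)
  started_at = {} # ID => Time of its Start event
  highest = {}    # ID => largest elapsed time seen so far

  rows.each do |stamp, event, id|
    time = Time.parse(stamp)
    if event == 'Start'
      started_at[id] = time
    elsif %w[Saved Modified].include?(event) && started_at.key?(id)
      # Round to milliseconds, the precision of the sample log.
      elapsed = (time - started_at[id]).round(3)
      # Keep the value only if it is the first or the highest for this ID.
      highest[id] = elapsed if !highest.key?(id) || elapsed > highest[id]
    end
  end

  highest
end

result = highest_elapsed_by_id(rows)
result.each { |id, elapsed| puts "ID #{id}: #{elapsed}" }
```

Each ID ends up with exactly one value, which matches the elapsed_time column of the desired output. Note that the result is only final once every event for an ID has been seen, which is why the hash has to live somewhere that outlasts a single event.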
