ruby - 配置文件,logstash ruby filter event.get("message").match() 错误
问题描述
在 logstash 配置文件中,我试图只获取要解析的 XML 数据。
这是我的配置文件:
input {
file {
path => "/home/elastic-stack/logstash-7.3.2/event-data/telmetry.log"
start_position => "beginning"
type => "sandbox-out"
codec => multiline {
pattern => "^</datastore-contents-xml>"
negate => "true"
what => "next"
}
}
http {
host => "127.0.0.1"
port => 8080
type => "sandbox-out"
}
}
filter {
grok {
match => { "message" => "\[%{USER:host_name} %{IP:ip_address} %{USER:session-id} %{NUMBER:session-id-num}\]"}
}
grok {
match => { "message" => "\Subscription Id \: %{BASE16NUM:subcription-id:int}"}
}
grok {
match => { "message" => "\Event time \: %{TIMESTAMP_ISO8601:event-time}"}
}
grok {
match => {"message" => "\<%{USERNAME:Statistic}\>"}
}
mutate {
remove_field => ["headers", "host_name", "session-id","message"]
}
date {
match => ["timestamp","dd/MMM/yyyy:HH:mm:ss Z"]
}
ruby { code => 'event.set("justXml", event.get("message").match(/.+(<datastore-contents-xml.*)/m)[1])' }
xml {
#remove_namespaces => "true"
#not even the namspace option is working to access the http link
source => "justXml"
target => "xml-content"
#force_array => "false"
xpath => [
"//*[name()='datastore-contents-xml']/*[name()='memory-statistics']/*[name()='memory-statistic'][1]/*[name()='name']/text()" , "name" ,
"//*[name()='datastore-contents-xml']/*[name()='memory-statistics']/*[name()='memory-statistic'][1]/*[name()='total-memory']/text()" , "total-memory",
"//*[name()='datastore-contents-xml']/*[name()='memory-statistics']/*[name()='memory-statistic'][1]/*[name()='used-memory']/text()" , "used-memory",
"//*[name()='datastore-contents-xml']/*[name()='memory-statistics']/*[name()='memory-statistic'][1]/*[name()='free-memory']/text()" , "free-memory" ,
"//*[name()='datastore-contents-xml']/*[name()='memory-statistics']/*[name()='memory-statistic'][1]/*[name()='lowest-memory']/text()" , "lowest-memory" ,
"//*[name()='datastore-contents-xml']/*[name()='memory-statistics']/*[name()='memory-statistic'][1]/*[name()='highest-memory']/text()" , "highest-memory"
]
#logstash is not dectecting any of these xpaths in the config
}
mutate {
convert => {
"total-memory" => "integer"
"used-memory" => "integer"
"free-memory" => "integer"
"lowest-memory" => "integer"
"highest-memory" => "integer"
}
}
}
output {
stdout {
codec => rubydebug
}
file {
path => "%{type}_%{+dd_MM_yyyy}.log"
}
}
期望的输出:
{
"ip_address" => "10.10.20.30",
"subcription-id" => 2147483650,
"event-time" => "2019-09-12 13:13:30.290000+00:00",
"host" => "127.0.0.1",
"Statistic" => "memory-statistic",
"type" => "sandbox-out",
"@version" => "1",
"@timestamp" => 2019-09-26T10:03:00.620Z,
"session-id-num" => "35"
"yang-model" => "http://cisco.com/ns/yang/Cisco-IOS-XE-memory-oper"
"name" => "Processor"
"total-memory" => 2238677360
"used-memory" => 340449924
"free-memory" => 1898227436
"lowest-usage" => 1897220640
"highest-usage" => 1264110388
}
错误:
[2019-09-27T09:18:55,622][ERROR][logstash.filters.ruby ] Ruby exception occurred: undefined method `match' for nil:NilClass
/home/elastic-stack/logstash-7.3.2/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
{
"ip_address" => "10.10.20.30",
"subcription-id" => 2147483650,
"session-id-num" => "35",
"tags" => [
[0] "_rubyexception"
],
"Statistic" => "memory-statistic",
"event-time" => "2019-09-12 13:13:30.290000+00:00",
"type" => "sandbox-out",
"@version" => "1",
"host" => "127.0.0.1",
"@timestamp" => 2019-09-27T07:18:54.868Z
通过错误,我已经知道问题出在 ruby 过滤器上,但我不知道如何解决它。
此数据由 Cisco Telemetry 生成,我正在尝试使用 Elastic Stack 获取它。
解决方案
错误似乎是该事件没有message
字段,因此您不能调用match
不存在的事物。我看到你在这个 ruby 代码中调用match
了这个字段:message
ruby { code => 'event.set("justXml", event.get("message").match(/.+(<datastore-contents-xml.*)/m)[1])' }
message
但是,您在前几行从事件中删除了该字段:
mutate {
remove_field => ["headers", "host_name", "session-id","message"]
}
解决方案是仅在不再需要消息字段时才删除它,我会将 remove_field mutate 移到filter
块的末尾。
如果我可以添加,还有一个建议。您有多个 grok 过滤器在同一个消息字段上运行:
grok {
match => { "message" => "\[%{USER:host_name} %{IP:ip_address} %{USER:session-id} %{NUMBER:session-id-num}\]"}
}
grok {
match => { "message" => "\Subscription Id \: %{BASE16NUM:subcription-id:int}"}
}
grok {
match => { "message" => "\Event time \: %{TIMESTAMP_ISO8601:event-time}"}
}
grok {
match => {"message" => "\<%{USERNAME:Statistic}\>"}
}
这可以简化为(您可以查看Grok 过滤器文档:
grok {
break_on_match => false,
match => {
"message" => [
"\[%{USER:host_name} %{IP:ip_address} %{USER:session-id} %{NUMBER:session-id-num}\]",
"\Subscription Id \: %{BASE16NUM:subcription-id:int}",
"\Event time \: %{TIMESTAMP_ISO8601:event-time}",
"\<%{USERNAME:Statistic}\>"
]
}
}
这样你只需要一个 grok 过滤器的实例,因为它会遍历列表中的模式并且因为break_on_match=>false
它不会在第一次成功匹配后完成,但会确保根据所有模式提取它可以提取的所有字段在列表中。
推荐阅读
- json - 如何注册要与detectron2 一起使用的数据集?我们有 COCO JSON 格式的图像及其注释
- mysql - if-else 在 where 条件下 - MySQL
- python - 从零开始的 3 层神经网络(调试)
- ios - 如何快速获取可选响应?
- spring - 如何首先验证 Spring @PathVariable 属性?
- rest - 在泽西岛,如何返回与 200 响应对象不同的 400 响应代码的对象
- flutter - 在 web 上运行时,flame_tiled 和/或 tiled 包会出错
- mysql - MySql 计算多列
- python - 如何从单个数据帧切片和创建多个熊猫数据帧
- ios - iOS 14 设备端语音识别