Logs in Logstash are not being parsed correctly

Problem description

My architecture is:

Filebeat A (remote) > Logstash A (2 pipelines) > Elasticsearch A > Kibana A

Filebeat B (remote) > Logstash A (2 pipelines) > Elasticsearch A > Kibana A

It is used for log analysis.

Say my log file name format is abc_logs-yyyy.mm.dd.log.

Filebeat is pushing the logs to Logstash (I can see that in the data/registry file), but Logstash is not creating indices for some of the log files.

For example, abc_logs-2019.11.02.log exists in my log location and Filebeat has pushed it to Logstash, but I cannot see its data in Elasticsearch.

Sample logs:

<ip> <ip> 27 27 <ip> HTTP/1.1 - GET 8380 - GET /healthcheck/healthcheck.do HTTP/1.1 200 - [12/Nov/2019:00:33:49 +0000] - /healthcheck/healthcheck.do houston.hp.com 0 0.000 default task-245 "-" "-" "-" "-" "-" "-"
<ip> <ip> 42 42 <ip> HTTP/1.1 - POST 8743 - POST /ContactServices/api/contact/create HTTP/1.1 200 - [12/Nov/2019:07:00:54 +0000] - /ContactServices/api/contact/create - 1969 1.969 default task-199 "-" "application/json" "-" "-" "-" "-"

logstash.conf file:

input {
  beats {
    port => 5044
    host => "<host_name>"
  }
}

filter {
  grok {
    match => ["message", '%{IPV4:remoteIP}\s+%{IPV4:localIP}\s+%{INT:throughtputData:int}\s+%{INT}\s+%{IPV4}\s+%{DATA:requestProtocol}\s+%{DATA:remoteLogicalUserName}\s+%{DATA:requestMethod}\s+%{DATA:port}\s+%{DATA}\s+%{DATA}\s+/ContactServices/api/%{DATA}\s+%{DATA:requestProtocol2}\s+%{INT:requestStatusCode}\s+%{DATA:userSessionID}\s+\[%{HTTPDATE:logTimeStamp}\]\s+%{DATA:remoteUser}\s+/ContactServices/api/%{DATA:requestedURL2}\s+%{DATA:serverName}\s+%{INT:timeTakenInMilliSec:int}\s+%{NUMBER}\s+default\s+task-%{INT}\s+"%{DATA:authorization}"\s+"%{DATA}"\s+"%{DATA}"\s+"%{DATA}"\s+"%{DATA}"\s+"%{DATA}"']
  }

  if "_grokparsefailure" in [tags] {
    drop {}
  }

  if "_groktimeout" in [tags] {
    drop {}
  }

  date {
    match => ["logTimeStamp", "dd/MMM/yyyy:HH:mm:ss Z"]
  }

  mutate {
    remove_field => ["message","host","input","type","@version","prospector","beat","garbageData","offset"]
  }
}

output {
  elasticsearch {
    hosts => ["<ip>:9202"]
    index => "contact-logs-%{+YYYY.MM.dd}"
  }
}
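
Note that any event tagged _grokparsefailure or _groktimeout is silently removed by the two drop {} blocks above, so lines that do not match the grok pattern never reach Elasticsearch at all. As a rough sketch only (it assumes the two drop filters are removed, and the contact-logs-failed-* index name is just an illustrative placeholder), the output section could instead route failed events to a separate index so they can be inspected rather than lost:

output {
  if "_grokparsefailure" in [tags] or "_groktimeout" in [tags] {
    elasticsearch {
      hosts => ["<ip>:9202"]
      # placeholder index used only to collect lines the grok filter could not parse
      index => "contact-logs-failed-%{+YYYY.MM.dd}"
    }
  } else {
    elasticsearch {
      hosts => ["<ip>:9202"]
      index => "contact-logs-%{+YYYY.MM.dd}"
    }
  }
}

Comparing document counts in the failed index against the main one gives a quick measure of how much data is being lost to parse failures or timeouts.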

Filebeat.conf file:

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /path/to/logs/*.log
  exclude_lines: ['.*healthcheck.*','.*swagger.*']

output.logstash:
  hosts: ["<serverip>:5044"]
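
Since exclude_lines already drops healthcheck and swagger lines on the Filebeat side, it can be worth confirming exactly what Filebeat ships before suspecting the Logstash filter. A minimal sketch for a temporary test (Filebeat allows only one output to be enabled, so output.logstash has to be commented out while this is in place):

# temporary debugging change in the Filebeat config
#output.logstash:
#  hosts: ["<serverip>:5044"]
output.console:
  pretty: true

Running filebeat -e -c <config file> then prints every harvested event to the terminal, which shows whether the lines from a file such as abc_logs-2019.11.02.log are being read and shipped at all.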

Also, one additional question.

Even when an index is created, not all of the valid logs get parsed.

For example, if a log file has 100 correct log lines (matching the grok filter pattern in logstash.conf), only 60%-70% of the data shows up as documents in Elasticsearch. Around 40% of the data is getting dropped, and I don't know the exact reason.

If I check the unparsed logs against the specified grok pattern in the grok debugger, they parse perfectly.
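
One way to narrow this down is to replay the failing lines through a throwaway Logstash pipeline that reads from stdin and applies the exact same pattern, instead of relying only on the online grok debugger. A minimal sketch (the file name minimal-test.conf is arbitrary), reusing the match line from logstash.conf above:

input {
  stdin {}
}

filter {
  grok {
    # identical pattern to the main pipeline
    match => ["message", '%{IPV4:remoteIP}\s+%{IPV4:localIP}\s+%{INT:throughtputData:int}\s+%{INT}\s+%{IPV4}\s+%{DATA:requestProtocol}\s+%{DATA:remoteLogicalUserName}\s+%{DATA:requestMethod}\s+%{DATA:port}\s+%{DATA}\s+%{DATA}\s+/ContactServices/api/%{DATA}\s+%{DATA:requestProtocol2}\s+%{INT:requestStatusCode}\s+%{DATA:userSessionID}\s+\[%{HTTPDATE:logTimeStamp}\]\s+%{DATA:remoteUser}\s+/ContactServices/api/%{DATA:requestedURL2}\s+%{DATA:serverName}\s+%{INT:timeTakenInMilliSec:int}\s+%{NUMBER}\s+default\s+task-%{INT}\s+"%{DATA:authorization}"\s+"%{DATA}"\s+"%{DATA}"\s+"%{DATA}"\s+"%{DATA}"\s+"%{DATA}"']
  }
}

output {
  # rubydebug prints every parsed field, plus the _grokparsefailure tag when the match fails
  stdout {
    codec => rubydebug
  }
}

Piping the suspect lines in, for example with bin/logstash -f minimal-test.conf < failing-lines.log (where failing-lines.log stands for whatever file holds them), shows whether the pattern itself fails on those lines or whether the events are being changed (truncated, merged, or split) somewhere between Filebeat and the filter.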

Is there any solution to this problem?

Tags: elasticsearch, logstash, kibana, logstash-grok, filebeat
