Elasticsearch index as filename from Logstash - reference issue

Problem description

I want to use the name of the uploaded file as the index in Elasticsearch. I followed the answer suggested here - Logstash filename as ElasticSearch index. However, instead of getting the filename as the index, the index is created with the literal, unsubstituted reference - %{index_name}. What am I doing wrong?

Update - my syslog.conf:

input {
  beats {
    port => 5044
  }
  udp {
    port => 514
    type => "syslog"
  }
  file {
       path => "C:\web-developement\...\data\*.log"
       start_position => "beginning"
       type => "logs"
   }
}

filter {    
    grok {
     match => ["path", "data/%{GREEDYDATA:index_name}" ]
    }
}

output {
  elasticsearch { 
      hosts => ["localhost:9200"] 
      index => "%{index_name}"
      manage_template => false
  }
  stdout { codec => rubydebug }
}

UPD 2 - Logstash output:

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.headius.backport9.modules.Modules (file:/C:/ELK/logstash-7.6.2/logstash-7.6.2/logstash-core/lib/jars/jruby-complete-9.2.9.0.jar) to field java.io.Console.cs
WARNING: Please consider reporting this to the maintainers of com.headius.backport9.modules.Modules
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Sending Logstash logs to C:/ELK/logstash-7.6.2/logstash-7.6.2/logs which is now configured via log4j2.properties
[2020-06-10T17:37:34,552][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-06-10T17:37:34,670][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.6.2"}
[2020-06-10T17:37:36,144][INFO ][org.reflections.Reflections] Reflections took 37 ms to scan 1 urls, producing 20 keys and 40 values
[2020-06-10T17:37:37,535][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2020-06-10T17:37:37,704][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2020-06-10T17:37:37,752][INFO ][logstash.outputs.elasticsearch][main] ES Output version determined {:es_version=>7}
[2020-06-10T17:37:37,755][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7}
[2020-06-10T17:37:37,838][INFO ][logstash.outputs.elasticsearch][main] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//localhost:9200"]}
[2020-06-10T17:37:38,020][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.specialized.RubyArrayOneObject) has been created for key: cluster_uuids. This may result in invalid serialization.  It is recommended to log an issue to the responsible developer/development team.
[2020-06-10T17:37:38,025][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>1000, "pipeline.sources"=>["C:/ELK/logstash-7.6.2/logstash-7.6.2/config/syslog.conf"], :thread=>"#<Thread:0x577ae8e run>"}
[2020-06-10T17:37:38,798][INFO ][logstash.inputs.beats    ][main] Beats inputs: Starting input listener {:address=>"0.0.0.0:5044"}
[2020-06-10T17:37:39,237][INFO ][logstash.inputs.file     ][main] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"C:/ELK/logstash-7.6.2/logstash-7.6.2/data/plugins/inputs/file/.sincedb_029446dc83f19d43b8822e485aa6e7a4", :path=>["C:\\web-developement\\project\\data\\*.log"]}
[2020-06-10T17:37:39,263][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-06-10T17:37:39,312][INFO ][logstash.inputs.udp      ][main] Starting UDP listener {:address=>"0.0.0.0:514"}
[2020-06-10T17:37:39,353][INFO ][filewatch.observingtail  ][main] START, creating Discoverer, Watch with file and sincedb collections
[2020-06-10T17:37:39,378][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2020-06-10T17:37:39,390][INFO ][logstash.inputs.udp      ][main] UDP listener started {:address=>"0.0.0.0:514", :receive_buffer_bytes=>"65536", :queue_size=>"2000"}
[2020-06-10T17:37:39,404][INFO ][org.logstash.beats.Server][main] Starting server on port: 5044
[2020-06-10T17:37:39,665][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

Also, when I update the log file, the output shows lines tagged with _grokparsefailure.

...
 "@timestamp" => 2020-06-10T14:44:11.390Z,
         "event" => {
        "timezone" => "+03:00",
         "dataset" => "logstash.log",
          "module" => "logstash"
    },
          "tags" => [
        [0] "beats_input_codec_plain_applied",
        [1] "_grokparsefailure"
    ],
      "@version" => "1"
}

Tags: elasticsearch, indexing, logstash

Solution


Try the following grok pattern in your filter. Your original pattern matches the path against data/ with a forward slash, but the Windows paths produced by your file input use backslashes, so the grok never matches; that is why you see the _grokparsefailure tag and why %{index_name} is never substituted.

grok {
    match => ["path", "C:\\%{GREEDYDATA}\\%{GREEDYDATA:index_name}.log"]
}

This will match any path that begins with C:\ and will extract the filename, storing it in the index_name field.

For example, for path = C:\Web-development\tests\filename001.log, index_name will be filename001.
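Note that both GREEDYDATA captures are greedy, so the first one consumes everything up to the last backslash in the path; a more deeply nested path such as C:\Web-development\tests\nested\filename001.log would therefore still yield filename001.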

If any of your filenames contain uppercase letters, you will need to use a mutate filter to convert index_name to lowercase, because index names cannot contain uppercase letters. If a filename contains spaces, you will also need a mutate filter to remove them, since index names cannot contain spaces either. These are some of the restrictions; see the sketch below.
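
A minimal sketch of such a mutate filter, assuming the index_name field extracted by the grok above (replacing spaces with hyphens is only an illustrative choice):

filter {
  mutate {
    # Elasticsearch index names must be lowercase
    lowercase => [ "index_name" ]
    # Index names cannot contain spaces; replace each one with a hyphen
    gsub => [ "index_name", " ", "-" ]
  }
}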

