首页 > 解决方案 > 无法使用 Logstash XML 过滤器将 XML 文件摄取到 Elastic Search

问题描述

我有这个 XML 文件,我存储在 Window 10 的 D:\ 中:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <ChainId>7290027600007</ChainId>
    <SubChainId>001</SubChainId>
    <StoreId>001</StoreId>
    <BikoretNo>9</BikoretNo>
    <DllVerNo>8.0.1.3</DllVerNo>
</root>

我已经安装了 Elastic Search 并且能够在 http://localhost:9200/ 上访问它。我已经安装了 Logstash 并创建了 logstash-xml.conf 来摄取上述 XML 文件。在 logstash-xml.conf 中,配置如下:

input
{
    file
        {
            path => "D:\data.xml"
            start_position => "beginning"
            sincedb_path => "NUL"
            exclude => "*.gz"
            type => "xml"
            codec => multiline {
                    pattern => "<?xml " 
                    negate => "true"
                    what => "previous"
                }
        }
}

filter {

    xml{
        source => "message"
        store_xml => false
        target => "root"
        xpath => [
            "/root/ChainId/text()", "ChainId",
            "/root/SubChainId/text()", "SubChainId",
            "/root/StoreId/text()", "StoreId",
            "/root/BikoretNo/text()", "BikoretNo",
            "/root/DllVerNo/text()", "DllVerNo"
        ]
    }

    mutate { 
        gsub => [ "message", "[\r\n]", "" ] 
    }
}

output{

elasticsearch{
        hosts => ["http://localhost:9200/"]
        index => "parse_xml"
    }

    stdout
    {
        codec => rubydebug
    }
}

当我在命令行中运行此配置时,我会在下面看到:

D:\logstash\bin>logstash -f logstash-xml.conf
"Using bundled JDK: ""
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.jruby.ext.openssl.SecurityHelper (file:/C:/Users/CHEEWE~1.NGA/AppData/Local/Temp/jruby-3748/jruby14189572270520245744jopenssl.jar) to field java.security.MessageDigest.provider
WARNING: Please consider reporting this to the maintainers of org.jruby.ext.openssl.SecurityHelper
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Sending Logstash logs to D:/logstash/logs which is now configured via log4j2.properties
[2020-12-05T09:27:21,716][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.10.0", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 11.0.8+10 on 11.0.8+10 +indy +jit [mswin32-x86_64]"}
[2020-12-05T09:27:22,053][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-12-05T09:27:24,031][INFO ][org.reflections.Reflections] Reflections took 37 ms to scan 1 urls, producing 23 keys and 47 values
[2020-12-05T09:27:26,083][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2020-12-05T09:27:26,311][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2020-12-05T09:27:26,378][INFO ][logstash.outputs.elasticsearch][main] ES Output version determined {:es_version=>7}
[2020-12-05T09:27:26,383][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7}
[2020-12-05T09:27:26,437][INFO ][logstash.outputs.elasticsearch][main] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["http://localhost:9200/"]}
[2020-12-05T09:27:26,487][INFO ][logstash.outputs.elasticsearch][main] Using a default mapping template {:es_version=>7, :ecs_compatibility=>:disabled}
[2020-12-05T09:27:26,621][INFO ][logstash.outputs.elasticsearch][main] Attempting to install template {:manage_template=>{"index_patterns"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s", "number_of_shards"=>1}, "mappings"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}
[2020-12-05T09:27:28,152][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>1000, "pipeline.sources"=>["D:/logstash/bin/logstash-xml.conf"], :thread=>"#<Thread:0x65f57880 run>"}
[2020-12-05T09:27:29,176][INFO ][logstash.javapipeline    ][main] Pipeline Java execution initialization time {"seconds"=>1.02}
[2020-12-05T09:27:29,640][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-12-05T09:27:29,712][INFO ][filewatch.observingtail  ][main][aca15cd3c6850472d105bd7b2b7a43da8ce8ec36a4b0b8c19830d898f1eb1109] START, creating Discoverer, Watch with file and sincedb collections
[2020-12-05T09:27:29,726][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2020-12-05T09:27:30,020][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

但是,回到 ElasticSearch,我可以创建 logstash 索引,但看不到加载的 XML 数据。

在此处输入图像描述

标签: elasticsearchlogstash

解决方案


推荐阅读