elasticsearch - 无法使用 Logstash XML 过滤器将 XML 文件摄取到 Elastic Search
问题描述
我有这个 XML 文件,我存储在 Window 10 的 D:\ 中:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<ChainId>7290027600007</ChainId>
<SubChainId>001</SubChainId>
<StoreId>001</StoreId>
<BikoretNo>9</BikoretNo>
<DllVerNo>8.0.1.3</DllVerNo>
</root>
我已经安装了 Elastic Search 并且能够在 http://localhost:9200/ 上访问它。我已经安装了 Logstash 并创建了 logstash-xml.conf 来摄取上述 XML 文件。在 logstash-xml.conf 中,配置如下:
input
{
file
{
path => "D:\data.xml"
start_position => "beginning"
sincedb_path => "NUL"
exclude => "*.gz"
type => "xml"
codec => multiline {
pattern => "<?xml "
negate => "true"
what => "previous"
}
}
}
filter {
xml{
source => "message"
store_xml => false
target => "root"
xpath => [
"/root/ChainId/text()", "ChainId",
"/root/SubChainId/text()", "SubChainId",
"/root/StoreId/text()", "StoreId",
"/root/BikoretNo/text()", "BikoretNo",
"/root/DllVerNo/text()", "DllVerNo"
]
}
mutate {
gsub => [ "message", "[\r\n]", "" ]
}
}
output{
elasticsearch{
hosts => ["http://localhost:9200/"]
index => "parse_xml"
}
stdout
{
codec => rubydebug
}
}
当我在命令行中运行此配置时,我会在下面看到:
D:\logstash\bin>logstash -f logstash-xml.conf
"Using bundled JDK: ""
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.jruby.ext.openssl.SecurityHelper (file:/C:/Users/CHEEWE~1.NGA/AppData/Local/Temp/jruby-3748/jruby14189572270520245744jopenssl.jar) to field java.security.MessageDigest.provider
WARNING: Please consider reporting this to the maintainers of org.jruby.ext.openssl.SecurityHelper
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Sending Logstash logs to D:/logstash/logs which is now configured via log4j2.properties
[2020-12-05T09:27:21,716][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"7.10.0", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 11.0.8+10 on 11.0.8+10 +indy +jit [mswin32-x86_64]"}
[2020-12-05T09:27:22,053][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-12-05T09:27:24,031][INFO ][org.reflections.Reflections] Reflections took 37 ms to scan 1 urls, producing 23 keys and 47 values
[2020-12-05T09:27:26,083][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2020-12-05T09:27:26,311][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2020-12-05T09:27:26,378][INFO ][logstash.outputs.elasticsearch][main] ES Output version determined {:es_version=>7}
[2020-12-05T09:27:26,383][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7}
[2020-12-05T09:27:26,437][INFO ][logstash.outputs.elasticsearch][main] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["http://localhost:9200/"]}
[2020-12-05T09:27:26,487][INFO ][logstash.outputs.elasticsearch][main] Using a default mapping template {:es_version=>7, :ecs_compatibility=>:disabled}
[2020-12-05T09:27:26,621][INFO ][logstash.outputs.elasticsearch][main] Attempting to install template {:manage_template=>{"index_patterns"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s", "number_of_shards"=>1}, "mappings"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}
[2020-12-05T09:27:28,152][INFO ][logstash.javapipeline ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>1000, "pipeline.sources"=>["D:/logstash/bin/logstash-xml.conf"], :thread=>"#<Thread:0x65f57880 run>"}
[2020-12-05T09:27:29,176][INFO ][logstash.javapipeline ][main] Pipeline Java execution initialization time {"seconds"=>1.02}
[2020-12-05T09:27:29,640][INFO ][logstash.javapipeline ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-12-05T09:27:29,712][INFO ][filewatch.observingtail ][main][aca15cd3c6850472d105bd7b2b7a43da8ce8ec36a4b0b8c19830d898f1eb1109] START, creating Discoverer, Watch with file and sincedb collections
[2020-12-05T09:27:29,726][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2020-12-05T09:27:30,020][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
但是,回到 ElasticSearch,我可以创建 logstash 索引,但看不到加载的 XML 数据。
解决方案
推荐阅读
- php - 用于在 for 循环中获取数据的 while 循环无法正常工作
- php - 在 Laravel 中提取超过 8M 条记录的数据库中的 ramdon 值
- java - 关于将纳秒转换为秒后显示的输出的问题
- docker - 如果 docker 主机是 debian10(buster),则运行旧的 debian 容器时出现分段错误
- perl - Perl Log::Dispatch:在运行中更改日志记录位置?
- python - 使用索引从验证集中获取错误分类的样本并形成数据框
- javascript - React HTML select element onChange函数,试图访问'event.target.value'
- c++ - 当我在 windows7 中安装程序时,我指定的字体大小不适用
- node.js - 测试时拦截axios请求
- javascript - Javascript 错误:未定义“文档”。[无非定义]