Ruby filter plugin creates two records for a single input JSON

Problem description

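I have two conf files that load data from two JSON files, testOrders and testItems, into the same index. Each file contains only one document. I am trying to create a parent-child relationship between the two documents.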

Below is my configuration file for testOrders:

    input {
      file {
        path => ["/path_data/testOrders.json"]
        type => "json"
        start_position => "beginning"
        sincedb_path => "/dev/null"
      }
    }

    filter {
      json {
        source => "message"
        target => "testorders_collection"
        remove_field => [ "message" ]
      }
      ruby {
        code => "
          event.set('[my_join_field][name]', 'testorders')
        "
      }
    }

    output {
      elasticsearch {
        hosts => ["localhost:9200"]
        index => "testorder"
        document_id => "%{[testorders_collection][eId]}"
        routing => "%{[testorders_collection][eId]}"
      }
    }

Below is the configuration file for testItems:

    input {
      file {
        path => ["/path_to_data/testItems.json"]
        type => "json"
        start_position => "beginning"
        sincedb_path => "/dev/null"
      }
    }

    filter {
      json {
        source => "message"
        target => "test_collection"
        remove_field => [ "message" ]
      }
    }
    filter {
      ruby {
        code => "
          event.set('[my_join_field][name]', 'testItems')
          event.set('[my_join_field][parent]', event.get('[test_collection][foreignKeyId]'))
        "
      }
    }

    output {
      elasticsearch {
        hosts => ["localhost:9200"]
        index => "testorder"
        document_id => "%{[test_collection][eId]}"
        routing => "%{[test_collection][foreignKeyId]}"
      }
    }

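Logstash creates 1 record for testOrders as expected, but it creates 2 records for testItems, even though testOrders and testItems each contain exactly 1 JSON document. One document is created correctly with its data, but the other is a duplicate that appears to contain no data. The document created with unparsed data looks like this: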

    {
      "_index": "testorder",
      "_type": "doc",
      "_id": "%{[test_collection][eId]}",
      "_score": 1,
      "_routing": "%{[test_collection][foreignKeyId]}",
      "_source": {
        "type": "json",
        "@timestamp": "2018-07-10T04:15:58.494Z",
        "host": "<hidden>",
        "test_collection": null,
        "my_join_field": {
          "name": "testItems",
          "parent": null
        },
        "path": "/path_to_data/testItems.json",
        "@version": "1"
      }
    }
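
The literal `%{[test_collection][eId]}` in `_id` and `_routing` indicates that this event reached the output without a usable `test_collection` field, since Logstash leaves a `%{...}` reference unexpanded when the field cannot be resolved. One possible guard, sketched below in plain Ruby, is to cancel such events inside the ruby filter before they are indexed. `FakeEvent` is a hypothetical stand-in written only so the sketch runs outside Logstash (the real object is `LogStash::Event`, which provides `get`, `set` and `cancel`), and `set_join_field` is an illustrative name. The guard only drops the empty event; it does not explain where that event comes from.

```ruby
# Hypothetical stand-in for LogStash::Event, just enough to run the sketch.
class FakeEvent
  def initialize(data = {})
    @data = data
    @cancelled = false
  end

  # Split a field reference such as "[a][b]" into ["a", "b"].
  def keys(ref)
    parts = ref.scan(/\[([^\]]+)\]/).flatten
    parts.empty? ? [ref] : parts
  end

  # Walk the nested hash; return nil as soon as a level is missing.
  def get(ref)
    keys(ref).reduce(@data) { |node, key| node.is_a?(Hash) ? node[key] : nil }
  end

  # Create intermediate hashes as needed, then assign the leaf value.
  def set(ref, value)
    *parents, last = keys(ref)
    node = parents.reduce(@data) do |n, key|
      n[key] = {} unless n[key].is_a?(Hash)
      n[key]
    end
    node[last] = value
  end

  def cancel
    @cancelled = true
  end

  def cancelled?
    @cancelled
  end

  def to_hash
    @data
  end
end

# Illustrative guard: only set the join field when the foreign key exists;
# otherwise cancel the event so it never reaches the elasticsearch output.
def set_join_field(event)
  parent = event.get('[test_collection][foreignKeyId]')
  if parent.nil?
    event.cancel
    return
  end
  event.set('[my_join_field][name]', 'testItems')
  event.set('[my_join_field][parent]', parent)
end
```

Inside the ruby filter, the body of `set_join_field` would go in the `code` string and operate on the real `event`.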

Tags: ruby, elasticsearch, logstash, kibana, elastic-stack

Solution


Defining the join mapping in Elasticsearch solved the problem. This is how the relation is defined:

    PUT fulfillmentorder
    {
      "mappings": {
        "doc": {
          "properties": {
            "my_join_field": {
              "type": "join",
              "relations": {
                "fulfillmentorders": "orderlineitems"
              }
            }
          }
        }
      }
    }
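
To illustrate how the values set by the ruby filter relate to this mapping, the sketch below builds the `my_join_field` value for a parent and a child document. The helper `join_field` and the sample ids are hypothetical; the relation names are copied from the mapping. It assumes, following the Elasticsearch join field rules, that `name` must be one of the relation names defined in the mapping and that a child document also carries its parent's id.

```ruby
# Relation names copied from the mapping: parent => child.
RELATIONS = { 'fulfillmentorders' => 'orderlineitems' }.freeze

# Hypothetical helper: build the value of my_join_field for one document.
# Parents carry only a name; children also carry the parent's id.
def join_field(name, parent_id = nil)
  if RELATIONS.key?(name)
    { 'name' => name }
  elsif RELATIONS.value?(name)
    raise ArgumentError, 'child document needs a parent id' if parent_id.nil?
    { 'name' => name, 'parent' => parent_id }
  else
    raise ArgumentError, "unknown relation name: #{name}"
  end
end

# Sample documents with made-up ids.
parent_doc = { 'eId' => 'order-1', 'my_join_field' => join_field('fulfillmentorders') }
child_doc  = { 'eId' => 'item-1', 'my_join_field' => join_field('orderlineitems', 'order-1') }
```

The mapping uses different names from the test configs in the question (`testorders`, `testItems`). Whichever names are chosen, the values passed to `event.set('[my_join_field][name]', ...)` need to match the names in the mapping of the index being written to, and the child is routed with the parent's id, as the `routing` settings in the question already do.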
