首页 > 解决方案 > 将数据从 mysql 迁移到具有多对多关系的 elasticsearch

问题描述

我想将数据从 传输mysqlelasticsearch。为此,我使用logstashpipline

input {
  jdbc {
     . . .
    statement => "SELECT DISTINCT
    d.id AS doctorId,
    UNIX_TIMESTAMP( d.updated_at ) AS unix_ts_in_secs,
    d.firstName,
    d.lastName,
    m.id AS metaId m.`name` AS metaName 
FROM
    doctors AS d
    JOIN doctormetas AS dm ON d.id = dm.doctor
    JOIN metas AS m ON dm.meta = m.id 
WHERE
    UNIX_TIMESTAMP( d.updated_at )> : sql_last_value 
    AND UNIX_TIMESTAMP( d.updated_at )< NOW() 
ORDER BY
    d.updated_at ASC"
  }
}
filter {
  mutate {
    copy => { "id" => "[@metadata][_id]"}
    remove_field => ["@version", "unix_ts_in_secs"]
  }
}
output {
  elasticsearch {
      . . . 
  }
}

医生有很多医生元,医生元有一个元

在这种方法中,数据重复多次,占用大量空间

{
    "_index" : "doctor",
    "_type" : "_doc",
    "_id" : "411258222",
    "_score" : 1.0,
    "_source" : {
        "meta" : 222,
        "doctorId": 411258,
        "firstname" : "محسن",
        "lastname" : "بهمنی کشکولی",
    }
},
{
    "_index" : "doctor",
    "_type" : "_doc",
    "_id" : "411258438",
    "_score" : 1.0,
    "_source" : {
        "meta" : 438,
        "doctorId": 411258,
        "firstname" : "محسن",
        "lastname" : "بهمنی کشکولی",
    }
},
{
    "_index" : "doctor",
    "_type" : "_doc",
    "_id" : "411258456",
    "_score" : 1.0,
    "_source" : {
        "meta" : 456,
        "doctorId": 411258,
        "firstname" : "محسن",
        "lastname" : "بهمنی کشکولی",
    }
},

输出结果如何过滤如下?

{
    "_index" : "doctor",
    "_type" : "_doc",
    "_id" : "411258",
    "_score" : 1.0,
    "_source" : {
        "meta" : [438,222,456],
        "doctorId": 411258,
        "firstname" : "محسن",
        "lastname" : "بهمنی کشکولی",
    }
},

标签: mysqlelasticsearchmigrationlogstashone-to-many

解决方案


推荐阅读