首页 > 解决方案 > Elasticsearch 聚合 - groupby 和过滤这些文档的最大值并再次分组

问题描述

我正在尝试向 elasticsearch(ES) 查询以使用示例数据之后的条件为我进行计算,

{user_id: 1, score: 10, status: passed, date: 2020-02-26},
{user_id: 1, score: 10, status: passed, date: 2020-02-25},
{user_id: 1, score: 8, status: passed, date: 2020-02-25},
{user_id: 1, score: 4, status: failed, date: 2020-02-27},
{user_id: 2, score: 5, status: passed, date: 2020-02-26},
{user_id: 2, score: 6, status: passed, date: 2020-02-25},
{user_id: 3, score: 1, status: failed, date: 2020-02-25},
{user_id: 3, score: 1, status: failed, date: 2020-02-25},
{user_id: 4, score: 7, status: passed, date: 2020-02-25}
{user_id: 4, score: 2, status: failed, date: 2020-02-26}
{user_id: 5, score: 0, status: failed, date: 2020-02-25}

我想获得每个 user_id 的最大分数(不是 es 的分数),以计算每个持续时间中的状态数(再次按状态分组)或步骤:按 user_id 分组获取最高分记录,并按这些状态分组

样本输出:

{
"passed":3
"failed":2
"date": 2020-02-25
},
{
"passed":2
"failed":1
"date": 2020-02-26
},
{
"passed":0
"failed":1
"date": 2020-02-27
}

甚至,我玩 SQL 仍然不容易解决,因为我无法获得row_number记录,导致结果可能是错误的(如果我错了,请纠正我)

用 SQL 就好了

// this nested query give a wrong answer, just to be an example
// I still playing with it 
select user_id, status from table t1 inner join (
select user_id, max(score) 
from table
where date = '2020-02-25'
group by user_id, max(score)
) table t2 on t1.user_id = t2.user_id and t1.score = t2.score 
group by t1.status

一个原因,我想这样做,因为我的工作将数据存储在 es 中,我认为如果 es 也可以这样做以减少后端的性能负载可能会很好。

提前谢谢你,所以欢迎任何建议

标签: mysqlelasticsearchelasticsearch-aggregation

解决方案


我使用术语聚合来获取日期和状态下的文档计数。请让我知道这对你有没有用。

样本映射

PUT testindex13
{
  "mappings": {
    "properties": {
      "userid": {
        "type": "keyword"
      },
      "score": {
        "type": "integer"
      },
      "status": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      },
      "date":{
        "type": "date",
        "format": "yyyy-MM-dd"
      }
    }
  }
}

数据:

"hits" : [
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "9XtAhXABqQAB2FLojE2i",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 1,
          "score" : 10,
          "status" : "passed",
          "date" : "2020-02-26"
        }
      },
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "9ntAhXABqQAB2FLojU0l",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 1,
          "score" : 10,
          "status" : "passed",
          "date" : "2020-02-25"
        }
      },
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "93tAhXABqQAB2FLojk2P",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 1,
          "score" : 8,
          "status" : "passed",
          "date" : "2020-02-25"
        }
      },
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "-HtAhXABqQAB2FLoj02H",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 1,
          "score" : 4,
          "status" : "failed",
          "date" : "2020-02-27"
        }
      },
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "-XtAhXABqQAB2FLokE0m",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 2,
          "score" : 5,
          "status" : "passed",
          "date" : "2020-02-26"
        }
      },
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "-ntAhXABqQAB2FLokU2x",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 2,
          "score" : 6,
          "status" : "passed",
          "date" : "2020-02-25"
        }
      },
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "-3tEhXABqQAB2FLoS00O",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 1,
          "score" : 10,
          "status" : "passed",
          "date" : "2020-02-26"
        }
      },
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "_HtEhXABqQAB2FLoTE0W",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 3,
          "score" : 1,
          "status" : "failed",
          "date" : "2020-02-25"
        }
      },
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "_XtEhXABqQAB2FLoTE2L",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 3,
          "score" : 1,
          "status" : "failed",
          "date" : "2020-02-25"
        }
      },
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "_ntEhXABqQAB2FLoTU2P",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 4,
          "score" : 7,
          "status" : "passed",
          "date" : "2020-02-25"
        }
      }
    ]

询问:

GET testindex13/_search
{
  "size": 0, 
  "aggs": {
    "Date": {
      "terms": {
        "field": "date"
      },
      "aggs": {
        "Status": {
          "terms": {
            "field": "status.keyword",
            "size": 10
          },
          "aggs": {
            "Count": {
              "value_count": {
                "field": "user_id"
              }
            }
          }
        }
      }
    }
  }
}

结果:

  "hits" : {
    "total" : {
      "value" : 12,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "Date" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 1582588800000,
          "key_as_string" : "2020-02-25",
          "doc_count" : 7,
          "Status" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "passed",
                "doc_count" : 4,
                "Count" : {
                  "value" : 4
                }
              },
              {
                "key" : "failed",
                "doc_count" : 3,
                "Count" : {
                  "value" : 3
                }
              }
            ]
          }
        },
        {
          "key" : 1582675200000,
          "key_as_string" : "2020-02-26",
          "doc_count" : 4,
          "Status" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "passed",
                "doc_count" : 3,
                "Count" : {
                  "value" : 3
                }
              },
              {
                "key" : "failed",
                "doc_count" : 1,
                "Count" : {
                  "value" : 1
                }
              }
            ]
          }
        },
        {
          "key" : 1582761600000,
          "key_as_string" : "2020-02-27",
          "doc_count" : 1,
          "Status" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "failed",
                "doc_count" : 1,
                "Count" : {
                  "value" : 1
                }
              }
            ]
          }
        }
      ]
    }
  }
}

编辑 2. 使用排序按分数返回最高用户

GET testindex13/_search
{
  "size": 0,
  "aggs": {
    "Date": {
      "terms": {
        "field": "date"
      },
      "aggs": {
        "Status": {
          "terms": {
            "field": "status.keyword",
            "size": 10
          },
          "aggs": {
            "Users":{
              "terms": {
                "field": "user_id",
                "size": 10
              },
              "aggs": {
                "TopDocuments": {
                  "top_hits": {
                    "size": 1,
                    "sort": {"score":{"order" : "desc"}}
                  }
                }
              }
            },
            "Count": {
              "value_count": {
                "field": "user_id"
              }
            }
          }
        }
      }
    }
  }
}

结果:

 "aggregations" : {
    "Date" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 1582588800000,
          "key_as_string" : "2020-02-25",
          "doc_count" : 7,
          "Status" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "passed",
                "doc_count" : 4,
                "Count" : {
                  "value" : 4
                },
                "Users" : {
                  "doc_count_error_upper_bound" : 0,
                  "sum_other_doc_count" : 0,
                  "buckets" : [
                    {
                      "key" : 1,
                      "doc_count" : 2,
                      "TopDocuments" : {
                        "hits" : {
                          "total" : {
                            "value" : 2,
                            "relation" : "eq"
                          },
                          "max_score" : null,
                          "hits" : [
                            {
                              "_index" : "testindex13",
                              "_type" : "_doc",
                              "_id" : "9ntAhXABqQAB2FLojU0l",
                              "_score" : null,
                              "_source" : {
                                "user_id" : 1,
                                "score" : 10,
                                "status" : "passed",
                                "date" : "2020-02-25"
                              },
                              "sort" : [
                                10
                              ]
                            }
                          ]
                        }
                      }
                    },
                    {
                      "key" : 2,
                      "doc_count" : 1,
                      "TopDocuments" : {
                        "hits" : {
                          "total" : {
                            "value" : 1,
                            "relation" : "eq"
                          },
                          "max_score" : null,
                          "hits" : [
                            {
                              "_index" : "testindex13",
                              "_type" : "_doc",
                              "_id" : "-ntAhXABqQAB2FLokU2x",
                              "_score" : null,
                              "_source" : {
                                "user_id" : 2,
                                "score" : 6,
                                "status" : "passed",
                                "date" : "2020-02-25"
                              },
                              "sort" : [
                                6
                              ]
                            }
                          ]
                        }
                      }
                    },
                    {
                      "key" : 4,
                      "doc_count" : 1,
                      "TopDocuments" : {
                        "hits" : {
                          "total" : {
                            "value" : 1,
                            "relation" : "eq"
                          },
                          "max_score" : null,
                          "hits" : [
                            {
                              "_index" : "testindex13",
                              "_type" : "_doc",
                              "_id" : "_ntEhXABqQAB2FLoTU2P",
                              "_score" : null,
                              "_source" : {
                                "user_id" : 4,
                                "score" : 7,
                                "status" : "passed",
                                "date" : "2020-02-25"
                              },
                              "sort" : [
                                7
                              ]
                            }
                          ]
                        }
                      }
                    }
                  ]
                }
              },
              {
                "key" : "failed",
                "doc_count" : 3,
                "Count" : {
                  "value" : 3
                },
                "Users" : {
                  "doc_count_error_upper_bound" : 0,
                  "sum_other_doc_count" : 0,
                  "buckets" : [
                    {
                      "key" : 3,
                      "doc_count" : 2,
                      "TopDocuments" : {
                        "hits" : {
                          "total" : {
                            "value" : 2,
                            "relation" : "eq"
                          },
                          "max_score" : null,
                          "hits" : [
                            {
                              "_index" : "testindex13",
                              "_type" : "_doc",
                              "_id" : "_HtEhXABqQAB2FLoTE0W",
                              "_score" : null,
                              "_source" : {
                                "user_id" : 3,
                                "score" : 1,
                                "status" : "failed",
                                "date" : "2020-02-25"
                              },
                              "sort" : [
                                1
                              ]
                            }
                          ]
                        }
                      }
                    },
                    {
                      "key" : 5,
                      "doc_count" : 1,
                      "TopDocuments" : {
                        "hits" : {
                          "total" : {
                            "value" : 1,
                            "relation" : "eq"
                          },
                          "max_score" : null,
                          "hits" : [
                            {
                              "_index" : "testindex13",
                              "_type" : "_doc",
                              "_id" : "AHtEhXABqQAB2FLoT04D",
                              "_score" : null,
                              "_source" : {
                                "user_id" : 5,
                                "score" : 0,
                                "status" : "failed",
                                "date" : "2020-02-25"
                              },
                              "sort" : [
                                0
                              ]
                            }
                          ]
                        }
                      }
                    }
                  ]
                }
              }
            ]
          }
        },
        {
          "key" : 1582675200000,
          "key_as_string" : "2020-02-26",
          "doc_count" : 4,
          "Status" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "passed",
                "doc_count" : 3,
                "Count" : {
                  "value" : 3
                },
                "Users" : {
                  "doc_count_error_upper_bound" : 0,
                  "sum_other_doc_count" : 0,
                  "buckets" : [
                    {
                      "key" : 1,
                      "doc_count" : 2,
                      "TopDocuments" : {
                        "hits" : {
                          "total" : {
                            "value" : 2,
                            "relation" : "eq"
                          },
                          "max_score" : null,
                          "hits" : [
                            {
                              "_index" : "testindex13",
                              "_type" : "_doc",
                              "_id" : "9XtAhXABqQAB2FLojE2i",
                              "_score" : null,
                              "_source" : {
                                "user_id" : 1,
                                "score" : 10,
                                "status" : "passed",
                                "date" : "2020-02-26"
                              },
                              "sort" : [
                                10
                              ]
                            }
                          ]
                        }
                      }
                    },
                    {
                      "key" : 2,
                      "doc_count" : 1,
                      "TopDocuments" : {
                        "hits" : {
                          "total" : {
                            "value" : 1,
                            "relation" : "eq"
                          },
                          "max_score" : null,
                          "hits" : [
                            {
                              "_index" : "testindex13",
                              "_type" : "_doc",
                              "_id" : "-XtAhXABqQAB2FLokE0m",
                              "_score" : null,
                              "_source" : {
                                "user_id" : 2,
                                "score" : 5,
                                "status" : "passed",
                                "date" : "2020-02-26"
                              },
                              "sort" : [
                                5
                              ]
                            }
                          ]
                        }
                      }
                    }
                  ]
                }
              },
              {
                "key" : "failed",
                "doc_count" : 1,
                "Count" : {
                  "value" : 1
                },
                "Users" : {
                  "doc_count_error_upper_bound" : 0,
                  "sum_other_doc_count" : 0,
                  "buckets" : [
                    {
                      "key" : 4,
                      "doc_count" : 1,
                      "TopDocuments" : {
                        "hits" : {
                          "total" : {
                            "value" : 1,
                            "relation" : "eq"
                          },
                          "max_score" : null,
                          "hits" : [
                            {
                              "_index" : "testindex13",
                              "_type" : "_doc",
                              "_id" : "_3tEhXABqQAB2FLoTU3v",
                              "_score" : null,
                              "_source" : {
                                "user_id" : 4,
                                "score" : 2,
                                "status" : "failed",
                                "date" : "2020-02-26"
                              },
                              "sort" : [
                                2
                              ]
                            }
                          ]
                        }
                      }
                    }
                  ]
                }
              }
            ]
          }
        },
        {
          "key" : 1582761600000,
          "key_as_string" : "2020-02-27",
          "doc_count" : 1,
          "Status" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "failed",
                "doc_count" : 1,
                "Count" : {
                  "value" : 1
                },
                "Users" : {
                  "doc_count_error_upper_bound" : 0,
                  "sum_other_doc_count" : 0,
                  "buckets" : [
                    {
                      "key" : 1,
                      "doc_count" : 1,
                      "TopDocuments" : {
                        "hits" : {
                          "total" : {
                            "value" : 1,
                            "relation" : "eq"
                          },
                          "max_score" : null,
                          "hits" : [
                            {
                              "_index" : "testindex13",
                              "_type" : "_doc",
                              "_id" : "-HtAhXABqQAB2FLoj02H",
                              "_score" : null,
                              "_source" : {
                                "user_id" : 1,
                                "score" : 4,
                                "status" : "failed",
                                "date" : "2020-02-27"
                              },
                              "sort" : [
                                4
                              ]
                            }
                          ]
                        }
                      }
                    }
                  ]
                }
              }
            ]
          }
        }
      ]
    }
  }

推荐阅读