elasticsearch - How do I run a top_hits aggregation on the latest record of each unique ID per day under a date_histogram aggregation?
Problem description
I have a set of 10K ids and documents of the following shape (the real index holds 500K documents, so I have simplified it here):
{"id":"Peter","sales":12679, "timestamp": "2021-04-22 13:03:46.972"}
{"id":"Peter","sales":12375, "timestamp": "2021-04-21 13:03:46.972"}
{"id":"Peter","sales":32124, "timestamp": "2021-04-20 17:03:46.972"}
{"id":"Peter","sales":12472, "timestamp": "2021-04-20 13:03:46.972"}
{"id":"Peter","sales":42679, "timestamp": "2021-04-18 14:03:46.972"}
{"id":"Peter","sales":12379, "timestamp": "2021-04-18 13:03:46.972"}
....
{"id":"John","sales":2256679, "timestamp": "2021-04-2 13:03:46.972"}
{"id":"John","sales":752375, "timestamp": "2021-04-1 13:03:46.972"}
{"id":"John","sales":85124, "timestamp": "2021-04-10 17:03:46.972"}
{"id":"John","sales":1472, "timestamp": "2021-04-10 13:03:46.972"}
{"id":"John","sales":4279, "timestamp": "2021-04-18 14:03:46.972"}
{"id":"John","sales":2379, "timestamp": "2021-04-18 13:03:46.972"}
....
I want to write a query that does the following:
- find the latest record of each id for every day, and
- count, for each id, the number of days on which it has no documents (no doc_count).
I am using Date Histogram + Top Hits agg + uniqueness to find each id's latest sales for the day, and I also want to check whether any id has no doc count on any day. I tried many queries, but none returns the result I want,
for example:
{
  "size": 0,
  "sort": {"timestamp": "desc"},
  "query": {
    "bool": {
      "must": {
        "terms": {
          "id": ["Peter", "John"]
        }
      }
    }
  },
  "aggs": {
    "sales_over_time": {
      "date_histogram": {
        "field": "timestamp",
        "calendar_interval": "1d"
      },
      "aggs": {
        "id": {
          "terms": {
            "field": "id.keyword"
          }
        }
      }
    }
  }
}
which returns something like:
{
  "key": 1615852800000,
  "doc_count": 6,
  "id": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "624232532",
        "doc_count": 4
      },
      {
        "key": "656625970",
        "doc_count": 2
      }
    ]
  }
},
After this query, I still need to check whether any of the ids has no doc_count on a given day.
For the other part, getting the latest record of each ID for the day, I tried:
{
  "size": 0,
  "sort": {"timestamp": "desc"},
  "query": {
    "bool": {
      "must": {
        "terms": {
          "oneNetDevieId": [656625970, 624232532, 624232499]
        }
      }
    }
  },
  "aggs": {
    "sales_over_time": {
      "date_histogram": {
        "field": "timestamp",
        "calendar_interval": "1d"
      },
      "aggs": {
        "name": {
          "terms": { "field": "oneNetDevieId.keyword" },
          "aggs": {
            "latest_comment": {
              "top_hits": {
                "sort": [ {"timestamp": { "order": "desc" } } ],
                "size": 1
              }
            }
          }
        }
      }
    }
  }
}
How can I compute the sum for each ID for each day?
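One way the two attempts above might be combined is sketched below in Python, as a starting point rather than a definitive answer. It builds a single search body: one bucket per day, one sub-bucket per id, and inside each id bucket a `top_hits` of size 1 sorted by `timestamp` descending (the latest record) next to a `sum` on `sales`. The function name `build_daily_latest_query` and the aggregation names `per_id`, `latest`, and `total_sales` are hypothetical. The field names `id.keyword`, `timestamp`, and `sales` come from the sample documents and queries above. Reading "sum" as the total of `sales` per id per day is an assumption, as is filtering on `id.keyword` instead of `id`.

```python
import json


def build_daily_latest_query(ids, id_field="id.keyword",
                             time_field="timestamp", value_field="sales"):
    """Build a search body: day buckets -> id buckets -> latest hit + sum.

    Assumption: id_field is a keyword field, so exact values such as
    "Peter" match in both the terms query and the terms aggregation.
    """
    ids = list(ids)
    return {
        "size": 0,  # only aggregations are needed, no top-level hits
        "query": {"terms": {id_field: ids}},
        "aggs": {
            "sales_over_time": {
                "date_histogram": {
                    "field": time_field,
                    "calendar_interval": "1d",
                    # keep days without documents as empty buckets
                    "min_doc_count": 0,
                },
                "aggs": {
                    "per_id": {
                        # ask for enough buckets to cover every requested id
                        "terms": {"field": id_field, "size": max(1, len(ids))},
                        "aggs": {
                            # latest record of this id on this day
                            "latest": {
                                "top_hits": {
                                    "size": 1,
                                    "sort": [{time_field: {"order": "desc"}}],
                                }
                            },
                            # total sales of this id on this day
                            "total_sales": {"sum": {"field": value_field}},
                        },
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_daily_latest_query(["Peter", "John"]), indent=2))
```

The printed JSON can be sent as the body of a `_search` request. With 10K ids and many days the number of buckets grows quickly, so narrowing the time range with a `range` filter may be necessary in practice.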
Solution
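For the "no doc_count on a given day" check, one possible approach is to count on the client side, sketched here in Python under stated assumptions. The function takes the list of day buckets returned by a `date_histogram` that has a `terms` sub-aggregation on the id, and counts for each id the days on which it does not appear. The function name `count_missing_days` is hypothetical. The first example bucket is the response fragment from the question; the second, empty bucket is made up for illustration.

```python
def count_missing_days(day_buckets, ids, agg_name="id"):
    """For each id, count the day buckets in which it has no documents.

    day_buckets: the "buckets" list of the date_histogram aggregation.
    agg_name: name of the terms sub-aggregation inside each day bucket.
    """
    missing = {i: 0 for i in ids}
    for day in day_buckets:
        # ids that have at least one document on this day
        present = {
            b["key"]
            for b in day.get(agg_name, {}).get("buckets", [])
            if b["doc_count"] > 0
        }
        for i in ids:
            if i not in present:
                missing[i] += 1
    return missing


example_buckets = [
    {   # day bucket copied from the response fragment in the question
        "key": 1615852800000,
        "doc_count": 6,
        "id": {"buckets": [
            {"key": "624232532", "doc_count": 4},
            {"key": "656625970", "doc_count": 2},
        ]},
    },
    {   # hypothetical following day without any documents
        "key": 1615939200000,
        "doc_count": 0,
        "id": {"buckets": []},
    },
]
ids_of_interest = ["624232532", "656625970", "624232499"]

result = count_missing_days(example_buckets, ids_of_interest)
print(result)
```

This only works if every day of interest is present as a bucket. Days without any documents are returned only when the histogram emits empty buckets (`min_doc_count` of 0), and days before the first or after the last document need `extended_bounds`. The `terms` sub-aggregation also needs a `size` large enough to return every id, otherwise an id cut off by the default size would be counted as missing.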