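elasticsearch - Elasticsearch query aggregation per hour of @timestamp
Problem description
I am querying Metricbeat data in Elasticsearch to find the most-used processes for each hour. At the moment I aggregate on each process's start time and process name, and I need to split these groups per hour using the "@timestamp" field.
This is my current query: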
GET metricbeat*/_search
{
  "query": {
    "bool": {
      "must": [
        { "wildcard": { "beat.hostname": "ibmcx*" } },
        { "range": {
            "@timestamp": {
              "gte": "2019-03-22T00:00:00",
              "lte": "2019-03-23T00:00:00"
            }
        } },
        { "terms": { "beat.hostname": [
            "ibmcxapp101", "ibmcxapp102", "ibmcxapp103",
            "ibmcxapp104", "ibmcxapp105", "ibmcxapp106", "ibmcxapp107",
            "ibmcxapp108", "ibmcxapp109", "ibmcxapp110", "ibmcxapp111",
            "ibmcxapp112", "ibmcxapp113", "ibmcxapp114", "ibmcxapp115",
            "ibmcxapp116", "ibmcxapp117", "ibmcxapp118", "ibmcxapp119",
            "ibmcxapp120", "ibmcxapp121", "ibmcxapp122", "ibmcxxaa100",
            "ibmcxxaa101", "ibmcxxaa102", "ibmcxxaa103", "ibmcxxaa104",
            "ibmcxxaa105", "ibmcxxaa106", "ibmcxxaa107", "ibmcxxaa108",
            "ibmcxxaa109", "ibmcxxaa110", "ibmcxxaa111", "ibmcxxaa112",
            "ibmcxxaa201", "ibmcxxaa202", "ibmcxxaa203", "ibmcxxaa204"
        ] } },
        { "exists": { "field": "system.process.cmdline" } }
      ],
      "must_not": [
        { "term": { "system.process.username": "NT AUTHORITY\\SYSTEM" } },
        { "term": { "system.process.username": "NT AUTHORITY\\NETWORK SERVICE" } },
        { "term": { "system.process.username": "NT AUTHORITY\\LOCAL SERVICE" } },
        { "term": { "system.process.username": "NT AUTHORITY\\Servicio de red" } },
        { "term": { "system.process.username": "" } }
      ]
    }
  },
  "size": 0,
  "aggs": {
    "group_by_start_time": {
      "terms": {
        "field": "system.process.cpu.start_time"
      },
      "aggs": {
        "group_by_name": {
          "terms": {
            "field": "system.process.name.keyword"
          }
        }
      }
    }
  },
  "sort": [
    { "system.process.cpu.start_time": { "order": "asc" } },
    { "@timestamp": { "order": "asc" } },
    { "system.process.pid": { "order": "desc" } }
  ]
}
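Solution
This is a bit hard to follow and reproduce; a minimal example (I don't think the whole query is really needed) and sample documents would go a long way.
If you want an hourly aggregation, make the hourly aggregation the outermost one and run the other aggregations inside it.
A minimal example of an hourly aggregation: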
POST /metricbeat*/_search?size=0
{
  "aggs": {
    "metrics_per_hour": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "hour"
      }
    }
  }
}
Nesting another aggregation inside it looks like this:
POST /metricbeat*/_search?size=0
{
  "aggs": {
    "metrics_per_hour": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "hour"
      },
      "aggs": {
        ...
      }
    }
  }
}
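For the query in the question, the inner part could simply be the two terms aggregations it already uses. The following is only a sketch under a few assumptions: the field names are taken from the question, the aggregation names are arbitrary, the "query" section from the question is left out for brevity and would stay unchanged, and "interval" is the parameter used in the examples above (from Elasticsearch 7.2 on it is deprecated in favour of "calendar_interval"). It would be sent as the body of POST /metricbeat*/_search:

```json
{
  "size": 0,
  "aggs": {
    "metrics_per_hour": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "hour"
      },
      "aggs": {
        "group_by_start_time": {
          "terms": {
            "field": "system.process.cpu.start_time"
          },
          "aggs": {
            "group_by_name": {
              "terms": {
                "field": "system.process.name.keyword"
              }
            }
          }
        }
      }
    }
  }
}
```

Each hourly bucket in the response then contains its own start-time buckets, and each of those contains the process-name buckets with their doc_count. Note that a terms aggregation returns only the top 10 buckets by default; set its "size" parameter if more are needed.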
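PS: If you are using daily indices, you can target the index for the right date instead of using the wildcard, and then skip this part of the query:
"range": {
  "@timestamp": {
    "gte": "2019-03-22T00:00:00",
    "lte": "2019-03-23T00:00:00"
  }
}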
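As an illustration of the daily-index variant, the request below drops the range clause and relies on the index name to select the day. The index name is an assumption: it presumes Metricbeat's default daily naming scheme (metricbeat-<version>-yyyy.MM.dd), so the request would go to something like POST /metricbeat-*-2019.03.22/_search; adjust the pattern to whatever your indices are actually called. Only one of the question's filters is kept to keep the sketch short:

```json
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        { "exists": { "field": "system.process.cmdline" } }
      ]
    }
  },
  "aggs": {
    "metrics_per_hour": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "hour"
      }
    }
  }
}
```

Daily indices are normally cut on UTC dates, so this selects the UTC day; if the range in the original query was meant in another time zone, the two approaches will not cover exactly the same documents.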