elasticsearch - Elasticsearch high CPU usage and query response time
Problem description
I have been building a search feature on top of Elasticsearch Service. The index mapping I use is:
curl -X PUT "localhost:9200/cast_tag_added" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "analysis": {
      "analyzer": {
        "analyzer_title": {
          "tokenizer": "tokenizer_title",
          "filter": [
            "lowercase",
            "asciifolding",
            "trim"
          ]
        }
      },
      "tokenizer": {
        "tokenizer_title": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 20,
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "name": {
          "type": "text",
          "index": "true",
          "analyzer": "analyzer_title",
          "search_analyzer": "analyzer_search"
        },
        "id": {
          "type": "long",
          "index": "false"
        },
        "ratings": {
          "type": "double",
          "index": "false"
        }
      }
    }
  }
}'
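To make the analyzer above concrete, here is a rough Python approximation of the terms that analyzer_title would index for a title. It is only a sketch: the function name edge_ngram_tokens is hypothetical, the accent stripping is a simplified stand-in for the asciifolding filter, and the real tokenization happens inside Elasticsearch.

```python
import re
import unicodedata


def edge_ngram_tokens(text, min_gram=2, max_gram=20):
    """Approximate the terms analyzer_title produces for one field value."""
    tokens = []
    # token_chars ["letter", "digit"]: any other character splits the input.
    for word in re.findall(r"[^\W_]+", text):
        # Edge n-grams are prefixes of each word, min_gram to max_gram characters long.
        for size in range(min_gram, min(len(word), max_gram) + 1):
            gram = word[:size].lower()  # "lowercase" filter
            # Simplified stand-in for "asciifolding": drop combining accents.
            decomposed = unicodedata.normalize("NFKD", gram)
            folded = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
            tokens.append(folded.strip())  # "trim" filter
    return tokens


print(edge_ngram_tokens("Die Hard"))
```

Under these assumptions a word of n characters contributes up to n - 1 indexed terms, so the term dictionary for name is noticeably larger than with a standard analyzer. That may be one reason a fuzzy match on this field is expensive, although the source does not confirm it.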
Sample stored data:
{
  "name": "Die Hard",
  "id": 12345
}
Example query:
{
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "should": [{
            "match": {
              "name": {
                "query": "die",
                "fuzziness": "AUTO",
                "operator": "and",
                "boost": 2
              }
            }
          }, {
            "match_phrase": {
              "name": {
                "query": "die",
                "boost": 4
              }
            }
          }]
        }
      },
      "field_value_factor": {
        "field": "ratings",
        "modifier": "log1p",
        "missing": 1
      }
    }
  },
  "explain": true
}
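As a reading aid for the field_value_factor part of the query, the following sketch reproduces the arithmetic in Python. It assumes the default factor of 1 and the default boost_mode of multiply, since the query sets neither. The function names are hypothetical, and the real scoring is done by Elasticsearch.

```python
import math


def field_value_factor_log1p(value, factor=1.0, missing=1.0):
    """log1p modifier: common logarithm of (1 + factor * value)."""
    if value is None:
        # "missing" replaces an absent field; the modifier still applies to it.
        value = missing
    return math.log10(1.0 + factor * value)


def function_score(query_score, ratings):
    """Combine the bool query score with the ratings factor (boost_mode: multiply)."""
    return query_score * field_value_factor_log1p(ratings)


# A document with ratings 9.0 keeps its query score, because log10(1 + 9) = 1.
print(function_score(2.0, 9.0))
# A document without ratings is scaled by log10(2).
print(function_score(2.0, None))
```

One consequence worth noting: under these assumptions a document whose ratings value is 0 gets a final score of 0, because log10(1 + 0) = 0.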
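However, while testing I found that query throughput is too low (around 24 per second) and query response latency is too high (around 8 seconds on average). I added these settings to capture queries in the slow log: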
{
"index.search.slowlog.threshold.query.warn": "0ms",
"index.search.slowlog.threshold.query.info": "0ms",
"index.search.slowlog.threshold.query.debug": "0ms",
"index.search.slowlog.threshold.query.trace": "0ms",
"index.search.slowlog.threshold.fetch.warn": "0ms",
"index.search.slowlog.threshold.fetch.info": "0ms",
"index.search.slowlog.threshold.fetch.debug": "0ms",
"index.search.slowlog.threshold.fetch.trace": "0ms"
}
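This gave me the response time of every query that ran. The response times I see for each query in the my-application_index_search_slowlog.log file are all very short, ranging from a few microseconds to 1 or 2 milliseconds across all queries. For example: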
[2019-07-16T05:24:06,264][WARN ][index.search.slowlog.query] [node-1] [cast_tag_added][1] took[892micros], took_millis[0], total_hits[676], types[], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[<the search query goes here>], id[],
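To look at these timings in bulk rather than line by line, a small parser along the following lines could extract the fields from each slow log entry. The regular expression is written against the single sample line above, so the pattern and the name parse_slowlog_line are assumptions, and other Elasticsearch versions may format entries differently.

```python
import re

# Pattern derived from the one sample line in this question (an assumption).
SLOWLOG_PATTERN = re.compile(
    r"\[(?P<timestamp>[^\]]+)\]"
    r"\[(?P<level>\w+)\s*\]"
    r"\[(?P<logger>[^\]]+)\]\s*"
    r"\[(?P<node>[^\]]+)\]\s*"
    r"\[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\]\s*"
    r"took\[(?P<took>[^\]]+)\],\s*"
    r"took_millis\[(?P<took_millis>\d+)\],\s*"
    r"total_hits\[(?P<total_hits>\d+)\]"
)

SAMPLE_LINE = (
    "[2019-07-16T05:24:06,264][WARN ][index.search.slowlog.query] [node-1] "
    "[cast_tag_added][1] took[892micros], took_millis[0], total_hits[676], "
    "types[], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], "
    "source[<the search query goes here>], id[],"
)


def parse_slowlog_line(line):
    """Return the parsed fields of one slow log entry, or None if it does not match."""
    match = SLOWLOG_PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Numeric fields are converted so they can be aggregated later.
    for key in ("shard", "took_millis", "total_hits"):
        fields[key] = int(fields[key])
    return fields


print(parse_slowlog_line(SAMPLE_LINE))
```

Note that the entry carries a shard number ([1] here): each line describes the work done on one shard, not the whole request as the client sees it.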
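Even though the response times shown in the log are short, I cannot figure out why the query response latency is so high. I also noticed that CPU utilization on the Elasticsearch server spikes to 100% a few seconds after the test starts.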
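One way to narrow down such a gap is to compare three numbers for the same request: the wall-clock time measured by the client, the took value in the search response (milliseconds as measured on the coordinating node), and the per-shard time in the slow log. The sketch below covers the first two. The callable send_search is a hypothetical placeholder for whatever client actually sends the request, and this is a diagnostic aid under those assumptions, not a confirmed explanation of the problem.

```python
import time


def timed_search(send_search, body):
    """Time one search from the client side and compare it with the response's took."""
    start = time.perf_counter()
    response = send_search(body)  # hypothetical callable returning the parsed JSON response
    client_ms = (time.perf_counter() - start) * 1000.0
    took_ms = response.get("took", 0)  # "took" is reported in milliseconds
    return {
        "client_ms": client_ms,
        "took_ms": took_ms,
        # Time not accounted for by Elasticsearch: network, serialization, client code.
        "outside_es_ms": client_ms - took_ms,
    }


def fake_send(body):
    """Stand-in sender for demonstration: waits 50 ms and reports took = 3 ms."""
    time.sleep(0.05)
    return {"took": 3}


print(timed_search(fake_send, {"query": {"match_all": {}}}))
```

If took is close to the client time but far above the slow log values, the time is probably being spent inside the cluster but outside per-shard execution, for example waiting in the search thread pool queue. That would be consistent with a CPU pinned at 100%. If took is small and the client time is large, the load generator or the network is the more likely suspect.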
Solution