Elasticsearch query with a filter takes more time than the query without a filter. Why?

Problem description

I am using Elasticsearch version 7.6.1.

The query with the filter is:

GET mark13/_search
{
    "explain": false,
    "from": 0,
    "size": 500,
    "track_scores": true,
    "stored_fields": [
        "_source"
    ],
    "sort": {
        "_script": {
            "type": "number",
            "script": {
                "id": "sorting_algo",
                "params": {
                    "query": "abhinav keshri"
                }
            },
            "order": "desc"
        }
    },
    "script_fields": {
        "poca_score": {
            "script": {
                "id": "field",
                "params": {
                    "query": "abhinav keshri"
                }
            }
        }
    },

    "query": {
        "bool": {
            "filter": [
              {
                "term": {
                  "class" : "42"
                }
              }

            ], 
            "should": [
                {
                    "match": {
                        "applied_for": {
                            "query": "abhinav keshri",
                            "boost": 118,
                            "fuzziness": 0
                        }
                    }
                },
...
...
...

The output of the above query is:

{
  "took" : 45414,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : 4730.905,
    "hits" : [
      {
...
...
...

The execution time is 45414 ms.

The query without the filter is:

GET mark13/_search
{
    "explain": false,
    "from": 0,
    "size": 500,
    "track_scores": true,
    "stored_fields": [
        "_source"
    ],
    "sort": {
        "_script": {
            "type": "number",
            "script": {
                "id": "sorting_algo",
                "params": {
                    "query": "abhinav keshri"
                }
            },
            "order": "desc"
        }
    },
    "script_fields": {
        "poca_score": {
            "script": {
                "id": "script",
                "params": {
                    "query": "abhinav keshri"
                }
            }
        }
    },

    "query": {
        "bool": {
            "should": [
                {
                    "match": {
                        "applied_for": {
                            "query": "abhinav keshri",
                            "boost": 118,
                            "fuzziness": 0
                        }
                    }
                },
...
...
...

The output of the above query is:

{
  "took" : 7104,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 9920,
      "relation" : "eq"
    },
    "max_score" : 4730.905,
    "hits" : [
      {

The execution time is 7104 ms.

I expected the filtered query to take less time than the unfiltered one, because the bool query would be applied to fewer documents.

I also tried writing the filter query in a different format from the one given above. The following format gave me the same result:

{
    "from": 0,
    "size": 50,
    "track_scores": true,
    "stored_fields": [
        "_source"
    ],
    "sort": {
        "_script": {
            "type": "number",
            "script": {
                "id": "sorting_algo",
                "params": {
                    "query": "abhinav keshri"
                }
            }
        }
    },
    "script_fields": {
        "poca_score": {
            "script": {
                "id": "script",
                "params": {
                    "query": "abhinav keshri"
                }
            }
        }
    },
    "query": {
        "bool": {
            "should": [

            ],
            "filter": [
                {
                    "bool": {
                        "should": [
                            {
                                ...
                                ...
                            }
                        ],
                        "must": {
                            "bool": {
                                "should": [
                                    {
                                        "terms": {
                                            "class": [
                                                "42"
                                            ]
                                        }
                                    },
                                    ...
                                    ...

                                ]
                            }
                        }
                    }
                }
            ]
        }
    }
}

Question: why does the filtered query take longer than the unfiltered query?

Tags: elasticsearch, elasticsearch-5, elasticsearch-dsl

Solution
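
One possible explanation, offered as a hypothesis rather than a confirmed diagnosis: the two responses above report different hit counts. The filtered query returns "value": 10000 with "relation": "gte", while the unfiltered query returns exactly 9920. That suggests the filtered query matches more documents, not fewer. In a bool query, minimum_should_match defaults to 1 only when there are should clauses and no must or filter clauses; once a filter is present the default drops to 0, so the should clauses become optional and every document with class 42 matches. The script-based sort then has to run for each of those documents. The sketch below sets minimum_should_match explicitly so that the filter narrows the result set. It is abbreviated for illustration: only the one should clause shown in the question is included, and the sort and script_fields sections are left out.

```
# Sketch only: remaining should clauses, sort and script_fields are omitted here
GET mark13/_search
{
    "size": 500,
    "query": {
        "bool": {
            "filter": [
                {
                    "term": {
                        "class": "42"
                    }
                }
            ],
            "should": [
                {
                    "match": {
                        "applied_for": {
                            "query": "abhinav keshri",
                            "boost": 118,
                            "fuzziness": 0
                        }
                    }
                }
            ],
            "minimum_should_match": 1
        }
    }
}
```

If this is the cause, hits.total for the filtered query should fall to at most the unfiltered count of 9920. Adding "track_total_hits": true to both requests would give exact totals to compare instead of the capped 10000.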

