首页 > 解决方案 > Elasticsearch 按自定义项目重量排序

问题描述

我已经存储了包含状态属性的文档。我想按状态优先级(不是按字母顺序)对文档进行排序。我已经按照以前的答案编写了以下功能,但仍然无法按预期工作;文档按状态名称排序(按字母顺序):

function getESSortingByStatusQuery(query, order) {
        let statusOrder = ['BLUE', 'RED', 'BLACK', 'YELLOW', 'GREEN'];
        if(order == 'desc'){
            statusOrder.reverse();
        }
        const functions = statusOrder.map((item) => {
            const idx = statusOrder.indexOf(item);
            return {filter: {match: {statusColor: item}},
                weight: (idx + 1) * 50}
        });
        const queryModified = {
            "function_score": {
                "query": {"match_all": {}}, // this is for testing purposes and should be replaced with original query
                "boost": "5",
                "functions": functions,
                "score_mode": "multiply",
                "boost_mode": "replace"
            }
        }
        return queryModified;
    }

如果有人建议根据属性的预定义优先级(在本例中为状态)对项目进行排序的方法,我将不胜感激。

标签: javascriptelasticsearch

解决方案


以下是我认为您正在寻找的示例自定义排序脚本。我添加了示例映射、文档、查询和响应,就像它的显示方式一样。

映射:

PUT color_index
{
  "mappings": {
    "properties": {
      "color":{
        "type": "keyword"
      },
      "product":{
        "type": "text"
      }
    }
  }
}

样本文件:

POST color_index/_doc/1
{
  "color": "BLUE",
  "product": "adidas and nike"
}

POST color_index/_doc/2
{
  "color": "GREEN",
  "product": "adidas and nike and puma"
}

POST color_index/_doc/3
{
  "color": "GREEN",
  "product": "adidas and nike"
}

POST color_index/_doc/4
{
  "color": "RED",
  "product": "nike"
}

POST color_index/_doc/5
{
  "color": "RED",
  "product": "adidas and nike"
}

询问:

POST color_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "default_field": "*",
            "query": "adidas OR nike"
          }
        }
      ]
    }
  },
  "sort": [
    { "_score": { "order": "desc"} },          <---- First sort by score
    { "_script": {                             <---- Second sort by Colors
            "type": "number",
            "script": {
                "lang": "painless",
                "source": "if(params.scores.containsKey(doc['color'].value)) { return params.scores[doc['color'].value];} return 100000;",
                "params": {
                    "scores": {
                        "BLUE": 0,
                        "RED": 1,
                        "BLACK": 2,
                        "YELLOW": 3,
                        "GREEN": 4
                    }
                }
            },
            "order": "asc"
        }

    }
  ]
}

首先它将返回按其分数排序的文档,然后将第二个排序逻辑应用于该结果。

对于第二次排序,即使用脚本排序,请注意我是如何将数值添加到该scores部分的颜色中的。您需要相应地构建您的查询。

它如何工作的逻辑在source我认为是不言自明的部分中,我使用了doc['color'].value,因为那是我应用自定义排序逻辑的字段。

回复:

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 5,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "color_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.5159407,
        "_source" : {
          "color" : "BLUE",
          "product" : "adidas and nike"
        },
        "sort" : [
          0.5159407,                     <--- This value is score(desc by nature)
          0.0                            <--- This value comes from script sort as its BLUE and I've used value 0 in the script which is in 'asc' order
        ]
      },
      {
        "_index" : "color_index",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 0.5159407,
        "_source" : {
          "color" : "RED",
          "product" : "adidas and nike"
        },
        "sort" : [
          0.5159407,
          1.0
        ]
      },
      {
        "_index" : "color_index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.5159407,
        "_source" : {
          "color" : "GREEN",
          "product" : "adidas and nike"
        },
        "sort" : [
          0.5159407,
          4.0
        ]
      },
      {
        "_index" : "color_index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.40538198,
        "_source" : {
          "color" : "GREEN",
          "product" : "adidas and nike and puma"
        },
        "sort" : [
          0.40538198,
          4.0
        ]
      },
      {
        "_index" : "color_index",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 0.10189847,
        "_source" : {
          "color" : "RED",
          "product" : "nike"
        },
        "sort" : [
          0.10189847,
          1.0
        ]
      }
    ]
  }
}

注意前三个文档,它具有精确的值product但不同color,你可以看到它们被分组在一起,因为我们首先排序,_score然后我们排序color

让我知道这是否有帮助!


推荐阅读