ElasticSearch - Query documents by term and field priority

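Problem description

I am currently working with Elasticsearch and am trying to implement a query from my Java backend that retrieves documents from my index not only by term but also by field priority. Each document in my index has a term and a field specifying its type.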

e.g.
term: "Flu Shot"
type: "procedure"

term: "Fluphenazine"
type: "drug"

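I have created a query that searches by term, and the index returns the most relevant results matching that term. What I want to build is a query that returns results matching the same term but ordered by the priority of the "type" field. For example, when I type "flu", I want to get the documents with type "procedure" first and then the documents with type "drug". Currently, because many drugs start with "flu", the index only returns documents with type "drug".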

Tags: elasticsearch, elastic-stack

Solution


You can use function_score.

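function_score allows you to modify the score of the documents retrieved by a query. To use function_score, you define a query and one or more functions that compute a new score for each document the query returns.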
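To make the scoring concrete, here is a minimal Java sketch of the arithmetic, assuming function_score is left at its defaults (score_mode and boost_mode both multiply) and that only weight functions with a type filter are used. The class WeightScoreSketch and the method rescore are hypothetical names for illustration only; they are not part of Elasticsearch or its clients.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that mimics function_score with weight functions,
// assuming the defaults score_mode = multiply and boost_mode = multiply.
class WeightScoreSketch {

    // Multiplies the query score by the weight of every function whose filter
    // (modeled here as an exact match on the document's type) applies.
    // If no filter matches, the query score is returned unchanged.
    static double rescore(double queryScore, String docType, Map<String, Double> weightByType) {
        double functionScore = 1.0;
        for (Map.Entry<String, Double> e : weightByType.entrySet()) {
            if (e.getKey().equals(docType)) {
                functionScore *= e.getValue();
            }
        }
        return queryScore * functionScore;
    }

    public static void main(String[] args) {
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("procedure", 2.0);
        weights.put("drug", 1.0);

        // Assumption: the inner query gives every hit the same score of 1.0,
        // which is consistent with the scores in the result shown in the example.
        System.out.println(rescore(1.0, "procedure", weights)); // prints 2.0
        System.out.println(rescore(1.0, "drug", weights));      // prints 1.0
    }
}
```

With these settings, a higher weight for one type pushes all of its documents above those of a lower-weighted type whenever the underlying query scores are equal.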

Example with your data (using Elasticsearch server 7.9):

  1. Create the index and add documents

     PUT /example_index
     {
       "mappings": {
         "properties": {
           "term": {"type": "text" },
           "type": {"type": "keyword"}
         }
       }
     }
    
     PUT /_bulk
     {"create": {"_index": "example_index", "_id": 1}}
     {"term": "Flu Shot", "type": "procedure"}
     {"create": {"_index": "example_index", "_id": 2}}
     {"term": "Fluphenazine", "type": "drug"}
     {"create": {"_index": "example_index", "_id": 3}}
     {"term": "Flu Shot2", "type": "procedure"}
     {"create": {"_index": "example_index", "_id": 4}}
     {"term": "Fluphenazine2", "type": "drug"}
    
  2. Query the documents with custom scoring logic

     GET /example_index/_search
     {
       "query": {
         "function_score": {
           "query": {
             "wildcard": {
               "term": {
                 "value": "*flu*"
               }
             }
           },
           "functions": [
             {
               "filter": {
                 "term": {
                   "type": "procedure"
                 }
               },
               "weight": 2
             },
             {
               "filter": {
                 "term": {
                   "type": "drug"
                 }
               },
               "weight": 1
             }
           ]
         }
       }
     }
    
  3. Result:

     {
       "took" : 2,
       "timed_out" : false,
       "_shards" : {
         "total" : 1,
         "successful" : 1,
         "skipped" : 0,
         "failed" : 0
       },
       "hits" : {
         "total" : {
           "value" : 4,
           "relation" : "eq"
         },
         "max_score" : 2.0,
         "hits" : [
           {
             "_index" : "example_index",
             "_type" : "_doc",
             "_id" : "1",
             "_score" : 2.0,
             "_source" : {
               "term" : "Flu Shot",
               "type" : "procedure"
             }
           },
           {
             "_index" : "example_index",
             "_type" : "_doc",
             "_id" : "3",
             "_score" : 2.0,
             "_source" : {
               "term" : "Flu Shot2",
               "type" : "procedure"
             }
           },
           {
             "_index" : "example_index",
             "_type" : "_doc",
             "_id" : "2",
             "_score" : 1.0,
             "_source" : {
               "term" : "Fluphenazine",
               "type" : "drug"
             }
           },
           {
             "_index" : "example_index",
             "_type" : "_doc",
             "_id" : "4",
             "_score" : 1.0,
             "_source" : {
               "term" : "Fluphenazine2",
               "type" : "drug"
             }
           }
         ]
       }
     }
    

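You can see that documents with type set to procedure have a higher score than documents with type set to drug. This is because we assigned different weights to the different type values in function_score.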
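Since the question mentions a Java backend, the query from step 2 could be sent from Java roughly as follows. This is only a sketch under stated assumptions: it uses the JDK's built-in java.net.http client (Java 11+) rather than an Elasticsearch client library, it assumes an unsecured node at http://localhost:9200, and TypePrioritySearch, buildBody, and jsonString are hypothetical names. It does not escape wildcard characters (* and ?) in the search text.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds the function_score body from step 2
// and posts it with the JDK HTTP client.
class TypePrioritySearch {

    // Escapes backslashes and double quotes so the value can sit inside a JSON string.
    // Control characters are not handled in this sketch.
    static String jsonString(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds a function_score query: a wildcard match on "term"
    // plus one weight function per type, in map iteration order.
    static String buildBody(String text, Map<String, Integer> weightByType) {
        StringBuilder functions = new StringBuilder();
        for (Map.Entry<String, Integer> e : weightByType.entrySet()) {
            if (functions.length() > 0) {
                functions.append(",");
            }
            functions.append("{\"filter\":{\"term\":{\"type\":")
                     .append(jsonString(e.getKey()))
                     .append("}},\"weight\":").append(e.getValue()).append("}");
        }
        String pattern = "*" + text + "*";
        return "{\"query\":{\"function_score\":{"
             + "\"query\":{\"wildcard\":{\"term\":{\"value\":" + jsonString(pattern) + "}}},"
             + "\"functions\":[" + functions + "]}}}";
    }

    public static void main(String[] args) throws Exception {
        Map<String, Integer> weights = new LinkedHashMap<>();
        weights.put("procedure", 2); // higher weight ranks first
        weights.put("drug", 1);
        String body = buildBody("flu", weights);

        // Assumption: an unsecured node listening on localhost:9200.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/example_index/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```

Keeping the weights in a map means the type priority can be changed without touching the query-building code; a LinkedHashMap is used only so the functions appear in a predictable order.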

