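elasticsearch - Elasticsearch search bool + must query
Question
Can someone tell me why this Elasticsearch query returns the results below? The query has a bool + must clause, which should match only when the field nn exactly matches the string "softo". The query looks like this: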
"query":{
"bool":{
"must":[
{"match":{"nn":"softo"}}
],
"should":[
{"match":{"nn":"sro"}},
{"match":{"nn":"as"}},
{"match":{"nn":"no"}},
{"match":{"nn":"vos"}},
{"match":{"nn":"ks"}}
]
}
}
It returns results that do not contain "softo" in the nn field, for example:
{
"_index": "search_2",
"_type": "doc",
"_id": "17053188",
"_score": 129.76167,
"_source": {
"nn": "zo soz kovo zts nova as zts elektronika as",
"nazov": "ZO SOZ KOVO,ZŤS NOVA a.s.,ZTS ELEKTRONIKA a.s.",
}
},
{
"_index": "search_2",
"_type": "doc",
"_id": "45732078",
"_score": 126.953285,
"_source": {
"nn": "agentura socialnych sluzieb ass no",
"nazov": "Agentúra sociálnych služieb - ASS n.o.",
}
}
I don't understand why it returns results like "zo soz kovo zts nova as zts elektronika as", which do not contain the string "softo". The mapping for the nn field looks like this:
{
"search_2": {
"aliases": {
"search": {}
},
"mappings": {
"doc": {
"dynamic": "strict",
"properties": {
"nn": {
"type": "text",
"boost": 10,
"analyzer": "autocomplete"
}
}
}
},
"settings": {
"index": {
"refresh_interval": "-1",
"number_of_shards": "4",
"provided_name": "search_2",
"creation_date": "1539693645683",
"analysis": {
"filter": {
"synonym_filter": {
"ignore_case": "true",
"type": "synonym",
"synonyms_path": "synonyms/sk_SK.txt"
},
"lemmagen_filter_sk": {
"type": "lemmagen",
"lexicon": "sk"
},
"stopwords_SK": {
"ignore_case": "true",
"type": "stop",
"stopwords_path": "stopwords/slovak.txt"
},
"remove_duplicities": {
"type": "unique",
"only_on_same_position": "true"
},
"autocomplete_filter": {
"type": "edge_ngram",
"min_gram": "2",
"max_gram": "20"
}
},
"analyzer": {
"autocomplete": {
"filter": [
"stopwords_SK",
"lowercase",
"stopwords_SK",
"autocomplete_filter"
],
"type": "custom",
"tokenizer": "standard"
},
"lower_ascii": {
"filter": [
"lowercase",
"asciifolding"
],
"type": "custom",
"tokenizer": "standard"
},
"suggestion": {
"filter": [
"stopwords_SK",
"lowercase",
"stopwords_SK",
"asciifolding"
],
"type": "custom",
"tokenizer": "standard"
}
}
},
"number_of_replicas": "1",
"uuid": "eyxXza0pQxWeQCpXih8ngg",
"version": {
"created": "6020399"
}
}
}
}
}
Answer
You are getting these results because the autocomplete analyzer is applied to the field nn.
I'll explain using the following field value:
"nn": "zo soz kovo zts nova as zts elektronika as"
The tokens generated for the above value would be:
zo, so, soz, ko, kov, kovo, zt, zts, no, nov, nova, as, zt, zts, el, ele, elek, elekt, elektr, elektro, elektron, elektroni, elektronik, elektronika, as
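The token list can be reproduced with a small simulation of the edge_ngram filter (min_gram 2, max_gram 20). This is only a sketch: the helper names `edge_ngrams` and `analyze` are hypothetical, a whitespace split stands in for the standard tokenizer, and the stopwords_SK filter is skipped because the contents of stopwords/slovak.txt are not shown in the question.

```python
def edge_ngrams(token, min_gram=2, max_gram=20):
    """Return the prefixes of `token` that are min_gram..max_gram characters long."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def analyze(text):
    """Rough stand-in for the 'autocomplete' analyzer (assumption: no stopwords removed)."""
    tokens = []
    # Whitespace split + lowercase approximates the standard tokenizer
    # and the lowercase filter for this simple ASCII input.
    for word in text.lower().split():
        tokens.extend(edge_ngrams(word))
    return tokens


indexed_tokens = analyze("zo soz kovo zts nova as zts elektronika as")
print(indexed_tokens)
```

On a real cluster, the `_analyze` API with this analyzer and text would show the actual tokens, including the effect of the stopword filter.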
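Now, by default a match query applies the same analyzer to the search text, and the default operator between the resulting tokens is OR. So {"match":{"nn":"softo"}}
conceptually behaves like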
{
"match": {
"nn": "so OR sof OR soft OR softo"
}
}
As you can see, one of the tokens generated for the field nn is so, and therefore the document matches.
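A short sketch of that overlap, under the same simplifications as the explanation (whitespace tokenization, stopword filter ignored, hypothetical helper names): the query text is n-grammed the same way as the field, and any single shared token is enough for an OR match.

```python
def edge_ngrams(token, min_gram=2, max_gram=20):
    # Prefixes of the token between min_gram and max_gram characters.
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]


def analyze(text):
    # Simplified 'autocomplete' analyzer: lowercase, split, edge n-grams.
    return [gram for word in text.lower().split() for gram in edge_ngrams(word)]


doc_tokens = set(analyze("zo soz kovo zts nova as zts elektronika as"))
query_tokens = analyze("softo")

# With OR semantics, one shared token makes the document a hit.
shared = [t for t in query_tokens if t in doc_tokens]
print(query_tokens)  # ['so', 'sof', 'soft', 'softo']
print(shared)        # ['so']

# If the query term were NOT n-grammed, it would find nothing in this document.
print("softo" in doc_tokens)  # False
```

The last line hints at why this kind of mapping is often paired with a search-time analyzer that omits the edge_ngram filter (the mapping parameter `search_analyzer`); that is a suggestion beyond the original answer and would need to be checked against the index's actual requirements.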