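elasticsearch - Elasticsearch completion field returns no suggestions when searching with tokens returned by the _analyze API
Problem description
I am trying to implement autocomplete with the Elasticsearch completion suggester.
Step 1: Create a test_index: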
curl --location --request PUT 'http://localhost:9200/test_index?pretty' \
--header 'Content-Type: application/json' \
--data-raw '{"settings": {"number_of_shards": 1, "max_ngram_diff": 7, "number_of_replicas": "0", "analysis": {"filter": {"edge_ngram_completion_filter": {"token_chars": ["whitespace", "digit"], "min_gram": "3", "type": "edge_ngram", "max_gram": "10"}}, "analyzer": {"edge_ngram_completion": {"filter": ["lowercase", "edge_ngram_completion_filter"], "type": "custom", "tokenizer": "standard"}}}}, "mappings": {"properties": {"id": {"type": "integer"}, "name": {"type": "text", "fields": {"raw": {"type": "keyword"}, "suggest": {"type": "completion", "analyzer": "edge_ngram_completion", "search_analyzer": "simple", "preserve_separators": true, "preserve_position_increments": true, "max_input_length": 100}}}}}}
'
Step 2: Index the following document:
curl --location --request POST 'http://localhost:9200/test_index/_doc?pretty' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "PANTOCID DSR CAP",
"id": 1
}'
Step 3: When I call the _analyze API on "PANTOCID DSR CAP", I get the tokens ["pan", "pant", "panto", "pantoc", "pantoci", "pantocid", "dsr", "cap"]:
curl --location --request POST 'http://localhost:9200/test_index/_analyze?pretty' \
--header 'Content-Type: application/json' \
--data-raw '{
"analyzer" : "edge_ngram_completion",
"text" : "PANTOCID DSR CAP"
}
'
Step 4: But when I search with "dsr", I get no suggestions:
curl --location --request POST 'http://localhost:9200/test_index/_search?pretty' \
--header 'Content-Type: application/json' \
--data-raw '{
"suggest": {
"egde_ngram_suggest" : {
"text": "dsr",
"completion" : {
"field" : "name.suggest"
}
}
}
}
'
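Why is that? If the search text is one of the generated tokens, it should produce a suggestion match, right? Am I missing something here?
Any help is appreciated. Thanks in advance.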
Solution
What may be confusing is the _analyze step. While you did declare the correct analyzer, try verifying the tokenization of the field itself by passing the field name instead of the analyzer:
curl --location --request POST 'http://localhost:9200/test_index/_analyze?pretty' \
--header 'Content-Type: application/json' \
--data-raw '{
"field" : "name.suggest",
"text" : "PANTOCID DSR CAP"
}
'
When you run that, you'll see that the text was n-grammed from the very beginning:
pandsrcap
pant dsr cap
...
and none of these token variations starts with dsr and drops the pan prefix.
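The matching behaviour described above can be imitated with a small shell sketch. This is a simplified model and not Elasticsearch code: it assumes each suggestion path is an edge n-gram of the first token followed by the remaining tokens, and that a query matches only if it is a prefix of a whole path. The names `first_token`, `rest`, `paths` and `suggest` are made up for this illustration.

```shell
#!/bin/sh
# Simplified model of the behaviour described in the answer (NOT Elasticsearch code).

first_token="pantocid"   # lowercased first token of "PANTOCID DSR CAP"
rest="dsr cap"           # remaining tokens, kept whole in this model
min_gram=3               # same values as edge_ngram_completion_filter
max_gram=10

# Build one path per edge n-gram of the first token:
# "pan dsr cap", "pant dsr cap", ..., "pantocid dsr cap" (one per line).
paths=""
len=${#first_token}
n=$min_gram
while [ "$n" -le "$max_gram" ] && [ "$n" -le "$len" ]; do
  gram=$(printf '%s' "$first_token" | cut -c1-"$n")
  paths="${paths}${gram} ${rest}
"
  n=$((n + 1))
done

# Print "match" if the query is a prefix of at least one path, else "no match".
suggest() {
  query=$1
  result="no match"
  while IFS= read -r p; do
    case "$p" in
      "$query"*) result="match" ;;
    esac
  done <<EOF
$paths
EOF
  echo "$result"
}

suggest "pan"   # "pan" is a prefix of "pan dsr cap"
suggest "dsr"   # no path starts with "dsr"
```

In this model, "dsr" fails for the same reason as in the question: it only ever appears after the first token, never at the start of a path.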
What this tells us is that the completion field works as intended: it is meant for prefix-based autocomplete, not for middle-of-the-text searches like the one you seem to be aiming for.
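To check this against the index from the question, one option is to repeat the Step 4 request with a prefix of the whole input. The sketch below assumes the same local cluster, index and document as in Steps 1 and 2, and that the document has already been refreshed into the index; the suggestion name `prefix_suggest` is arbitrary and chosen only for this example.

```shell
# Same request as in Step 4, but with a prefix of the whole input ("pan")
# instead of a token from the middle of the text ("dsr").
# "prefix_suggest" is an arbitrary label for this suggestion.
curl --location --request POST 'http://localhost:9200/test_index/_search?pretty' \
--header 'Content-Type: application/json' \
--data-raw '{
  "suggest": {
    "prefix_suggest": {
      "text": "pan",
      "completion": {
        "field": "name.suggest"
      }
    }
  }
}'
```

If the explanation above holds, this request should return the "PANTOCID DSR CAP" document as a suggestion, while the "dsr" request from Step 4 continues to return none.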