Searching for a term as both a single string and multi worded string

Question

I'm setting up my Elasticsearch instance in a schema-less manner (no up-front mappings), and the application requires that users be able to search against a field containing a word that may or may not be tokenized into multiple strings. For example, the field may contain the word "ONETWO". The spec requires that a user be able to search for "ONETWO", "ONE", or "TWO" and retrieve that same document. There doesn't seem to be any easy way to accomplish this even with a custom tokenizer (and I don't think there SHOULD be an easy way to do this -- or any way at all). I just want to confirm my thoughts.

Tags: elasticsearch

Answer


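Your requirement is easily met with a custom analyzer that uses the n-gram tokenizer. You can also pass its output through the lowercase token filter, so that even though your text is ONETWO, a user who searches for one, One, or ONE still gets a result. Note that for this to work you need to apply a different analyzer at search time; read more about it at https://www.elastic.co/guide/en/elasticsearch/reference/current/search-analyzer.html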
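
As a rough illustration of the idea, here is a minimal Python sketch. It builds an index definition as a plain dictionary (which could be sent as the JSON body when creating the index) and simulates in pure Python which tokens an n-gram analyzer would store for "ONETWO". The field name `code`, the analyzer and tokenizer names, and the gram sizes 3 to 6 are assumptions chosen for this example, not something stated in the question. The simulation is a simplification for a single alphanumeric word, not Elasticsearch's actual implementation. Note also that this approach needs an explicit mapping for the field, so it does not fit a fully schema-less setup as-is.

```python
import json

# Hypothetical index definition; names and gram sizes are illustrative assumptions.
index_body = {
    "settings": {
        # Needed because max_gram - min_gram exceeds the default allowed difference of 1.
        "index": {"max_ngram_diff": 3},
        "analysis": {
            "tokenizer": {
                "ngram_tokenizer": {
                    "type": "ngram",
                    "min_gram": 3,
                    "max_gram": 6,
                    "token_chars": ["letter", "digit"],
                }
            },
            "analyzer": {
                # Index time: split into n-grams, then lowercase.
                "ngram_index": {
                    "type": "custom",
                    "tokenizer": "ngram_tokenizer",
                    "filter": ["lowercase"],
                },
                # Search time: keep the query words whole, only lowercase them.
                "lowercase_search": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                },
            },
        },
    },
    "mappings": {
        "properties": {
            "code": {
                "type": "text",
                "analyzer": "ngram_index",
                "search_analyzer": "lowercase_search",
            }
        }
    },
}


def simulate_ngram_index(text, min_gram=3, max_gram=6):
    """Approximate the tokens stored for one alphanumeric word:
    every lowercase substring whose length is in [min_gram, max_gram]."""
    lowered = text.lower()
    return {
        lowered[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(lowered) - n + 1)
    }


if __name__ == "__main__":
    print(json.dumps(index_body, indent=2))
    tokens = simulate_ngram_index("ONETWO")
    for query in ("ONETWO", "ONE", "two"):
        # The search analyzer only lowercases, so a query matches if its
        # lowercase form is one of the stored n-grams.
        print(query, query.lower() in tokens)
```

One consequence of this design is that any 3 to 6 character substring matches, so a search for "net" or "etw" would also retrieve the "ONETWO" document. Raising `min_gram` reduces such noise, but queries shorter than `min_gram` then stop matching.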

For more information, see https://devticks.com/how-to-improve-your-full-text-search-in-elasticsearch-with-ngram-tokenizer-e346f29f8ddb, and let me know if you need any more information.

