java - How to implement SpanQuery with MultiFieldQuery in Java using Lucene
Problem description
I want to combine SpanQuery with MultiFieldQueryParser for fuzzy phrase matching, but I am having issues with it.
I have tried MultiFieldQueryParser with a BooleanQuery. It only partially works: it matches fuzzy terms, but the match ignores phrase order and slop. For example, my index contains "Check out these". When I search for "Check out", it returns a hit for "Check out these", which is the result I want. However, when I search for "Check these", it also returns a hit for "Check out these". In this case it should not match, because "out" sits between the two words.
I have also tried SpanQuery. The scenario above does not happen with this approach, but I can only search one field, whereas I want to search multiple fields.
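import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryparser.classic.MultiFieldQueryParser;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.FuzzyQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.spans.SpanMultiTermQueryWrapper; // org.apache.lucene.queries.spans in Lucene 9+
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;

private static TopDocs searchInFuzzyPhrase(String textToFind, String textToFind1, IndexSearcher searcher, int slop)
        throws Exception {
    Analyzer analyzer = new StandardAnalyzer();

    // Attempt 1: multi-field parser, both fuzzy terms required (order and distance are not enforced)
    MultiFieldQueryParser query = new MultiFieldQueryParser(
            new String[] { "FULL_NAME", "BRAND_NAME", "DISPLAY_NAME", "DISPLAY_NAME_SYNONYM" }, analyzer);
    query.setPhraseSlop(slop);
    BooleanQuery bQuery = new BooleanQuery.Builder()
            .add(query.parse(textToFind + "~"), BooleanClause.Occur.MUST)
            .add(query.parse(textToFind1 + "~"), BooleanClause.Occur.MUST)
            .build();

    // Attempt 2: ordered span query over fuzzy terms, limited to the single field DISPLAY_NAME
    SpanQuery[] clauses = new SpanQuery[2];
    clauses[0] = new SpanMultiTermQueryWrapper<FuzzyQuery>(new FuzzyQuery(new Term("DISPLAY_NAME", textToFind)));
    clauses[1] = new SpanMultiTermQueryWrapper<FuzzyQuery>(new FuzzyQuery(new Term("DISPLAY_NAME", textToFind1)));
    SpanNearQuery sQuery = new SpanNearQuery(clauses, slop, true);

    // Only one of the two queries is run at a time; swap bQuery for sQuery to test the span version
    TopDocs hits = searcher.search(bQuery, 1);
    return hits;
}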
Using the earlier example, "Check out these": when I search for "Check these" using MultiFieldQueryParser with BooleanQuery, it returns a hit, which is not what I want.
When I search for "Check these" using SpanQuery, it returns a miss. This is partly what I want, but it only applies to one field. I am trying to apply it to many fields.
Solution
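The problem here is that spans only apply to a single field. That is understandable, since there is hardly any notion of position across different fields.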
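The single-field restriction can be seen directly when building a SpanNearQuery. The following is a minimal sketch, assuming Lucene 5.3 to 8.x (in Lucene 9 and later the span classes live in org.apache.lucene.queries.spans); the class name SpanFieldCheck and the sample terms are hypothetical and chosen only for illustration.

```java
import org.apache.lucene.index.Term;
// Assumption: Lucene 5.3 to 8.x package layout for span queries.
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

public class SpanFieldCheck {

    // Builds two span clauses, each bound to the given field name.
    static SpanQuery[] clauses(String firstField, String secondField) {
        return new SpanQuery[] {
            new SpanTermQuery(new Term(firstField, "check")),
            new SpanTermQuery(new Term(secondField, "out"))
        };
    }

    // Clauses on the same field are accepted by the constructor.
    static boolean sameFieldAccepted() {
        new SpanNearQuery(clauses("DISPLAY_NAME", "DISPLAY_NAME"), 0, true);
        return true;
    }

    // Clauses on different fields are rejected with an IllegalArgumentException.
    static boolean mixedFieldsRejected() {
        try {
            new SpanNearQuery(clauses("DISPLAY_NAME", "BRAND_NAME"), 0, true);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }
}
```

Because positions are recorded per field, a near query that mixes fields has nothing meaningful to compare, which is why the query has to be built once per field.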
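You need to keep the same code you already have and simply extend it to the full list of fields. That is, for each string in the list "FULL_NAME", "BRAND_NAME", "DISPLAY_NAME", "DISPLAY_NAME_SYNONYM", create a SpanQuery the way you do in your example, and then combine them all into one BooleanQuery with Occur.SHOULD.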
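The approach described above could be sketched roughly as follows. This is an illustration rather than a definitive implementation: it assumes Lucene 5.3 to 8.x (in Lucene 9 and later the span classes are in org.apache.lucene.queries.spans), and the class name MultiFieldSpanQueries and its two helper methods are hypothetical names introduced here. The field names are the ones from the question.

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.FuzzyQuery;
// Assumption: Lucene 5.3 to 8.x package layout for span queries.
import org.apache.lucene.search.spans.SpanMultiTermQueryWrapper;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;

public class MultiFieldSpanQueries {

    // Field names taken from the question.
    static final String[] FIELDS = {
        "FULL_NAME", "BRAND_NAME", "DISPLAY_NAME", "DISPLAY_NAME_SYNONYM"
    };

    // Ordered fuzzy "phrase" of two terms within a single field.
    static SpanNearQuery fuzzySpanForField(String field, String first, String second, int slop) {
        SpanQuery[] clauses = new SpanQuery[] {
            new SpanMultiTermQueryWrapper<FuzzyQuery>(new FuzzyQuery(new Term(field, first))),
            new SpanMultiTermQueryWrapper<FuzzyQuery>(new FuzzyQuery(new Term(field, second)))
        };
        // inOrder = true: the second term must follow the first within `slop` positions.
        return new SpanNearQuery(clauses, slop, true);
    }

    // One span query per field, OR-ed together: a document matches
    // if the fuzzy phrase is found in at least one of the fields.
    static BooleanQuery buildMultiFieldFuzzyPhrase(String first, String second, int slop) {
        BooleanQuery.Builder builder = new BooleanQuery.Builder();
        for (String field : FIELDS) {
            builder.add(fuzzySpanForField(field, first, second, slop), BooleanClause.Occur.SHOULD);
        }
        return builder.build();
    }
}
```

The result can be passed to the existing searcher, for example searcher.search(MultiFieldSpanQueries.buildMultiFieldFuzzyPhrase("check", "these", slop), 1). Note that FuzzyQuery terms are not run through the analyzer, so if the index was built with StandardAnalyzer (which lowercases tokens), lowercasing the input terms first may give more predictable matches.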