How do I get the position of a term found by PrefixQuery?

Problem description

I have some text:

force war WORD operation

force operation WORD WORD war
 
BookCase

I want to find "BookCase". I use:

Analyzer customAnalyzer = CustomAnalyzer.builder()
   .withTokenizer("standard")
   .build();

IndexSearcher searcher = new IndexSearcher(reader);
Query query = new PrefixQuery(new Term("tags", "book")); // my query - "book*"

TopDocs hits = searcher.search(query, 10);

SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter("<b>", "</b>");
Highlighter highlighter = new Highlighter(htmlFormatter, new QueryScorer(query));
for (int i = 0; i < hits.scoreDocs.length; i++) {
    int id = hits.scoreDocs[i].doc;
    Document doc = searcher.doc(id);
    String text = doc.get("tags");
    TokenStream tokenStream = TokenSources.getAnyTokenStream(searcher.getIndexReader(), id, "tags", customAnalyzer);
    TextFragment[] frag = highlighter.getBestTextFragments(tokenStream, text, true, 100);
    
    for (int j = 0; j < frag.length; j++) {
        if ((frag[j] != null) && (frag[j].getScore() > 0)) {
            System.out.println((frag[j].toString()));
        }
    }
    System.out.println("finish test");
}

My output:

?war WORD force operation

war WORD operation WORD force

force war WORD operation

force operation WORD WORD war
 
<b>BookCase</b>

I found my word "BookCase". How can I get the position of "BookCase" when my query

PrefixQuery(new Term("tags", "book"))

only contains str = "book"?

I tried:

String searchTerm = "Book".toLowerCase();
BytesRef ref = new BytesRef(searchTerm);
TermsEnum te = f.iterator();
PostingsEnum docsAndPosEnum = null;
if (te.seekExact(ref)) {
    System.out.println("Search");
    docsAndPosEnum = te.postings(docsAndPosEnum, PostingsEnum.ALL);
    int nextDoc = docsAndPosEnum.nextDoc();
    assert nextDoc != DocIdSetIterator.NO_MORE_DOCS;
    final int fr = docsAndPosEnum.freq();
    final int p = docsAndPosEnum.nextPosition();
    final int o = docsAndPosEnum.startOffset();

    System.out.println("Word: " + ref.utf8ToString());
    System.out.println("Position: " + p + ", startOffset: " + o + " length: " + ref.length + " Freq: " + fr);
    if (fr > 1) {
        for (int iter = 1; iter <= fr - 1; iter++) {
            System.out.println("Position: " + docsAndPosEnum.nextPosition());
        }
    }
}

However, this does not work. I am using Lucene 7.4.0.

How can I get the position of the found term "BookCase"?

Tags: java, lucene

Solution
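
One plausible reason the attempt in the question prints nothing: seekExact looks for a term that is exactly "book", while a prefix query matches every dictionary term that starts with "book". The sketch below illustrates that difference in plain Java, without Lucene. A sorted map stands in for the term dictionary, and the lookup seeks to the first term greater than or equal to the prefix, then walks forward while terms still start with it. In Lucene, the corresponding calls would presumably be TermsEnum.seekCeil(...) followed by next(), with positions read from the postings of each matching term. The class name, the method names, the whitespace tokenization, the lowercasing, and the sample text are all assumptions made for illustration. They are not taken from the question's index setup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for an inverted index; not Lucene code.
final class PrefixPositionsSketch {

    /** One occurrence of a term: token position plus character offsets. */
    static final class Occurrence {
        final int position;
        final int startOffset;
        final int endOffset;

        Occurrence(int position, int startOffset, int endOffset) {
            this.position = position;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    /** Builds a sorted term dictionary: lowercased token -> its occurrences. */
    static TreeMap<String, List<Occurrence>> index(String text) {
        TreeMap<String, List<Occurrence>> terms = new TreeMap<>();
        Matcher m = Pattern.compile("\\S+").matcher(text); // assumption: split on whitespace
        int position = 0;
        while (m.find()) {
            // Assumption: terms are lowercased at index time.
            String term = m.group().toLowerCase(Locale.ROOT);
            terms.computeIfAbsent(term, k -> new ArrayList<>())
                 .add(new Occurrence(position, m.start(), m.end()));
            position++;
        }
        return terms;
    }

    /** Returns every term that starts with the prefix, together with its occurrences. */
    static SortedMap<String, List<Occurrence>> prefixLookup(
            TreeMap<String, List<Occurrence>> terms, String prefix) {
        SortedMap<String, List<Occurrence>> hits = new TreeMap<>();
        // tailMap starts at the first term >= prefix (the "seek ceiling" step).
        for (Map.Entry<String, List<Occurrence>> e : terms.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // sorted order: no later term can match the prefix
            }
            hits.put(e.getKey(), e.getValue());
        }
        return hits;
    }

    public static void main(String[] args) {
        TreeMap<String, List<Occurrence>> terms =
                index("force operation WORD WORD war BookCase");

        // An exact lookup of the prefix itself finds nothing.
        System.out.println("exact match for 'book': " + terms.containsKey("book"));

        // The prefix walk finds the full term and where it occurs.
        for (Map.Entry<String, List<Occurrence>> hit : prefixLookup(terms, "book").entrySet()) {
            for (Occurrence occ : hit.getValue()) {
                System.out.println(hit.getKey()
                        + " position=" + occ.position
                        + " startOffset=" + occ.startOffset
                        + " endOffset=" + occ.endOffset);
            }
        }
    }
}
```

Running main on the sample text reports no exact match for "book" and then prints bookcase position=5 startOffset=30 endOffset=38. Two further points may matter when carrying the idea over to Lucene. Offsets can only be read from postings if the field was indexed with offsets, and the loop in the question reads only the first document returned by nextDoc(), so other matching documents would need the same treatment.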
