Apache Lucene index search with CJKAnalyzer

Problem description

I am using Apache Lucene index search with CJKAnalyzer to search text. The search matches the query character by character: if I search for the Japanese word "ぁxまn", it returns every word that contains any character of that word.

I do not want this. I want the search to return only the exact word, or words that contain the whole search word.

For example, suppose I have indexed three words: "ぁxまn", "ぁxま", and "まn".

 Case 1: If I search for "ぁxまn", it should return only one result.
 Case 2: If I search for "ぁx", it should return two results.

In my case, however, searching for "ぁxまn" returns three results, which is wrong.
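
The gap between the observed and the expected behaviour can be illustrated with a small, Lucene-free sketch. It assumes that each character of the sample words ends up as its own token (how CJKAnalyzer actually tokenizes depends on the Lucene version and the input) and compares two ways of matching: "any token" (OR) matching, which resembles the current result, and phrase matching, which resembles the desired one. The class and method names (TokenMatchSketch, tokenize, searchAnyToken, searchPhrase) are hypothetical and are not part of Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simplified model of the behaviour described above; this is NOT Lucene code.
class TokenMatchSketch {

    // Hypothetical stand-in for the analyzer (assumption): one token per code point.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        text.codePoints().forEach(cp -> tokens.add(new String(Character.toChars(cp))));
        return tokens;
    }

    // OR semantics: a document matches if it shares at least one token with the query.
    static List<String> searchAnyToken(List<String> docs, String query) {
        List<String> queryTokens = tokenize(query);
        List<String> hits = new ArrayList<>();
        for (String doc : docs) {
            // disjoint() is true when the two token lists have nothing in common
            if (!Collections.disjoint(tokenize(doc), queryTokens)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    // Phrase semantics: all query tokens must appear consecutively and in order.
    static List<String> searchPhrase(List<String> docs, String query) {
        List<String> queryTokens = tokenize(query);
        List<String> hits = new ArrayList<>();
        for (String doc : docs) {
            if (Collections.indexOfSubList(tokenize(doc), queryTokens) >= 0) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("ぁxまn", "ぁxま", "まn");
        System.out.println("any token, ぁxまn: " + searchAnyToken(docs, "ぁxまn").size() + " hits");
        System.out.println("phrase, ぁxまn: " + searchPhrase(docs, "ぁxまn").size() + " hits");
        System.out.println("phrase, ぁx: " + searchPhrase(docs, "ぁx").size() + " hits");
    }
}
```

With the three sample words, the any-token search for "ぁxまn" matches all three, while the phrase search finds one hit for "ぁxまn" and two for "ぁx". In Lucene, a quoted query string (a phrase query) may be one way to get similar behaviour, but that should be verified against the Lucene version in use.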

-------------------- Indexing code --------------------

// createWriter() is the factory method defined further down
writer = createWriter();
Document document1 = createDocument(1, "ぁxまn", "Richard");
writer.addDocument(document1);
// Make the added document visible to searchers
writer.commit();



private static Document createDocument(Integer id, String firstName, String lastName)
{
    Document document = new Document();
    // StringField is indexed as a single, unanalyzed token
    document.add(new StringField("id", id.toString(), Field.Store.YES));
    // TextField values are passed through the analyzer (CJKAnalyzer here)
    document.add(new TextField("firstName", firstName, Field.Store.YES));
    document.add(new TextField("lastName", lastName, Field.Store.YES));
    // "website" field omitted: no website value is passed to this method
    return document;
}


private static IndexWriter createWriter() throws IOException
{
    // Open (or create) the on-disk index directory
    FSDirectory dir = FSDirectory.open(Paths.get(INDEX_DIR).toFile());
    // Index with CJKAnalyzer; the same analyzer is used at search time
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_44, new CJKAnalyzer());
    IndexWriter writer = new IndexWriter(dir, config);
    return writer;
}

-------- Search call --------

TopDocs foundDocs2 = searchByFirstName("*ぁxまn*", searcher);
-------------------------------------------------------------
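private static TopDocs searchByFirstName(String firstName, IndexSearcher searcher) throws Exception
{
    // Parse the query against the "firstName" field with the same analyzer used for indexing
    MultiFieldQueryParser mqp = new MultiFieldQueryParser(new String[]{"firstName"}, new CJKAnalyzer());
    // Allow patterns that start with '*', such as "*ぁxまn*"
    mqp.setAllowLeadingWildcard(true);
    Query q = mqp.parse(firstName);
    // Return the top 10 hits
    TopDocs hits = searcher.search(q, 10);
    return hits;
}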

Tags: java, apache, lucene, analyzer, indexer

Solution

