Lucene suggestions (behavior of the SUGGEST_MORE_POPULAR flag)

lxkprmvk posted 12 months ago in Lucene

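I want to use Lucene's suggestion mechanism to help end users notice when they have made a typo.
Lucene's SpellChecker has a method suggestSimilar that accepts a SuggestMode flag. With the flag SuggestMode.SUGGEST_MORE_POPULAR, I expect to get suggestions only for words that occur more frequently in the current index.
The code below seems to contradict this assumption: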

import org.apache.lucene.analysis.core.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.spell.LuceneDictionary;
import org.apache.lucene.search.spell.SpellChecker;
import org.apache.lucene.search.spell.SuggestMode;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

import java.io.IOException;
import java.util.LinkedList;
import java.util.List;

public class SuggestTest {

    public static void main(String[] args) throws IOException {

        final String NAME_FIELD = "NAME";

        Directory directory = new RAMDirectory();
        IndexWriter writer = new IndexWriter(directory,
                new IndexWriterConfig(new SimpleAnalyzer()));
        writer.deleteAll();
        writer.commit();

        List<String> list = new LinkedList<>();

        for (int i = 0; i < 1000; i++)
            list.add("wafa");

        list.add("waffa");

        for (String name : list) {
            Document doc = new Document();
            doc.add(new TextField(NAME_FIELD, name, Field.Store.YES));
            writer.addDocument(doc);
        }

        writer.close();
        DirectoryReader directoryReader = DirectoryReader.open(directory);

        LuceneDictionary nameDictionary = new LuceneDictionary(directoryReader, NAME_FIELD);

        IndexWriterConfig config = new IndexWriterConfig(new SimpleAnalyzer());

        SpellChecker spellChecker = new SpellChecker(directory);
        spellChecker.indexDictionary(nameDictionary, config, true);

        for (String s : new String[]{"wafa", "waffa", "wala"}) {
            String[] suggestions = spellChecker.suggestSimilar(s, 10, null, null, SuggestMode.SUGGEST_MORE_POPULAR);
            System.out.println("Suggestions for " + s);
            for (String suggestion : suggestions)
                System.out.println(" -" + suggestion);
        }
    }
}

I would not expect this code to suggest waffa when I look up wafa (there are 1000 occurrences of it in the index!).

Answer #1 (yvgpqqbh)

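You need to adjust the code so that SUGGEST_MORE_POPULAR mode actually takes effect: pass the index reader and the field name to suggestSimilar. When either argument is null, SpellChecker has no term frequencies to compare against and falls back to SuggestMode.SUGGEST_ALWAYS.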

String[] suggestions = spellChecker.suggestSimilar(s, 10, directoryReader, NAME_FIELD, SuggestMode.SUGGEST_MORE_POPULAR);
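
To illustrate why the reader and field matter, here is a minimal plain-Java sketch of a popularity filter. It is a simplified model and not Lucene's actual implementation: the class `PopularityFilterSketch`, the method `filter`, and the use of a `Map` in place of the index reader are all hypothetical, and the exact comparison Lucene applies may differ. The idea is only that candidates can be dropped by frequency when frequencies are available, and that nothing can be filtered when they are not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of a frequency filter; this is NOT Lucene's code.
class PopularityFilterSketch {

    /**
     * Keeps only candidates that are at least as frequent as the queried word.
     * A null frequency map mimics calling suggestSimilar without a reader and
     * field: no frequencies are known, so no candidate is filtered out.
     */
    static List<String> filter(String word, List<String> candidates, Map<String, Integer> docFreq) {
        List<String> kept = new ArrayList<>();
        int goalFreq = (docFreq == null) ? 0 : docFreq.getOrDefault(word, 0);
        for (String candidate : candidates) {
            if (candidate.equals(word)) {
                continue; // never suggest the queried word itself
            }
            if (docFreq != null) {
                int freq = docFreq.getOrDefault(candidate, 0);
                if (freq < 1 || freq < goalFreq) {
                    continue; // absent from the index, or rarer than the queried word
                }
            }
            kept.add(candidate);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Frequencies taken from the question: 1000 x "wafa", 1 x "waffa".
        Map<String, Integer> docFreq = Map.of("wafa", 1000, "waffa", 1);
        List<String> candidates = List.of("wafa", "waffa");

        // No frequencies available: every near match is returned.
        System.out.println(filter("wafa", candidates, null));
        // Frequencies available: the rarer "waffa" is dropped for "wafa".
        System.out.println(filter("wafa", candidates, docFreq));
        // The more frequent "wafa" is still suggested for "waffa".
        System.out.println(filter("waffa", candidates, docFreq));
    }
}
```

In this model, the first call corresponds to the question's code (null reader and field), which is why waffa shows up as a suggestion for wafa; the second corresponds to the corrected call.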
