Case-insensitive sorted search in Lucene

prdp8dxp  posted on 2022-11-07  in Lucene

How can I run a search that sorts on multiple fields in a case-insensitive way?
I am using Lucene 4.10.4 and sorting on multiple fields like this:

SortField[] sortFiled = new SortField[2];
sortFiled[0] = new SortField("name", SortField.Type.STRING);
sortFiled[1] = new SortField("country", SortField.Type.STRING);

TopDocs topDocs = indexSearcher.search(query, 10 , new Sort(sortFiled));

This returns sorted results, but the ordering is case-sensitive. I want the sort to be case-insensitive.

laik7k3q  #1

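SortField[] sortFiled = new SortField[2];
// "name" keeps Lucene's built-in, case-sensitive string sort
sortFiled[0] = new SortField("name", SortField.Type.STRING);
// "country" is sorted through the custom, case-insensitive comparator source
sortFiled[1] = new SortField("country", new CaseInsensitiveStringComparator());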

Pass a custom FieldComparatorSource to the SortField instead of a SortField.Type. In the code above, the country field is sorted case-insensitively. The custom FieldComparatorSource class is shown below.

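import java.io.IOException;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.index.BinaryDocValues;
import org.apache.lucene.search.FieldCache;
import org.apache.lucene.search.FieldComparator;
import org.apache.lucene.search.FieldComparatorSource;

// Factory that Lucene calls to create one comparator per sort field
class CaseInsensitiveStringComparator extends FieldComparatorSource{

@Override
public FieldComparator<String> newComparator(String fieldname, int numHits, int sortPos,
        boolean reversed) throws IOException {
    return new CaseIgonreCompare(fieldname, numHits);
}
}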

class CaseIgonreCompare extends FieldComparator<String>{

private String field;
private String bottom;
private String topValue;
private BinaryDocValues cache;
private String[] values;

public CaseIgonreCompare(String field, int numHits) {
    this.field = field;
    this.values = new String[numHits];
}

@Override
public int compare(int arg0, int arg1) {
    return compareValues(values[arg0], values[arg1]);
}

@Override
public int compareBottom(int arg0) throws IOException {
    return compareValues(bottom, cache.get(arg0).utf8ToString());
}

@Override
public int compareTop(int arg0) throws IOException {
    return compareValues(topValue, cache.get(arg0).utf8ToString());
}

// Case-insensitive lexicographic comparison of two field values
@Override
public int compareValues(String first, String second) {
    return first.compareToIgnoreCase(second);
}

@Override
public void copy(int arg0, int arg1) throws IOException {
   values[arg0] = cache.get(arg1).utf8ToString();
}

@Override
public void setBottom(int arg0) {
    this.bottom  = values[arg0];
}

@Override
public FieldComparator<String> setNextReader(AtomicReaderContext arg0)
        throws IOException {
    this.cache = FieldCache.DEFAULT.getTerms(arg0.reader(), 
            field  , true);
    return this;
}

@Override
public void setTopValue(String arg0) {
    this.topValue = arg0;
}

@Override
public String value(int arg0) {
    return values[arg0];
}

}
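To check what ordering a case-insensitive two-field sort should produce, it can help to reproduce it outside Lucene first. The following is only a plain-Java sketch, not part of the answer's Lucene code: the `Place` class, the `CaseInsensitiveSortDemo` class, and the sample data are hypothetical, and `String.CASE_INSENSITIVE_ORDER` stands in for the custom comparator applied to both fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for an indexed document with the two sort fields.
final class Place {
    final String name;
    final String country;

    Place(String name, String country) {
        this.name = name;
        this.country = country;
    }
}

final class CaseInsensitiveSortDemo {
    // Sort by name first, then by country, ignoring case in both fields
    // (same field order as the SortField array in the question).
    static final Comparator<Place> BY_NAME_THEN_COUNTRY =
        Comparator.comparing((Place p) -> p.name, String.CASE_INSENSITIVE_ORDER)
                  .thenComparing(p -> p.country, String.CASE_INSENSITIVE_ORDER);

    // Returns a sorted copy and leaves the input list untouched.
    static List<Place> sorted(List<Place> input) {
        List<Place> copy = new ArrayList<>(input);
        copy.sort(BY_NAME_THEN_COUNTRY);
        return copy;
    }

    public static void main(String[] args) {
        List<Place> places = Arrays.asList(
            new Place("bob", "India"),
            new Place("Alice", "france"),
            new Place("alice", "Denmark"),
            new Place("Carol", "Iceland"));
        for (Place p : sorted(places)) {
            System.out.println(p.name + " / " + p.country);
        }
    }
}
```

With this sample data, "alice / Denmark" comes first and "Alice / france" second, because the two names tie when case is ignored and the country field breaks the tie. A case-sensitive sort would instead place "Alice" and "Carol" ahead of "alice" and "bob".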

4nkexdtk  #2

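I needed to sort a string field according to Icelandic alphabet rules (aábcdeé ....), so I ported the code to C# and used the StringComparer.InvariantCultureIgnoreCase comparer, and it works perfectly.
Here is the C# port of Birbal Singh's code.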
CaseInsensitiveStringComparator.cs

public class CaseInsensitiveStringComparator : FieldComparerSource
{
    public override FieldComparer NewComparer(string fieldname, int numHits, int sortPos, bool reversed)
    {
        return new CaseIgonreCompare(fieldname, numHits);
    }
}

CaseIgonreCompare.cs

public class CaseIgonreCompare : FieldComparer<string>
{
    private string _field;
    private string[] _values;       
    private BinaryDocValues _cache;
    private string _bottom; 
    private string _topValue;

    public CaseIgonreCompare(string field, int numHits)
    {
        _field = field;
        _values = new string[numHits];
    }

    public override IComparable this[int slot] => _values[slot];

    public override int CompareValues(string first, string second)
    {
        // Culture-aware, case-insensitive comparison
        return StringComparer.InvariantCultureIgnoreCase.Compare(first, second);
    }

    private string GetValue(int doc)
    {
        var bytesRef = new BytesRef();
        _cache.Get(doc, bytesRef);
        return bytesRef.Utf8ToString();
    }

    public override int Compare(int slot1, int slot2)
    {
        // Use the same case-insensitive ordering as CompareBottom/CompareTop
        return CompareValues(_values[slot1], _values[slot2]);
    }

    public override int CompareBottom(int doc)
    {
        return CompareValues(_bottom, GetValue(doc));
    }

    public override int CompareTop(int doc)
    {
        return CompareValues(_topValue, GetValue(doc));
    }

    public override void Copy(int slot, int doc)
    {
        _values[slot] = GetValue(doc);
    }

    public override void SetBottom(int slot)
    {
        _bottom = _values[slot];
    }

    public override FieldComparer SetNextReader(AtomicReaderContext context)
    {
        _cache = FieldCache.DEFAULT.GetTerms(context.AtomicReader, _field, true);

        return this;
    }

    public override void SetTopValue(object value)
    {
        _topValue = value as string;
    }
}
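On the Java side, a locale-aware, case-insensitive comparison similar in spirit to the C# comparer can be sketched with `java.text.Collator`. This is an assumption-laden illustration rather than a port of the code above: the class `LocaleAwareCompareDemo` and its methods are hypothetical, the locale tag `"is"` (Icelandic) is chosen for the example, and whether the result follows the Icelandic alphabet exactly depends on the locale data shipped with the JDK in use.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class LocaleAwareCompareDemo {
    // Builds a collator for the given locale. SECONDARY strength makes it
    // treat strings that differ only in letter case as equal.
    static Collator caseInsensitiveCollator(Locale locale) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.SECONDARY);
        return collator;
    }

    // Returns a copy of the input sorted with the locale's collation rules.
    static List<String> sorted(List<String> input, Locale locale) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(caseInsensitiveCollator(locale));
        return copy;
    }

    public static void main(String[] args) {
        Locale icelandic = Locale.forLanguageTag("is"); // assumed locale for this sketch
        List<String> names = Arrays.asList("banana", "Apple", "cherry");
        System.out.println(sorted(names, icelandic));
    }
}
```

A collator built this way could replace `compareToIgnoreCase` inside a `compareValues` method like the one in the first answer, at the cost of slower comparisons than a plain string compare.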
