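I have created a small Lucene/Luwak prototype. I register a query written in Lucene syntax, and afterwards I supply an InputDocument, which should give me a match result for that query.
For TextFields everything seems to work. However, when I try to do the same with numbers (DoublePoint), I never get a match (for NOT queries / inverse searches).
If I use a text value, it works: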
storeRuleQuery("ruleID_1" , "textA:* -textA:A");
textValues.put("textA" , "B");
And in console: Match in Luwak: ruleID_1:textA:* -textA:A
versus:
storeRuleQuery("ruleID_1" , "numberA:* -numberA:500");
numberValues.put("numberA" , 900d);
And in console: No Match
Let me explain the code I am using.
First, I create a RAMDirectory for my monitor:
fsDirectory = new RAMDirectory();
I also define a field type:
private static final FieldType FIELD_TYPE = new FieldType();
FIELD_TYPE.setStored(false);
FIELD_TYPE.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
Then I create my monitor:
QueryIndexConfiguration config = new QueryIndexConfiguration();
config.storeQueries(true);
monitor = new Monitor(new LuwakQueryParser(null, new KeywordAnalyzer(), number, text), new TermFilteredPresearcher(), fsDirectory, config);
To be able to use DoublePoints, I created my own query parser (LuwakQueryParser):
public class LuwakQueryParser implements MonitorQueryParser {
private QueryParser parser = null;
/**
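* Creates a parser with a given default field and analyzer
* @param defaultField the default field
* @param analyzer an analyzer to use to analyze query terms
* @param numbers names of the fields to treat as numeric
* @param text names of the fields to treat as text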
*/
public LuwakQueryParser(String defaultField, Analyzer analyzer, List<String> numbers, List<String> text) {
this.parser = new RangeQueryParser(defaultField, analyzer, numbers, text);
this.parser.setLowercaseExpandedTerms(false);
this.parser.setAllowLeadingWildcard(true);
this.parser.setDefaultOperator(Operator.OR);
}
@Override
public Query parse(String query, Map<String, String> metadata) throws Exception {
return parser.parse(query);
}
}
As you can see, I use a custom RangeQueryParser, which is then used to parse the query:
public class RangeQueryParser extends QueryParser {
private final List<String> numbers;
private final List<String> text;
public RangeQueryParser(String f, Analyzer a, List<String> numbers, List<String> text) {
super(f, a);
this.numbers = numbers;
this.text = text;
}
@Override
protected Query newFieldQuery(Analyzer analyzer, String field, String queryText, boolean quoted) throws ParseException {
if (StringUtils.isNotBlank(queryText) && isNumber(field) && NumberUtils.isNumber(queryText)) {
// needed for a single value: transforms it into a range (e.g. [500 TO 500])
return (DoublePoint.newExactQuery(field, Double.parseDouble(queryText)));
} else if(isText(field)){
return (super.newFieldQuery(analyzer, field, queryText, quoted));
}
return (super.newFieldQuery(analyzer, field, queryText, quoted));
}
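I have removed the unused code that is not needed for this example.
As you can see, the newFieldQuery method checks whether the value is text or a number and adjusts the query accordingly. Text is stored as a normal field query, while a number is converted into a DoublePoint.newExactQuery. For example, it converts "numberA:500" into "numberA:[500 TO 500]".
Then I add a query to the monitor: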
//input: storeRuleQuery("ruleID_1" , "numberA:* -numberA:500");
public void storeRuleQuery(String ruleID, String query) throws IOException, UpdateException {
String queryString = query;
if (queryString.trim().length() > 0) {
MonitorQuery monitorQuery = new MonitorQuery(ruleID, queryString);
monitor.deleteById(ruleID);
monitor.update(monitorQuery);
}
}
This is the BooleanQuery created by the monitor.update() call:
Then I want to match ruleID_1 by supplying an InputDocument, like this:
Map<String, Double> numberValues = new HashMap<>();
Map<String, String> textValues = new HashMap<>();
numberValues.put("numberA" , 900d);
InputDocument.Builder builder = InputDocument.builder("document_1");
for(String numberField : numberValues.keySet()){
builder.addField(new DoublePoint(numberField, (numberValues.get(numberField))));
}
for(String textField : textValues.keySet()){
builder.addField(new Field(textField, (textValues.get(textField)), FIELD_TYPE));
}
List<InputDocument> documents = Collections.singletonList(builder.build());
DocumentBatch batch = DocumentBatch.of(documents);
Matches<HighlightsMatch> matches;
matches = monitor.match(batch, HighlightingMatcher.FACTORY);
This is the BooleanQuery created from this input document and our matcher.match():
Then I retrieve the matches (in this example I get 0 matches):
Set<Map<String, String>> matchingIds = new HashSet<>();
for (DocumentMatches<HighlightsMatch> docMatches : matches) {
for (HighlightsMatch match : docMatches) {
MonitorQuery mq = monitor.getQuery(match.getQueryId());
HashMap<String, String> q = new HashMap<>();
q.put(match.getQueryId(), mq.getQuery());
matchingIds.add(q);
}
}
Map<String, String> results = new HashMap<>();
for (Map<String, String> v : matchingIds) {
results.put(v.keySet().iterator().next(), v.values().iterator().next());
}
for(String key : results.keySet()){
System.out.println("Match in Luwak: " + key + ":" + results.get(key));
}
The Luwak version I am using:
<dependency>
<groupId>com.github.flaxsearch</groupId>
<artifactId>luwak</artifactId>
<version>1.5.0</version>
</dependency>
1 Answer
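Short answer: a wildcard can only match characters; it cannot be used on numbers. I now express the numeric wildcard as a range instead, e.g. numberA:[-Double.MAX_VALUE TO Double.MAX_VALUE]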
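
One way to apply this without changing the stored rule syntax is to rewrite the query string before it is handed to the monitor. The sketch below is only an illustration using plain Java, not part of Luwak or Lucene: the class name NumericWildcardRewriter and its rewrite method are hypothetical. It assumes that field names consist of word characters, that "field:*" never appears inside a quoted phrase, and that the range handling in RangeQueryParser (removed from the code shown in the question) turns the resulting range into a DoublePoint range query.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: replaces a bare "field:*" clause on a numeric field
// with an explicit full-range clause, leaving every other clause untouched.
public class NumericWildcardRewriter {

    // "field:*" where the "*" is the whole term (followed by whitespace, ")" or end of input)
    private static final Pattern BARE_WILDCARD = Pattern.compile("(\\w+):\\*(?=\\s|\\)|$)");

    static String rewrite(String query, List<String> numberFields) {
        Matcher m = BARE_WILDCARD.matcher(query);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String field = m.group(1);
            // Only numeric fields are rewritten; text fields keep their wildcard.
            String replacement = numberFields.contains(field)
                    ? field + ":[" + (-Double.MAX_VALUE) + " TO " + Double.MAX_VALUE + "]"
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> numberFields = Arrays.asList("numberA");
        System.out.println(rewrite("numberA:* -numberA:500", numberFields));
        System.out.println(rewrite("textA:* -textA:A", numberFields));
    }
}
```

With such a helper, storeRuleQuery could pass the rewritten string to MonitorQuery, so a rule like "numberA:* -numberA:500" would be stored with a range clause in place of the wildcard. This fits the answer's point: a wildcard is expanded against indexed terms, while DoublePoint values are indexed as points rather than terms, so there is nothing for "numberA:*" to match.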