Solr charFilter does not allow regex queries [Solr 7.6.0]

mm5n2pyu  posted on 2022-10-21  in  Solr

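I am trying to run a regex query against a Solr solr.TextField field. Is this supposed to be supported on that field type?
For example, curl -g 'http://localhost:8983/solr/shard/select?rows=0&q=body:/hello/' returns more than 0 results,
but when I switch to curl -g 'http://localhost:8983/solr/shard/select?rows=0&q=body:/h[aeiou]llo/' I get 0 results.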

<fieldType name="body_text" class="solr.TextField" positionIncrementGap="100" multiValued="false">
    <analyzer>
      <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[^a-zA-Z0-9_@-]+" replacement=" "/>
      <tokenizer class="solr.WhitespaceTokenizerFactory" rule="java" />
      <filter class="solr.LengthFilterFactory" min="2" max="45"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.SynonymGraphFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
    </analyzer>
</fieldType>

<field name="body" type="body_text" uninvertible="true" indexed="true" stored="false"/>

When I add debugQuery=true, I can see that my charFilter replacement does not let the regex characters through:

"debug":{
    "rawquerystring":"body:/h[aeiou]llo/",
    "querystring":"body:/h[aeiou]llo/",
    "parsedquery":"RegexpQuery(body:/h aeiou llo/)",
    "parsedquery_toString":"body:/h aeiou llo/",
    "explain":{},
    "QParser":"LuceneQParser",

yws3nbqq1#

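PatternReplaceCharFilterFactory is also applied to the regex query text, and it replaces every character that matches its pattern with a space, including the regex special characters. The "[" and "]" are therefore stripped from the query and no documents are found: the query h[aeiou]llo becomes h aeiou llo, exactly as the parsedquery line in the debug output shows.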
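The effect on the query string can be reproduced outside Solr. The following is only a rough Python sketch of the replacement step, not Solr's implementation: it assumes the charFilter pattern behaves the same under Python's `re` module as under Java's regex engine (which holds for this simple character class), and the function name `apply_char_filter` is made up for illustration.

```python
import re

# Same pattern and replacement as in the charFilter definition of the question.
CHAR_FILTER_PATTERN = re.compile(r"[^a-zA-Z0-9_@-]+")
CHAR_FILTER_REPLACEMENT = " "


def apply_char_filter(text):
    """Hypothetical helper: mimic only the pattern replacement step."""
    return CHAR_FILTER_PATTERN.sub(CHAR_FILTER_REPLACEMENT, text)


if __name__ == "__main__":
    # A plain term contains no characters outside the allowed set.
    print(apply_char_filter("hello"))
    # The brackets of the character class are replaced by spaces.
    print(apply_char_filter("h[aeiou]llo"))
```

Any regex syntax outside `a-zA-Z0-9_@-` (brackets, `.`, `*`, `+`, `|`, and so on) is affected in the same way, which is why only the literal query /hello/ still matches.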
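One way to keep both the pattern replacement and regex queries is to use PatternReplaceFilterFactory, a token filter placed after the tokenizer, instead of the charFilter: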

<fieldType name="body_text" class="solr.TextField" positionIncrementGap="100" multiValued="false">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory" rule="java" />
      <filter class="solr.PatternReplaceFilterFactory" pattern="[^a-zA-Z0-9_@-]+" replacement=" "/>
      <filter class="solr.LengthFilterFactory" min="2" max="45"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.SynonymGraphFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
    </analyzer>
</fieldType>

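Just check whether this works for your use case. A token filter runs after tokenization, so the replacement space ends up inside the token instead of splitting it into separate tokens as the charFilter did.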
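To see why the two configurations may index text differently, here is a simplified Python simulation. It is a sketch under stated assumptions, not Solr code: whitespace tokenization is approximated with `str.split()`, the remaining filters (length, lowercase, stop words, synonyms) are ignored, and both function names are hypothetical.

```python
import re

# Pattern shared by both field type definitions.
PATTERN = re.compile(r"[^a-zA-Z0-9_@-]+")


def char_filter_then_tokenize(text):
    """Original setup: rewrite the raw text first, then split on whitespace."""
    return PATTERN.sub(" ", text).split()


def tokenize_then_token_filter(text):
    """Suggested setup: split on whitespace first, then rewrite each token."""
    return [PATTERN.sub(" ", token) for token in text.split()]


if __name__ == "__main__":
    sample = "foo.bar baz"
    print(char_filter_then_tokenize(sample))   # three separate tokens
    print(tokenize_then_token_filter(sample))  # first token keeps an inner space
```

With the charFilter, "foo.bar" is indexed as the two terms "foo" and "bar"; with the token filter it stays a single term containing a space. If that matters for your data, compare the output of both field types on a few real documents before switching.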
