If I use the keyword_repeat filter in my index settings, then a bool query with should clauses only matches against the field in the first clause. Elasticsearch version: 8.7.1
Create the index
curl -X PUT "elasticsearch:9200/my-test-index?pretty" -H 'Content-Type: application/json' -d'
{
"settings": {
"analysis": {
"analyzer": {
"default": {
"tokenizer": "default_tokenizer",
"filter": [
"lowercase",
"keyword_repeat",
"default_stemmer"
]
}
},
"tokenizer": {
"default_tokenizer": {
"type": "standard"
}
},
"filter": {
"default_stemmer": {
"type": "stemmer",
"language": "english"
},
"unique_stem": {
"type": "unique",
"only_on_same_position": true
}
}
}
},
"mappings": {
"properties": {
"field1": {
"type": "text"
},
"field2": {
"type": "text"
}
}
}
}
'
Add a document
curl -X POST "elasticsearch:9200/my-test-index/_doc/1?pretty" -H 'Content-Type: application/json' -d'
{
"field1": "running man",
"field2": "other text"
}
'
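A newly indexed document only becomes searchable after the index is refreshed, so a search sent immediately after indexing can miss it. The sketch below is one way to rule that out when reproducing the problem. The `ES_URL` and `INDEX` variables and the function names are assumptions introduced for illustration; the `refresh=wait_for` parameter and the `_refresh` endpoint are standard Elasticsearch APIs.

```shell
# Sketch only: ES_URL, INDEX and the function names are illustrative assumptions.
ES_URL="${ES_URL:-http://elasticsearch:9200}"
INDEX="my-test-index"

# Same document as in the original post.
DOC_BODY='{
  "field1": "running man",
  "field2": "other text"
}'

# Index the document and return only once it is visible to search.
index_doc_and_wait() {
  curl -s -X POST "$ES_URL/$INDEX/_doc/1?refresh=wait_for&pretty" \
    -H 'Content-Type: application/json' -d "$DOC_BODY"
}

# Alternative: force a refresh of the index explicitly before searching.
refresh_index() {
  curl -s -X POST "$ES_URL/$INDEX/_refresh?pretty"
}
```

Calling `index_doc_and_wait` (or `refresh_index` after the original indexing request) before the searches removes refresh timing as a variable.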
Search for the document
curl -X GET "elasticsearch:9200/my-test-index/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"should": [
{ "match": { "field2": "running" }},
{ "match": { "field1": "running" }}
]
}
}
}
'
Response:
{
"took" : 243,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
I expected the document to be found.
However, a request with the clauses in the opposite order (field1 first, then field2):
curl -X GET "elasticsearch:9200/my-test-index/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"should": [
{ "match": { "field1": "running" }},
{ "match": { "field2": "running" }}
]
}
}
}
'
This request finds the document:
{
"took" : 62,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.92058265,
"hits" : [
{
"_index" : "my-test-index",
"_id" : "1",
"_score" : 0.92058265,
"_source" : {
"field1" : "running man",
"field2" : "other text"
}
}
]
}
}
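I expect should clauses to behave like an OR condition, so both queries should return the document regardless of the order of the clauses. If I remove keyword_repeat from the index settings, everything works as expected and both queries find the document.
Term vectors for the document in the index that uses the keyword_repeat filter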
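To see which tokens the analyzer produces for the search text on each field, the `_analyze` API can be queried directly. This is a minimal sketch: the `ES_URL` and `INDEX` variables and the two helper functions are assumptions introduced here, and the body builder does no JSON escaping, so it only suits simple values such as the ones in this post.

```shell
# Sketch only: ES_URL, INDEX and the helper names are illustrative assumptions.
ES_URL="${ES_URL:-http://elasticsearch:9200}"
INDEX="my-test-index"

# Build the _analyze request body. $1 = field name, $2 = text to analyze.
# No JSON escaping is performed; pass plain words only.
analyze_body() {
  printf '{ "field": "%s", "text": "%s" }' "$1" "$2"
}

# Run the text through the analyzer configured for the given field.
analyze_field() {
  curl -s -X GET "$ES_URL/$INDEX/_analyze?pretty" \
    -H 'Content-Type: application/json' -d "$(analyze_body "$1" "$2")"
}
```

For example, `analyze_field field1 running` and `analyze_field field2 running` show the tokens and positions that each match clause is built from.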
curl -X GET "elasticsearch:9200/my-test-index/_termvectors/1?pretty&fields=field1,field2"
{
"_index" : "my-test-index",
"_id" : "1",
"_version" : 1,
"found" : true,
"took" : 116,
"term_vectors" : {
"field2" : {
"field_statistics" : {
"sum_doc_freq" : 2,
"doc_count" : 1,
"sum_ttf" : 4
},
"terms" : {
"other" : {
"term_freq" : 2,
"tokens" : [
{
"position" : 0,
"start_offset" : 0,
"end_offset" : 5
},
{
"position" : 0,
"start_offset" : 0,
"end_offset" : 5
}
]
},
"text" : {
"term_freq" : 2,
"tokens" : [
{
"position" : 1,
"start_offset" : 6,
"end_offset" : 10
},
{
"position" : 1,
"start_offset" : 6,
"end_offset" : 10
}
]
}
}
},
"field1" : {
"field_statistics" : {
"sum_doc_freq" : 3,
"doc_count" : 1,
"sum_ttf" : 4
},
"terms" : {
"man" : {
"term_freq" : 2,
"tokens" : [
{
"position" : 1,
"start_offset" : 8,
"end_offset" : 11
},
{
"position" : 1,
"start_offset" : 8,
"end_offset" : 11
}
]
},
"run" : {
"term_freq" : 1,
"tokens" : [
{
"position" : 0,
"start_offset" : 0,
"end_offset" : 7
}
]
},
"running" : {
"term_freq" : 1,
"tokens" : [
{
"position" : 0,
"start_offset" : 0,
"end_offset" : 7
}
]
}
}
}
}
}
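In the term vectors above, other, text and man each appear twice at the same position (term_freq 2). That is consistent with keyword_repeat emitting every token twice while the unique_stem filter, although defined in the settings, is not listed in the analyzer's filter chain. The sketch below shows a variant of the settings with unique_stem appended to the chain. The index name `my-test-index-dedup`, the variable names and the function name are hypothetical, the built-in `standard` tokenizer is referenced directly instead of through `default_tokenizer`, and whether this change affects the ordering behavior reported here has not been established.

```shell
# Sketch only: ES_URL, DEDUP_INDEX and create_dedup_index are illustrative assumptions.
ES_URL="${ES_URL:-http://elasticsearch:9200}"
DEDUP_INDEX="my-test-index-dedup"   # hypothetical index name

# Same analysis settings as the original post, with unique_stem added
# as the last filter so duplicate tokens at the same position are dropped.
DEDUP_SETTINGS='{
  "settings": {
    "analysis": {
      "analyzer": {
        "default": {
          "tokenizer": "standard",
          "filter": ["lowercase", "keyword_repeat", "default_stemmer", "unique_stem"]
        }
      },
      "filter": {
        "default_stemmer": { "type": "stemmer", "language": "english" },
        "unique_stem": { "type": "unique", "only_on_same_position": true }
      }
    }
  },
  "mappings": {
    "properties": {
      "field1": { "type": "text" },
      "field2": { "type": "text" }
    }
  }
}'

# Create the variant index with the settings above.
create_dedup_index() {
  curl -s -X PUT "$ES_URL/$DEDUP_INDEX?pretty" \
    -H 'Content-Type: application/json' -d "$DEDUP_SETTINGS"
}
```

After indexing the same document into this index, the `_termvectors` request can be repeated against it to compare term frequencies.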
I tested several versions of Elasticsearch and got the following results:
8.8.1 - works as expected
8.8.0 - works as expected
8.7.1 - problem occurs
8.7.0 - problem occurs
8.6.2 - works as expected
1 answer
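The order of the clauses does not matter, so the queries below should return the same result. The first result you saw may have been empty because of refresh_interval: a document is not searchable until the index has been refreshed after it was indexed.
Result: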