Problem with an Elasticsearch bool query that uses a filter

Posted by scyqe7ek on 2021-06-14 in ElasticSearch

I have a problem with a filter inside a bool query.
I want to apply a filter based on 3 fields, where at least 1 of the conditions must match:

$params = [
    'from' => 0,
    'size' => 25,
    'index' => 'document',
    'body' => [
        'query' => [
            'bool' => [
                'filter' => [
                    'bool' => [
                        'minimum_should_match' => 1,
                        'should' => [
                            'term' => [
                                'VISIBILITE' => 'T'
                            ],
                            'term' => [
                                'ECRITURE' => 'M'
                            ],
                            'term' => [
                                'LECTURE' => 'M'
                            ],
                        ]
                    ]
                ],
                'must' => [
                    [
                        'bool' => [
                            'should' => [ 
                                [
                                    'match' => [
                                        'OBJET' => $recherche,
                                    ]
                                ],
                            ] 
                        ]
                    ],
                ],
            ],
        ],
    ],
];

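I get no results from this query, yet I can see many relevant documents in the index.
Update for Opster Elasticsearch Ninja, after testing:
With the example you suggested, I get many results back.
However, when I add a must query on the OBJET field, I do not get the same results, even for documents that fully match the filter.
For example:
Search with the must clause only: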

{
    "took": 8,
    "timed_out": false,
    "_shards": {
        "total": 2,
        "successful": 2,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 1268,
            "relation": "eq"
        },
        "max_score": 13.616098,
        "hits": [
            {
                "_index": "document",
                "_type": "_doc",
                "_id": "26685",
                "_score": 13.616098,
                "_source": {
                    "NUMDOCUMENT": "26685",
                    "TYPEDOCUMENT": "Proc\u00e9dure",
                    "OBJET": "Proc\u00e9dure d'importation des index dans Marco 2",
                    "MOTCLES": "",
                    "LECTURE": "S",
                    "VISIBILITE": "T", // Must match on second search
                    "ECRITURE": "M" // Must match on second search
                }
            }
        ]
    }
}

Search with the must clause and the filter:

{
    "took": 9,
    "timed_out": false,
    "_shards": {
        "total": 2,
        "successful": 2,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 10000,
            "relation": "gte"
        },
        "max_score": 0,
        "hits": [
            {
                "_index": "document",
                "_type": "_doc",
                "_id": "431",
                "_score": 0,
                "_source": {
                    "NUMDOCUMENT": "431",
                    "TYPEDOCUMENT": "Document",
                    "OBJET": "Diagnostic informatique SAFC",
                    "LECTURE": "M",
                    "VISIBILITE": "T",
                    "ECRITURE": "M"
                }
            }
        ]
    }
}

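The same document no longer comes first (even though it satisfies the filter). It is as if the filter were affecting the score and relevance of the search results.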


rslzwgfq1#

I found the solution. I had forgotten a level of brackets in the should of the filter.
Bad:

'bool' => [
                'filter' => [
                    'bool' => [
                        'minimum_should_match' => 1,
                        'should' => [
                            'term' => [
                                'VISIBILITE' => 'T'
                            ],
                            'term' => [
                                'ECRITURE' => 'M'
                            ],
                            'term' => [
                                'LECTURE' => 'M'
                            ],
                        ]
                    ]
                ],

Good:

'bool' => [
                'filter' => [
                    'bool' => [
                        'minimum_should_match' => 1,
                        'should' => [ // one array per clause: repeated 'term' keys in a single PHP array overwrite each other
                            ['term' => ['VISIBILITE' => 'T']],
                            ['term' => ['ECRITURE' => 'M']],
                            ['term' => ['LECTURE' => 'M']],
                        ]
                    ]

                ],
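
A side note on the PHP arrays used here: when one array literal repeats the same string key, PHP keeps only the last value. Three 'term' entries written side by side in a single array therefore reach Elasticsearch as one clause. The following is a minimal sketch of that behaviour; the variable names $collapsed and $clauses are chosen only for illustration and do not come from the original post.

```php
<?php
// Repeated string keys in one array literal overwrite each other,
// so only the last 'term' entry survives.
$collapsed = [
    'term' => ['VISIBILITE' => 'T'],
    'term' => ['ECRITURE' => 'M'],
    'term' => ['LECTURE' => 'M'],
];

// One array per clause keeps all three conditions.
$clauses = [
    ['term' => ['VISIBILITE' => 'T']],
    ['term' => ['ECRITURE' => 'M']],
    ['term' => ['LECTURE' => 'M']],
];

echo json_encode(['should' => $collapsed]), PHP_EOL;
// {"should":{"term":{"LECTURE":"M"}}}
echo json_encode(['should' => $clauses]), PHP_EOL;
// {"should":[{"term":{"VISIBILITE":"T"}},{"term":{"ECRITURE":"M"}},{"term":{"LECTURE":"M"}}]}
```

Printing the request body with json_encode before sending it is a quick way to check that every intended clause is actually present.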

6tdlim6h2#

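The problem seems to lie in your bool query itself. At the top level, your query has two constructs:
1. A filter block with 3 should conditions, of which at least 1 must match. This filters, i.e. reduces, the set of documents on which the following must clause is executed.
2. A must block, which I suspect matches nothing in the reduced document set from step 1, so your query returns nothing.
To debug this, try the first block on its own, then merge the two, and see whether you get results. In case your must block simply has no matching data, I created the following example, which shows that data is returned when matching data exists: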

{
    "query": {
        "bool": {
            "should": [
                {
                    "term": {
                        "VISIBILITE": "T"
                    }
                },
                {
                    "term": {
                        "ECRITURE": "T"
                    }
                },
                {
                    "term": {
                        "LECTURE": "T"
                    }
                }
            ],
            "minimum_should_match": 1
        }
    }
}

And the search result, where _source shows the matching documents:

"hits": [
            {
                "_index": "minshouldmatch",
                "_type": "_doc",
                "_id": "2",
                "_score": 1.5686158,
                "_source": {
                    "VISIBILITE": "T", 
                    "ECRITURE": "T",
                    "LECTURE": "T"
                }
            },
            {
                "_index": "minshouldmatch",
                "_type": "_doc",
                "_id": "1",
                "_score": 0.18232156,
                "_source": {
                    "VISIBILITE": "T", // note: even with only 1 matching condition, the document still appears in the search results
                    "ECRITURE": "M",
                    "LECTURE": "M"
                }
            }
        ]
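
The debugging approach described in this answer (run the filter block alone, then merge it with the must clause) could be sketched in PHP roughly as follows. The helper names buildFilter, filterOnlyParams and combinedParams are hypothetical, and the search term 'importation' is only a placeholder; field names and values are taken from the question.

```php
<?php
// Hypothetical helper: the filter with one array per should clause.
function buildFilter(): array
{
    return [
        'bool' => [
            'minimum_should_match' => 1,
            'should' => [
                ['term' => ['VISIBILITE' => 'T']],
                ['term' => ['ECRITURE' => 'M']],
                ['term' => ['LECTURE' => 'M']],
            ],
        ],
    ];
}

// Step 1: the filter block on its own.
function filterOnlyParams(string $index): array
{
    return [
        'index' => $index,
        'body' => ['query' => ['bool' => ['filter' => [buildFilter()]]]],
    ];
}

// Step 2: the same filter merged with the scored must clause.
function combinedParams(string $index, string $recherche): array
{
    $params = filterOnlyParams($index);
    $params['body']['query']['bool']['must'] = [
        ['match' => ['OBJET' => $recherche]],
    ];
    return $params;
}

$step1 = filterOnlyParams('document');
$step2 = combinedParams('document', 'importation'); // placeholder search term

// Inspect the body that would be sent in step 2.
echo json_encode($step2['body'], JSON_PRETTY_PRINT), PHP_EOL;
```

Assuming the official elasticsearch-php client, each array would be passed to $client->search(...) and the hit totals of the two steps compared. If step 1 returns documents but step 2 does not, the must clause is the part that fails to match.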
