Elasticsearch: use minimum_should_match percentage in reference to the field's total terms

Posted by hwamh0ep on 2021-06-10 in ElasticSearch


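I am trying to avoid false positives when I query a phrase that is contained in another, longer phrase.
I was hoping that the minimum_should_match parameter would let me set the minimum number of matching terms relative to the total number of terms in the field.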

{
   "match": {
       "notices.title": {
           "query": "Juan Pedro",
           "minimum_should_match": "-1"
        }
   }
}

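The expected result is a match on A.title = "Dr. Juan Pedro" but not on B.title = "Dr. Juan Pedro Pan". As you can see, relative to the total number of terms in the field, the query matches all but one term in A (-1) and all but two terms in B (-2).
I have read the documentation and I know that the parameter is calculated against the total number of clauses in the query, but I was hoping there is a way to do this in reference to the total number of terms in the field.
Any ideas? Thanks!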
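To make the clause-count rule concrete, here is a minimal Python sketch of how a simple minimum_should_match value resolves, based on the rules described in the Elasticsearch documentation. The function name required_matches is hypothetical, this is not Elasticsearch's own code, and conditional specs such as "3<90%" are deliberately not handled.

```python
def required_matches(optional_clauses, spec):
    """Approximate how many optional clauses must match for a simple spec.

    Handles plain integers ("2", "-1") and percentages ("75%", "-25%") only.
    """
    spec = spec.strip()
    if spec.endswith("%"):
        raw = optional_clauses * int(spec[:-1]) / 100
    else:
        raw = int(spec)
    calc = int(raw)  # int() truncates toward zero, i.e. rounds the share down
    # A negative value means "this many clauses may be missing".
    result = optional_clauses + calc if raw < 0 else calc
    # Never require fewer than 0 or more than the number of clauses.
    return max(0, min(optional_clauses, result))


# The count depends only on the query terms, not on the field's length.
query_terms = "Juan Pedro".split()
print(required_matches(len(query_terms), "-1"))  # prints 1
```

With two query terms, "-1" requires only one matching term, so "Dr. Juan Pedro", "Dr. Juan Pedro Pan" and any other title containing either name would all qualify, whatever the length of the field.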
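Update

Following the solution described by @prernagupta, and to avoid building a variable number of match clauses in the query, I ended up using match_phrase. I then take the number of tokens in the query string, plus 1, and compare it with the title.length field that was created. This seems to work. Let me know if you believe it can produce any other errors that I am not seeing.

{
    "query": {
        "bool": {
            "must": [
                {
                    "match_phrase": {
                        "notices.title": {
                            "query": "Juan Pedro"
                        }
                    }
                },
                {
                    "term": {
                        "notices.title.length": 3
                    }
                }
            ]
        }
    }
}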
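The same request body can be generated for any phrase. The sketch below is only an illustration: the helper name build_exact_length_phrase_query and its extra_tokens parameter are hypothetical, and it assumes the query string is tokenized on whitespace in the same way as the token_count sub-field.

```python
import json


def build_exact_length_phrase_query(phrase, extra_tokens=1, field="notices.title"):
    """Build a bool query: phrase match plus an exact token count on the field."""
    # Count query tokens by whitespace (assumption: the length sub-field
    # uses the whitespace analyzer, so both sides count the same way).
    expected_length = len(phrase.split()) + extra_tokens
    return {
        "query": {
            "bool": {
                "must": [
                    {"match_phrase": {field: {"query": phrase}}},
                    {"term": {field + ".length": expected_length}},
                ]
            }
        }
    }


body = build_exact_length_phrase_query("Juan Pedro")
print(json.dumps(body, indent=2))
```

For "Juan Pedro" this produces a term clause with notices.title.length set to 3, so the field may hold exactly one token beyond the phrase.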

Thanks again!

elasticsearch Database indexing search Document

Source: https://stackoverflow.com/questions/64053504/elasticsearch-use-minimum-should-match-percentage-in-reference-to-the-field-to

2 Answers


hjzp0vay #1

You can use "must" and "must_not".
The "minimum_should_match": "-1" option matches "juan" OR "pedro".

2021-06-11

lsmd5eda #2

You can use the token_count field data type to achieve your minimum_should_match criterion.

Mapping:

{
    "mappings": {
        "properties": {
            "notices": {
                "properties": {
                    "title": {
                        "type": "text",
                        "fields": {
                            "length": {
                                "type": "token_count",
                                "analyzer": "whitespace"
                            }
                        }
                    }
                }
            }
        }
    }
}

Index data:

{"notices.title": "Dr. Juan Pedro"}
{"notices.title": "Dr. Juan Pedro Pan"}
{"notices.title": "Dr. Juan abc"}
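As a quick check of what the length sub-field would hold for these documents, the following sketch counts whitespace-separated tokens in plain Python. It only approximates the whitespace analyzer named in the mapping and does not call Elasticsearch.

```python
titles = ["Dr. Juan Pedro", "Dr. Juan Pedro Pan", "Dr. Juan abc"]

# str.split() with no argument splits on runs of whitespace,
# which is roughly what the whitespace analyzer does.
token_counts = {title: len(title.split()) for title in titles}

for title, count in token_counts.items():
    print(f"{count}  {title}")
```

The first and third titles have 3 tokens and the second has 4, so a length of 3 on its own cannot tell "Dr. Juan Pedro" from "Dr. Juan abc"; the term clauses are still needed.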

Search query:

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "notices.title": "Juan"
                    }
                },
                {
                    "match": {
                        "notices.title": "Pedro"
                    }
                },
                {
                    "term": {
                        "notices.title.length": 3
                    }
                }
            ]
        }
    }
}

Search result:

"hits": [
            {
                "_index": "notice",
                "_type": "_doc",
                "_id": "1",
                "_score": 1.6292782,
                "_source": {
                    "notices.title": "Dr. Juan Pedro"
                }
            }
        ]

Here you can set the value of notices.title.length to the total number of terms the title must contain, counting "juan" and "pedro".
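To see why only the first document comes back, the query logic can be imitated locally. This is a rough simulation, not Elasticsearch itself: the hypothetical helper standard_like_tokens stands in for the default analyzer on notices.title (lowercase, split on non-word characters), and the length check uses a whitespace split like the mapping above.

```python
import re


def standard_like_tokens(text):
    """Rough stand-in for the default text analysis: lowercase, split on non-word chars."""
    return [token for token in re.split(r"\W+", text.lower()) if token]


def matches(title, required_terms, required_length):
    """Imitate the bool/must query: every term present AND exact token count."""
    tokens = set(standard_like_tokens(title))
    has_all_terms = all(term.lower() in tokens for term in required_terms)
    has_exact_length = len(title.split()) == required_length
    return has_all_terms and has_exact_length


titles = ["Dr. Juan Pedro", "Dr. Juan Pedro Pan", "Dr. Juan abc"]
hits = [title for title in titles if matches(title, ["Juan", "Pedro"], 3)]
print(hits)  # prints ['Dr. Juan Pedro']
```

"Dr. Juan Pedro Pan" is rejected by the length check (4 tokens), and "Dr. Juan abc" is rejected because it lacks "pedro".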

2021-06-10
