Elasticsearch: query with best matches first

Posted by dpiehjr4 on 2022-12-03 in ElasticSearch

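I am looking into how to query articles from a database using a few words. The query should return a list of articles, best match first.
I have articles like this: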

{
  "article": "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged."
}

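Words to search for: "standard", "industry", "printer", ...
The response should therefore be a list of the articles that contain some of these words, sorted by how well they match.
What is the best way to index and search the data in this case?
Thank you.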

643ylb08

The best approach is to index the field as type "text" and then run a match query against that field.
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query.html
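This returns all matching documents ranked by a TF-IDF-style relevance score (recent Elasticsearch versions use BM25, a refinement of TF-IDF, by default), which means:

  • The more often a term occurs in a document, the higher that document scores.
  • The more often the term occurs across the whole dataset, the lower it scores.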

Create the index:

PUT test_articles
{
  "mappings": {
    "properties": {
      "article": {
        "type": "text"
      }
    }
  }
}

Add some documents:

POST test_articles/_doc
{
  "article": "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's."
}

Run the query:

GET test_articles/_search
{
  "query": {
    "match": {
      "article": "printing"
    }
  }
}
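
The question asks about several words at once, so here is a minimal sketch, assuming the test_articles index from the steps above. A match query analyzes the whole string into terms and combines them with OR by default, so any document containing at least one of the words is returned, sorted by descending _score. The minimum_should_match value of 2 and the "explain" flag are illustrative choices, not part of the original answer.

```
# Match any of the words; documents matching more (and rarer) terms rank higher.
GET test_articles/_search
{
  "query": {
    "match": {
      "article": "standard industry printer"
    }
  }
}

# Stricter variant: require at least two of the three terms,
# and ask Elasticsearch to explain how each score was computed.
GET test_articles/_search
{
  "explain": true,
  "query": {
    "match": {
      "article": {
        "query": "standard industry printer",
        "minimum_should_match": 2
      }
    }
  }
}
```

With "explain": true, each hit carries an _explanation object that breaks the score down into its term-frequency and inverse-document-frequency parts, which can help when checking why one article outranks another.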

From here you can start adding more features, such as:

  • Typo tolerance (fuzzy matching)
  • Synonyms
  • Stemming, etc.
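
The three features listed above could be sketched roughly as follows. Everything named here is an assumption for illustration: the index name test_articles_v2, the analyzer and filter names, and the single synonym pair are hypothetical. Stemming and synonyms are configured at index time through a custom analyzer, while typo tolerance is requested at query time with the fuzziness parameter of the match query.

```
# Hypothetical index with a custom analyzer: lowercase, synonyms, then English stemming.
PUT test_articles_v2
{
  "settings": {
    "analysis": {
      "filter": {
        "article_synonyms": {
          "type": "synonym",
          "synonyms": ["printer, typesetter"]
        },
        "english_stemmer": {
          "type": "stemmer",
          "language": "english"
        }
      },
      "analyzer": {
        "article_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "article_synonyms", "english_stemmer"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "article": {
        "type": "text",
        "analyzer": "article_analyzer"
      }
    }
  }
}

# Typo-tolerant query: "AUTO" allows a small edit distance that depends on term length.
GET test_articles_v2/_search
{
  "query": {
    "match": {
      "article": {
        "query": "standrad industy printer",
        "fuzziness": "AUTO"
      }
    }
  }
}
```

Documents have to be indexed into the new index before the search returns anything, because analyzer changes only apply to text indexed after the mapping is in place.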

Welcome to Elasticsearch!
