How to filter docs with identical nested property values

Posted by wvmv3b1j on 2021-06-10 in ElasticSearch


Given the document mapping below, how can I filter the source-string documents that contain at least 2 identical targetStrings.score values?

PUT /source-string
{
  "mappings": {
    "properties": {
      "id": { "type": "keyword" },
      "targetStrings": {
        "type": "nested",
        "properties": {
          "id": { "type": "keyword" },
          "score": { "type": "integer" }
        }
      }
    }
  }
}

Here is an example of an indexed source-string document whose targetStrings contain two entries with the same score of 1. This is the kind of document I would like to get back.

{
  "_index" : "source-string",
  "_type" : "_doc",
  "_id" : "VHS796CQKuFZo2GPmb1T",
  "_source" : {
    "id" : "VHS796CQKuFZo2GPmb1T",
    "targetStrings" : [
      {
        "score" : 1,
        "id" : "id1"
      },
      {
        "score" : 2,
        "id" : "id2"
      },
      {
        "score" : 1,
        "id" : "id3"
      }
    ]
  }
}
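
To pin down what "at least 2 identical scores" means, the condition can be written as a small predicate in plain Python, independent of Elasticsearch. This is only an illustrative sketch: `has_repeated_score` and its `min_repeats` parameter are hypothetical names, and the input is the example document from the question.

```python
from collections import Counter


def has_repeated_score(doc, min_repeats=2):
    """Return True if any score occurs at least `min_repeats` times
    among the nested targetStrings entries of `doc`."""
    counts = Counter(entry["score"] for entry in doc.get("targetStrings", []))
    return any(n >= min_repeats for n in counts.values())


# Example document from the question: score 1 appears twice.
doc = {
    "id": "VHS796CQKuFZo2GPmb1T",
    "targetStrings": [
        {"score": 1, "id": "id1"},
        {"score": 2, "id": "id2"},
        {"score": 1, "id": "id3"},
    ],
}

print(has_repeated_score(doc))  # True
```

A predicate like this is handy as a reference check when validating what the Elasticsearch query returns.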

elasticsearch

Source: https://stackoverflow.com/questions/64904154/how-to-filter-docs-with-identical-nested-property-values

1 Answer


rbl8hiat1#

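You can use a terms aggregation with min_doc_count, which returns only the terms whose document count is at least the configured number.
Below is a working example with index data, search query, and search result.
Index data: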

{
  "id": "VHS796CQKuFZo2GPmb1W",
  "targetStrings": [
    {
      "score": 3,
      "id": "id1"
    },
    {
      "score": 2,
      "id": "id2"
    },
    {
      "score": 1,
      "id": "id3"
    }
  ]
}
{
  "id": "VHS796CQKuFZo2GPmb1T",
  "targetStrings": [
    {
      "score": 1,
      "id": "id1"
    },
    {
      "score": 2,
      "id": "id2"
    },
    {
      "score": 1,
      "id": "id3"
    }
  ]
}
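
If you want to load these two sample documents in one request, one option is the Bulk API, which takes newline-delimited JSON: an action line followed by the document source, with a trailing newline at the end. The sketch below only builds that request body as a string; the helper name `build_bulk_body` and the choice to reuse each document's `id` as its `_id` are assumptions for illustration, not part of the original answer.

```python
import json

# The two sample documents from the answer.
docs = [
    {
        "id": "VHS796CQKuFZo2GPmb1W",
        "targetStrings": [
            {"score": 3, "id": "id1"},
            {"score": 2, "id": "id2"},
            {"score": 1, "id": "id3"},
        ],
    },
    {
        "id": "VHS796CQKuFZo2GPmb1T",
        "targetStrings": [
            {"score": 1, "id": "id1"},
            {"score": 2, "id": "id2"},
            {"score": 1, "id": "id3"},
        ],
    },
]


def build_bulk_body(index_name, documents):
    """Build an NDJSON body: one action line plus one source line per document."""
    lines = []
    for doc in documents:
        # Assumption: use the document's own "id" field as the Elasticsearch _id.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc["id"]}}))
        lines.append(json.dumps(doc))
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body("source-string", docs)
print(bulk_body)
```

The resulting string can be sent to the `_bulk` endpoint with the `application/x-ndjson` content type.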

Search query:

{
  "size": 0,
  "aggs": {
    "id_terms": {
      "terms": {
        "field": "id"
      },
      "aggs": {
        "nested_entries": {
          "nested": {
            "path": "targetStrings"
          },
          "aggs": {
            "targetStrings": {
              "terms": {
                "field": "targetStrings.score",
                "min_doc_count": 2
              }
            }
          }
        }
      }
    }
  }
}

Search result:

"aggregations": {
    "id_terms": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "VHS796CQKuFZo2GPmb1T",
          "doc_count": 1,
          "nested_entries": {
            "doc_count": 3,
            "targetStrings": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                {
                  "key": 1,
                  "doc_count": 2            <-- note this
                }
              ]
            }
          }
        },
        {
          "key": "VHS796CQKuFZo2GPmb1W",
          "doc_count": 1,
          "nested_entries": {
            "doc_count": 3,
            "targetStrings": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": []
            }
          }
        }
      ]
    }
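
Note that the response still lists every id bucket; documents without a repeated score simply come back with an empty inner `buckets` array. A client therefore has to pick out the ids whose inner buckets are non-empty. A minimal sketch of that post-processing step, assuming the response has already been parsed into a Python dict (`ids_with_repeated_scores` is a hypothetical helper, and the response below is trimmed to the relevant keys):

```python
def ids_with_repeated_scores(response):
    """Collect the keys of id buckets whose nested score buckets are non-empty."""
    matching = []
    for bucket in response["aggregations"]["id_terms"]["buckets"]:
        score_buckets = bucket["nested_entries"]["targetStrings"]["buckets"]
        if score_buckets:  # at least one score met min_doc_count
            matching.append(bucket["key"])
    return matching


# Trimmed copy of the search result shown above.
response = {
    "aggregations": {
        "id_terms": {
            "buckets": [
                {
                    "key": "VHS796CQKuFZo2GPmb1T",
                    "doc_count": 1,
                    "nested_entries": {
                        "doc_count": 3,
                        "targetStrings": {"buckets": [{"key": 1, "doc_count": 2}]},
                    },
                },
                {
                    "key": "VHS796CQKuFZo2GPmb1W",
                    "doc_count": 1,
                    "nested_entries": {
                        "doc_count": 3,
                        "targetStrings": {"buckets": []},
                    },
                },
            ]
        }
    }
}

print(ids_with_repeated_scores(response))  # ['VHS796CQKuFZo2GPmb1T']
```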

Update 1:
If you only want to retrieve documents in which a score occurs exactly 2 times, you can use a bucket selector aggregation:

{
  "size": 0,
  "aggs": {
    "id_terms": {
      "terms": {
        "field": "id"
      },
      "aggs": {
        "nested_entries": {
          "nested": {
            "path": "targetStrings"
          },
          "aggs": {
            "targetStrings": {
              "terms": {
                "field": "targetStrings.score"
              },
              "aggs": {
                "count_filter": {
                  "bucket_selector": {
                    "buckets_path": {
                      "values": "_count"
                    },
                    "script": "params.values == 2"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
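
The difference between the two variants is the comparison: `min_doc_count: 2` keeps scores that occur 2 or more times, while the `params.values == 2` script keeps only scores that occur exactly twice. The plain-Python sketch below mimics that difference on a made-up list of scores; `repeated_scores` and its parameters are hypothetical names used for illustration only.

```python
from collections import Counter


def repeated_scores(scores, exact=False, threshold=2):
    """Return the scores whose occurrence count meets the threshold.

    exact=False mimics min_doc_count (count >= threshold);
    exact=True mimics the bucket_selector script (count == threshold).
    """
    counts = Counter(scores)
    if exact:
        return sorted(s for s, n in counts.items() if n == threshold)
    return sorted(s for s, n in counts.items() if n >= threshold)


# Made-up scores: 1 occurs three times, 2 occurs twice, 3 occurs once.
scores = [1, 1, 1, 2, 2, 3]

print(repeated_scores(scores))              # [1, 2]
print(repeated_scores(scores, exact=True))  # [2]
```

So a document with three entries sharing a score matches the first query but not the second for that score.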
