Elasticsearch: query the maximum count of objects by field value

piah890a  posted on 2021-06-10  in  ElasticSearch

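Given the sample document below, I want to find the maximum action count per component name across all documents in the index. Can you help me find a way to do this?
Assuming the index contains only this one document, the expected result is: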

comp1 -> action1 -> max 2 times
comp1 -> action2 -> max 1 time
comp2 -> action2 -> max 1 time
comp2 -> action3 -> max 1 time

Sample document:

{
  "id": "AC103902:A13A_AC140008:01BB_5FA2E8FA_1C08:0007",
  "tokens": [
    {
      "name": "comp1",
      "items": [
        {
          "action": "action1",
          "attr": "value"
        },
        {
          "action": "action1",
          "attr": "value"
        },
        {
          "action": "action2",
          "attr": "value"
        }
      ]
    },
    {
      "name": "comp2",
      "items": [
        {
          "action": "action2",
          "attr": "value"
        },
        {
          "action": "action3",
          "attr": "value"
        }
      ]
    }
  ]
}

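Elasticsearch version: 7.9. I could loop over every document and do the calculation on the client side, but I'm curious whether there is already an ES query that can produce this summary from the documents in the index.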
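For comparison, the client-side loop mentioned above might look roughly like this sketch. The function name `max_action_counts` is made up for illustration, and the documents are assumed to be already fetched as Python dicts shaped like the sample document.

```python
from collections import Counter


def max_action_counts(docs):
    """Return {(component, action): highest count seen in any single document}."""
    best = {}
    for doc in docs:
        # Count each (component, action) pair within this one document.
        per_doc = Counter(
            (token["name"], item["action"])
            for token in doc.get("tokens", [])
            for item in token.get("items", [])
        )
        # Keep the largest per-document count seen so far for each pair.
        for key, count in per_doc.items():
            best[key] = max(best.get(key, 0), count)
    return best


# The sample document from the question, with the "attr" fields omitted.
sample_doc = {
    "id": "AC103902:A13A_AC140008:01BB_5FA2E8FA_1C08:0007",
    "tokens": [
        {"name": "comp1", "items": [
            {"action": "action1"}, {"action": "action1"}, {"action": "action2"},
        ]},
        {"name": "comp2", "items": [
            {"action": "action2"}, {"action": "action3"},
        ]},
    ],
}

result = max_action_counts([sample_doc])
for (comp, action), count in sorted(result.items()):
    print(f"{comp} -> {action} -> max {count}")
```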

jdgnovmf1#

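You need to map both the tokens array and the tokens.items array as nested in order to get correct results. Without nested, arrays of objects are flattened and the link between a component name and its own actions is lost.
Then, assuming your mapping looks like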

{
  "mappings": {
    "properties": {
      "tokens": {
        "type": "nested",
        "properties": {
          "items": {
            "type": "nested"
          }
        }
      }
    }
  }
}

you can run the following query:

GET index_name/_search
{
  "size": 0,
  "aggs": {
    "by_token_name": {
      "nested": {
        "path": "tokens"
      },
      "aggs": {
        "token_name": {
          "terms": {
            "field": "tokens.name.keyword"
          },
          "aggs": {
            "by_max_actions": {
              "nested": {
                "path": "tokens.items"
              },
              "aggs": {
                "max_actions": {
                  "terms": {
                    "field": "tokens.items.action.keyword"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

which produces these buckets:

[
  {
    "key" : "comp1",              <--
    "doc_count" : 1,
    "by_max_actions" : {
      "doc_count" : 3,
      "max_actions" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : "action1",    <--
            "doc_count" : 2
          },
          {
            "key" : "action2",    <--
            "doc_count" : 1
          }
        ]
      }
    }
  },
  {
    "key" : "comp2",              <--
    "doc_count" : 1,
    "by_max_actions" : {
      "doc_count" : 2,
      "max_actions" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : "action2",    <--
            "doc_count" : 1
          },
          {
            "key" : "action3",    <--
            "doc_count" : 1
          }
        ]
      }
    }
  }
]

which can easily be post-processed on the client side.
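A minimal sketch of that post-processing in Python. The helper name `flatten_buckets` is hypothetical. The bucket list is assumed to have been read from `response["aggregations"]["by_token_name"]["token_name"]["buckets"]`, which follows from the aggregation names in the query above. Note that each `doc_count` is a total over all matching documents, so it equals the per-document maximum only in the single-document case from the question.

```python
def flatten_buckets(buckets):
    """Turn the nested terms buckets into {(component, action): doc_count}."""
    counts = {}
    for comp in buckets:
        # Each component bucket holds its action buckets two levels down.
        for action in comp["by_max_actions"]["max_actions"]["buckets"]:
            counts[(comp["key"], action["key"])] = action["doc_count"]
    return counts


# The buckets shown above, trimmed to the fields this helper reads.
buckets = [
    {"key": "comp1", "doc_count": 1, "by_max_actions": {"doc_count": 3, "max_actions": {
        "buckets": [{"key": "action1", "doc_count": 2},
                    {"key": "action2", "doc_count": 1}]}}},
    {"key": "comp2", "doc_count": 1, "by_max_actions": {"doc_count": 2, "max_actions": {
        "buckets": [{"key": "action2", "doc_count": 1},
                    {"key": "action3", "doc_count": 1}]}}},
]

counts = flatten_buckets(buckets)
for (comp, action), n in counts.items():
    print(f"{comp} -> {action} -> {n}")
```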
