Kibana: Elasticsearch group by query does not return all the documents

Posted by a64a0gku on 2023-04-10 in Kibana

I'm using Elasticsearch 6.7.1. My index has a field cid, and I'm trying to group the documents by this field.

GET my_index/_search
{
  "size": 0,
  "aggs": {
    "group_by_cid": {
      "terms": {
        "field": "cid",
        "order": {
          "_count": "desc"
        }
      }
    }
  }
}

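It shows 5 buckets with a count greater than 1 and the remaining buckets with a count of 1, but I know there is another group, cid 21, with a count of 2, and it does not appear in the results.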

{
  "took" : 101,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 530161,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_cid" : {
      "doc_count_error_upper_bound" : 5,
      "sum_other_doc_count" : 530143,
      "buckets" : [
        {
          "key" : "15929",
          "doc_count" : 2
        },
        {
          "key" : "29",
          "doc_count" : 2
        },
        {
          "key" : "4781",
          "doc_count" : 2
        },
        {
          "key" : "48387",
          "doc_count" : 2
        },
        {
          "key" : "547",
          "doc_count" : 2
        },
        {
          "key" : "0303141597270",
          "doc_count" : 1
        },
        {
          "key" : "0404091598267",
          "doc_count" : 1
        },
        {
          "key" : "0606021000357",
          "doc_count" : 1
        },
        {
          "key" : "0606021000359",
          "doc_count" : 1
        },
        {
          "key" : "0606021000364",
          "doc_count" : 1
        }
      ]
    }
  }
}

So I added the following filter to the terms aggregation:

"min_doc_count": 2

Now it shows the same 5 buckets, still without 21.
Finally, when I add this to the query:

"query": {
    "term": {
      "cid": {
        "value": "21"
      }
    }
  },

it shows the group for cid 21 with a count of 2.

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_cid" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "21",
          "doc_count" : 2
        }
      ]
    }
  }
}

Why wasn't it shown in the first query? How can I get all groups with a count of at least 2?

Source: https://stackoverflow.com/questions/75945865/elasticsearch-group-by-query-does-not-return-all-the-documents

1 Answer

omvjsjqw1#

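I can see that you have a non-zero doc_count_error_upper_bound and sum_other_doc_count, which means that some of your buckets have been left out (the terms aggregation documentation for these two fields explains why in detail).
To improve this, you can specify a larger shard_size to get more accurate results. By default, shard_size = 1.5 * size + 10. Since size defaults to 10, shard_size is 25, which can be too low if your documents are not evenly spread across your shards and/or if some of your buckets are too big.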
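As a quick check of the arithmetic above, here is a minimal Python sketch of the default formula. The function name default_shard_size is hypothetical and only restates the formula quoted in this answer; it is not an Elasticsearch API.

```python
def default_shard_size(size: int = 10) -> int:
    """Default per-shard bucket count, per the formula quoted above:
    shard_size = 1.5 * size + 10 (hypothetical helper, not an ES API)."""
    return int(size * 1.5 + 10)


# With the default size of 10, each shard reports only its top 25 terms.
print(default_shard_size())
# Raising size also raises the default shard_size.
print(default_shard_size(100))
```

With 5 shards and roughly 530,000 documents, 25 terms per shard is a very small sample of each shard's terms.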
Try it like this, increasing shard_size until you get the desired result (the shard_size line is the only addition):

GET my_index/_search
{
  "size": 0,
  "aggs": {
    "group_by_cid": {
      "terms": {
        "field": "cid",
        "shard_size": 100,
        "order": {
          "_count": "desc"
        }
      }
    }
  }
}
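To illustrate why a bucket such as cid 21 can go missing, the sketch below mimics in plain Python how a terms aggregation merges per-shard top lists. It is not Elasticsearch code: the function names, the toy shard contents and the tie-break by term are assumptions made for illustration. It also suggests why min_doc_count: 2 alone did not help: as far as the documentation describes it, that threshold is applied after the shard results are merged, so a term that never makes a shard's top list is gone before the filter runs.

```python
from collections import Counter


def shard_top_terms(docs, shard_size):
    """Top `shard_size` terms of one shard, by count descending.
    Ties are broken by term ascending (an assumption for this sketch)."""
    counts = Counter(docs)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:shard_size])


def terms_agg(shards, size, shard_size, min_doc_count=1):
    """Merge the per-shard top lists, then filter and truncate."""
    merged = Counter()
    for docs in shards:
        merged.update(shard_top_terms(docs, shard_size))
    ranked = sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))
    # min_doc_count is applied only after merging the shard results.
    return [(term, n) for term, n in ranked if n >= min_doc_count][:size]


# Toy data: the two documents with cid "21" live on different shards,
# so on each shard "21" is just one of many terms with a count of 1.
shards = [
    ["29", "29", "21", "0001", "0002", "0003"],
    ["21", "0004", "0005", "0006"],
]

small = terms_agg(shards, size=10, shard_size=2, min_doc_count=2)
large = terms_agg(shards, size=10, shard_size=10, min_doc_count=2)
print(small)  # "21" is cut on both shards before the merge
print(large)  # every term reaches the merge, so "21" is counted
```

In this toy setup the larger shard_size only helps because it covers every term on each shard. On an index where most cid values are unique, a term with a count of 1 per shard competes with a very large number of equally ranked terms, so shard_size may need to be raised considerably before such a bucket shows up.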

2023-04-10
