Elasticsearch: select multiple distinct fields grouped by id

Posted by bqf10yzr on 2021-06-10 in ElasticSearch

I have an ES index built from documents in the format below. The documents are parsed from logs with a fluentd parser and then indexed into ES. Record format:
{"id":"id1","field1":"f1_val","message":"x","time":"x"}
{"id":"id1","field2":"f2_val","message":"x","time":"x"}
{"id":"id1","field3":"f3_val","field4":"f4_val","message":"x","time":"x"}
I want to group by the id field and merge the fields together, so that the data can be visualized as a table in a Kibana dashboard, like this:
{"id":"id1","field1":"f1_val","field2":"f2_val","field3":"f3_val","field4":"f4_val"}
In Kibana:

Id     Field1     Field2     Field3     Field4
id1    f1_val     f2_val     f3_val     f4_val

How can I group documents by id in Elasticsearch and select the distinct field values? Thanks!

cuxqih21 #1

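Since Elasticsearch does not support joins very well (and, by extension, neither does Kibana), I would recommend joining the documents in your application before putting them into the index. If that is not possible, I would use a transform, as suggested here:
https://discuss.elastic.co/t/combine-multiple-document-into-one-document-with-limited-fields-merging-of-documents/231758
Using this, I was able to get the desired table in my dashboard (result image).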
Steps to reproduce:
Create the log index

PUT log_index

Add some data

POST log_index/_doc
{"id": "1", "field1": "The"}

POST log_index/_doc
{"id": "1", "field2": "quick"}

POST log_index/_doc
{"id": "1", "field3": "brown", "field4": "fox"}

POST log_index/_doc
{"id": "2", "field1": "jumped"}

POST log_index/_doc
{"id": "2", "field2": "over"}

POST log_index/_doc
{"id": "2", "field3": "the"}

POST log_index/_doc
{"id": "2", "field4": "lazy"}

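Transform the log index into a joined index (I'm pretty sure the scripted metric could be written better; this was simply the first thing that worked):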

PUT _transform/join_logs
{
  "source": {
    "index": [
      "log_index"
    ]
  },
  "pivot": {
    "group_by": {
      "id.keyword": {
        "terms": {
          "field": "id.keyword"
        }
      }
    },
    "aggregations": {
      "field1": {
        "scripted_metric": {
          "init_script": "state.docs = []",
          "map_script": "state.docs.add(new HashMap(params['_source']))",
          "combine_script": "for (t in state.docs) { if(t.get('field1') != null){ return t.get('field1')}}  return null",
          "reduce_script": "states"
        }
      },
      "field2": {
        "scripted_metric": {
          "init_script": "state.docs = []",
          "map_script": "state.docs.add(new HashMap(params['_source']))",
          "combine_script": "for (t in state.docs) { if(t.get('field2') != null){ return t.get('field2')}}  return null",
          "reduce_script": "states"
        }
      },
      "field3": {
        "scripted_metric": {
          "init_script": "state.docs = []",
          "map_script": "state.docs.add(new HashMap(params['_source']))",
          "combine_script": "for (t in state.docs) { if(t.get('field3') != null){ return t.get('field3')}}  return null",
          "reduce_script": "states"
        }
      },
      "field4": {
        "scripted_metric": {
          "init_script": "state.docs = []",
          "map_script": "state.docs.add(new HashMap(params['_source']))",
          "combine_script": "for (t in state.docs) { if(t.get('field4') != null){ return t.get('field4')}}  return null",
          "reduce_script": "states"
        }
      }
    }
  },
  "dest": {
    "index": "joined_index"
  }
}
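
To make the four scripted_metric phases in the transform easier to follow, here is a rough Python simulation of what they do for a single field. It is only a sketch: the function names and the split of the sample documents into two shards are made up for illustration, and real shard placement is decided by Elasticsearch.

```python
from typing import Any, Optional


def combine(shard_docs: list, field: str) -> Optional[Any]:
    """Mimic init/map/combine on one (hypothetical) shard."""
    state_docs = []                    # init_script: state.docs = []
    for doc in shard_docs:             # map_script: runs once per document
        state_docs.append(dict(doc))   # copy of the document's _source
    for doc in state_docs:             # combine_script: first non-null value wins
        if doc.get(field) is not None:
            return doc[field]
    return None


def scripted_metric(shards: list, field: str) -> list:
    """reduce_script "states": return the per-shard results unchanged."""
    return [combine(docs, field) for docs in shards]


# Hypothetical distribution of the id "1" documents over two shards.
shards = [
    [{"id": "1", "field1": "The"}, {"id": "1", "field2": "quick"}],
    [{"id": "1", "field3": "brown", "field4": "fox"}],
]

print(scripted_metric(shards, "field1"))  # ['The', None]
print(scripted_metric(shards, "field3"))  # [None, 'brown']
```

Because the reduce step returns the list of per-shard results as-is, the value written for each field is a list with one entry per shard, and shards that held no matching document contribute a null entry.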

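Run the transform (for example with POST _transform/join_logs/_start)
Create an index pattern for the joined index
Open it in Discover and create a table. Save it and add it to the dashboard.
I did this under the assumption that each field appears only once among the documents with a given id. I don't know what happens if a field overlaps between documents.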
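
The first suggestion in this answer, joining the documents in the application before indexing, could look roughly like the sketch below. It also reports the overlap case left open above instead of silently picking a value. The function name `merge_by_id`, the choice to drop `message` and `time`, and the "first value wins" policy are assumptions made for illustration, not something the question prescribes.

```python
from collections import defaultdict


def merge_by_id(records, skip=("message", "time")):
    """Merge records sharing an id into one document per id.

    Returns (merged_docs, conflicts), where conflicts maps an id to the
    fields that appeared with more than one distinct value.
    """
    merged = {}
    conflicts = defaultdict(set)
    for rec in records:
        doc = merged.setdefault(rec["id"], {"id": rec["id"]})
        for key, value in rec.items():
            if key == "id" or key in skip:
                continue
            if key in doc and doc[key] != value:
                conflicts[rec["id"]].add(key)  # keep the first value, note the overlap
            else:
                doc[key] = value
    return list(merged.values()), {k: sorted(v) for k, v in conflicts.items()}


records = [
    {"id": "id1", "field1": "f1_val", "message": "x", "time": "x"},
    {"id": "id1", "field2": "f2_val", "message": "x", "time": "x"},
    {"id": "id1", "field3": "f3_val", "field4": "f4_val", "message": "x", "time": "x"},
]

docs, conflicts = merge_by_id(records)
print(docs)       # one merged document for id1 with field1..field4
print(conflicts)  # {} because no field repeats with a different value
```

The merged documents can then be indexed as usual, and a non-empty `conflicts` result is a signal that the one-value-per-field assumption does not hold for that id.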

fjnneemd #2

{
    "size": 0,
    "aggs": {
        "id_agg": {
            "terms": {
                "field": "id.keyword"
            },
            "aggs": {
                "by_field1": {
                    "terms": {
                        "field": "field1.keyword"
                    }
                },
                "by_field2": {
                    "terms": {
                        "field": "field2.keyword"
                    }
                },
                "by_field3": {
                    "terms": {
                        "field": "field3.keyword"
                    }
                    "terms": {
                        "field": "field3.keyword"
                    }
                }
            }
        }
    }
}
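
The query above returns nested buckets rather than flat rows, so the client still has to reshape the result into the table the question asks for. A minimal sketch of that step follows; the helper name `rows_from_aggs` and the hand-written `sample` response are hypothetical, although the `aggregations` / `buckets` / `key` layout follows the usual shape of a terms aggregation response.

```python
def rows_from_aggs(response, fields=("field1", "field2", "field3")):
    """Flatten the nested terms buckets into one row per id."""
    rows = []
    for id_bucket in response["aggregations"]["id_agg"]["buckets"]:
        row = {"id": id_bucket["key"]}
        for field in fields:
            values = [b["key"] for b in id_bucket[f"by_{field}"]["buckets"]]
            # Take the first (most frequent) value, or None if the field is absent.
            row[field] = values[0] if values else None
        rows.append(row)
    return rows


# Hand-written sample in the shape of a terms aggregation response.
sample = {
    "aggregations": {
        "id_agg": {
            "buckets": [
                {
                    "key": "id1",
                    "doc_count": 3,
                    "by_field1": {"buckets": [{"key": "f1_val", "doc_count": 1}]},
                    "by_field2": {"buckets": [{"key": "f2_val", "doc_count": 1}]},
                    "by_field3": {"buckets": [{"key": "f3_val", "doc_count": 1}]},
                }
            ]
        }
    }
}

print(rows_from_aggs(sample))
```

Note that a terms aggregation returns only the top 10 buckets unless `size` is set, so with many ids the outer `id_agg` would need a larger `size` or a paginated alternative.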
