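I want to use Elasticsearch to calculate the user retention rate:
1. Write event logs to a default index.
2. Transform them into an intermediate index: entity-centric data, grouped by acc.
3. Use a filters aggregation (or an adjacency matrix) to compute the per-day intersections.
The problem is step 2: how do I build a good transform?
Input event log: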
POST _bulk
{"index": {"_index": "test.u1"}}
{"acc":1001, "event":"create", "timestamp":"2020-08-01 09:00"}
{"index": {"_index": "test.u1"}}
{"acc":1001, "event":"login", "timestamp":"2020-08-01 10:00"}
{"index": {"_index": "test.u1"}}
{"acc":1001, "event":"login", "timestamp":"2020-08-02 10:00"}
{"index": {"_index": "test.u1"}}
{"acc":1001, "event":"login", "timestamp":"2020-08-03 10:00"}
{"index": {"_index": "test.u1"}}
{"acc":1002, "event":"create", "timestamp":"2020-08-01 10:00"}
{"index": {"_index": "test.u1"}}
{"acc":1002, "event":"login", "timestamp":"2020-08-02 10:00"}
{"index": {"_index": "test.u1"}}
{"acc":1002, "event":"login", "timestamp":"2020-08-02 11:00"}
{"index": {"_index": "test.u1"}}
{"acc":1003, "event":"create", "timestamp":"2020-08-01 10:00"}
{"index": {"_index": "test.u1"}}
{"acc":1004, "event":"create", "timestamp":"2020-08-02 10:00"}
{"index": {"_index": "test.u1"}}
{"acc":1004, "event":"login", "timestamp":"2020-08-02 10:00"}
{"index": {"_index": "test.u1"}}
{"acc":1004, "event":"login", "timestamp":"2020-08-03 10:00"}
Expected intermediate index:
{"acc":1001, "create":"08-01", "login":["08-01", "08-02", "08-03"]}
{"acc":1002, "create":"08-01", "login":["08-02"]}
{"acc":1003, "create":"08-01", "login":[]}
{"acc":1004, "create":"08-02", "login":["08-02", "08-03"]}
How can I generate the "login" array? Any better design is also welcome.
1 Answer
Solved it with aggs.scripted_metric.
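To make the idea concrete, here is a rough sketch in plain Python of what a scripted_metric aggregation does per acc bucket: initialize a state, map each event into it, combine per shard, then reduce the shard states into one entity document. This is only a simulation of the four phases on the sample data from the question, not the actual transform definition. The function names, the two-shard split, and the "MM-dd" day format are my own assumptions for illustration.

```python
from collections import defaultdict

# Sample events copied from the question: (acc, event, timestamp).
EVENTS = [
    (1001, "create", "2020-08-01 09:00"),
    (1001, "login", "2020-08-01 10:00"),
    (1001, "login", "2020-08-02 10:00"),
    (1001, "login", "2020-08-03 10:00"),
    (1002, "create", "2020-08-01 10:00"),
    (1002, "login", "2020-08-02 10:00"),
    (1002, "login", "2020-08-02 11:00"),
    (1003, "create", "2020-08-01 10:00"),
    (1004, "create", "2020-08-02 10:00"),
    (1004, "login", "2020-08-02 10:00"),
    (1004, "login", "2020-08-03 10:00"),
]


def day(ts):
    """Cut 'yyyy-MM-dd HH:mm' down to 'MM-dd' (assumed output format)."""
    return ts[5:10]


def init_state():
    """Plays the role of init_script: one empty state per shard and bucket."""
    return {"create": None, "login": set()}


def map_event(state, event, ts):
    """Plays the role of map_script: runs once per matching document."""
    if event == "create":
        state["create"] = day(ts)
    elif event == "login":
        state["login"].add(day(ts))  # a set removes same-day duplicates


def combine(state):
    """Plays the role of combine_script: what each shard sends back."""
    return state


def reduce_states(states):
    """Plays the role of reduce_script: merge all shard states."""
    create = None
    login = set()
    for s in states:
        if s["create"] is not None:
            create = s["create"] if create is None else min(create, s["create"])
        login |= s["login"]
    return {"create": create, "login": sorted(login)}


def build_entity_docs(events, shards=2):
    """Group by acc (the pivot) and run the four phases for every bucket."""
    states = defaultdict(lambda: [init_state() for _ in range(shards)])
    for i, (acc, event, ts) in enumerate(events):
        # Spread events over hypothetical shards to exercise the reduce step.
        map_event(states[acc][i % shards], event, ts)
    docs = []
    for acc in sorted(states):
        merged = reduce_states([combine(s) for s in states[acc]])
        docs.append({"acc": acc, **merged})
    return docs


ENTITY_DOCS = build_entity_docs(EVENTS)
for doc in ENTITY_DOCS:
    print(doc)
```

The point of collecting days into a set in the map phase and sorting only in the reduce phase is that acc 1002, which logs in twice on 08-02, still ends up with a single "08-02" entry.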
Intermediate index:
Finally, with a KQL filter, user retention for 2020-08-18 and 2020-08-19 is easy to get:
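As a hedged illustration of that last step, the sketch below computes retention in plain Python from entity documents shaped like the expected intermediate index in the question. It mirrors what a filter such as "created on day A and logged in on day B" would count; it is not the KQL itself. The `retention` helper is a hypothetical name, and the dates used are the ones from the sample data rather than the 08-18 and 08-19 mentioned above.

```python
# Entity documents in the shape the question expects from the transform.
ENTITY_DOCS = [
    {"acc": 1001, "create": "08-01", "login": ["08-01", "08-02", "08-03"]},
    {"acc": 1002, "create": "08-01", "login": ["08-02"]},
    {"acc": 1003, "create": "08-01", "login": []},
    {"acc": 1004, "create": "08-02", "login": ["08-02", "08-03"]},
]


def retention(docs, create_day, login_day):
    """Return (cohort size, retained count, rate) for one day pair.

    The cohort is every account created on create_day; an account counts
    as retained if login_day appears in its login array.
    """
    cohort = [d for d in docs if d["create"] == create_day]
    retained = [d for d in cohort if login_day in d["login"]]
    rate = len(retained) / len(cohort) if cohort else 0.0
    return len(cohort), len(retained), rate


# Next-day retention of the accounts created on 08-01.
cohort_size, retained_count, rate = retention(ENTITY_DOCS, "08-01", "08-02")
print(cohort_size, retained_count, round(rate, 3))  # 3 2 0.667
```

Because each account is a single document with a create day and a login array, every retention cell reduces to two term matches on the same document, which is why the entity-centric index makes this step simple.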