How to group by a column with comma-separated values

li9yvcax  posted on 2021-07-24  in Java
SELECT
    account_id,
    app_name,
    triggered_signatures,
    DATE,
    COUNT(*) AS cnt
FROM "public"."bus_request"
WHERE DATE >= '2020-06-22' AND triggered_signatures IS NOT NULL AND triggered_signatures != ''
GROUP BY account_id, app_name, triggered_signatures, DATE
ORDER BY account_id, cnt, DATE DESC

I get:

aaaaa   uuuu    200004025             2020-06-22      1
bbbbb   rrrr    200003092,200004268   2020-06-23     12
bbbbb   kkkk    200004268             2020-06-23    412

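How can I get the number of records for each individual value in triggered_signatures (which is a comma-separated list)?
For example, I would like to get:

aaaaa   uuuu    200004025     2020-06-22      1
bbbbb   rrrr    200004268     2020-06-23      7
bbbbb   rrrr    200003092     2020-06-23      5
bbbbb   kkkk    200004268     2020-06-23    412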

zqdjd7g9


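You basically need to unnest the array of values. There is no nice way to do this in Redshift: you have to cross join your table with a table of indexes, and then pick out the element at each index.
This is the same problem as unnesting JSON arrays, described here: https://stackoverflow.com/a/30198745/1680826, except that instead of the JSON_EXTRACT_ functions you should use REGEXP_COUNT(triggered_signatures, ',') + 1 to get the array length (the number of commas plus one) and SPLIT_PART(triggered_signatures, ',', index) to get each array element.
I hope this helps.
Example query: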

with input (account_id, app_name, triggered_signatures, date, cnt) as (
    select 'aaaaa', 'uuuu', '200004025', '2020-06-22', 1 
    union all
    select 'bbbbb', 'rrrr', '200003092,200004268', '2020-06-23', 12
    union all
    select 'bbbbb', 'kkkk', '200004268', '2020-06-23', 412
), sequence (index) as (
    select 1 union all
    select 2 union all
    select 3 union all
    select 4
)
select
    account_id,
    app_name,
    split_part(triggered_signatures, ',', index) as triggered_signatures,
    date,
    cnt
from input cross join sequence
where regexp_count(triggered_signatures, ',') + 1 >= index
;
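
The query above only splits the already aggregated rows, so each signature inherits the count of its combined row. To count records per individual signature, the same cross join can be applied to the raw table before grouping. The following is a rough sketch, not a tested solution: it assumes the table and filters from the question, and the `numbers (n)` helper is a hypothetical name that must contain at least as many rows as the largest number of signatures in a single value. Note that a record carrying several signatures is counted once for each of them.

```sql
-- Sketch only: numbers (n) is a hypothetical helper; extend it to the
-- maximum number of signatures that a single row can hold.
with numbers (n) as (
    select 1 union all
    select 2 union all
    select 3 union all
    select 4
)
select
    r.account_id,
    r.app_name,
    -- pick the n-th comma-separated signature from the list
    split_part(r.triggered_signatures, ',', n.n) as triggered_signature,
    r.date,
    count(*) as cnt
from "public"."bus_request" r
cross join numbers n
where r.date >= '2020-06-22'
  and r.triggered_signatures is not null
  and r.triggered_signatures != ''
  -- keep only indexes that exist in this row's list (commas + 1 elements)
  and n.n <= regexp_count(r.triggered_signatures, ',') + 1
group by 1, 2, 3, 4
order by account_id, cnt, date desc;
```

If some values can hold more than four signatures, the extra ones are silently dropped by this sketch, so it is worth checking max(regexp_count(triggered_signatures, ',') + 1) on the real data first.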
