雪花查询中带窗口函数的sql条件分组

lnxxn5zx  于 2021-08-13  发布在  Java
关注(0)|答案(2)|浏览(346)

我有一张雪花表,格式如下:

create temp_test(name string, split string, value int)

insert into temp_test
values ('A','a', 100), ('A','b', 200), ('A','c',300), ('A', 'd', 400), ('A', 'e',500), ('B', 'a', 1000), ('B','b', 2000), ('B','c', 3000), ('B', 'd',4000), ('B','e', 5000)

第一步,我只需要每个名称的前2个值(按值排序),所以我使用以下查询得到:

select name, split, value,
row_number() over (PARTITION BY (name) order by value desc) as row_num 
from temp_test
qualify row_num <= 2

结果如下:

NAME    SPLIT   VALUE   ROW_NUM
A       e       500     1
A       d       400     2
B       e       5000    1
B       d       4000    2

现在,我需要求除top 2以外的值的和,并将其放在另一个名为“others”的拆分中,如下所示:

NAME    SPLIT   VALUE   
A       e       500     
A       d       400     
A       Others  600        
B       e       5000    
B       d       4000
B       Others  6000

一般来说,如何在snowflake查询或sql中实现这一点?

jgovgodb

jgovgodb1#

with data as (
    select name, split, value,
        row_number() over (partition by (name) order by value desc) as row_num 
    from temp_test
)
select
    name,
    case when row_num <= 2 then split else 'Others' end as split,
    sum(value) as value
from data
group by name, case when row_num <= 2 then row_num else 3 end
sdnqo3pr

sdnqo3pr2#

shawnt00的答案是好的,但是对于雪花的记录来说,这可以写得更简单:
首先,末尾的group by可以按索引或名称引用结果:

GROUP BY 1,2

GROUP BY name, split

此外,由于该案件只有太多的分支,一个敌我识别码可以使用,似乎你正在使用一个cte添加行\你可以把敌我识别码到cte也

WITH data AS (
    SELECT name, value,
        ROW_NUMBER() OVER (PARTITION BY name ORDER BY value DESC) AS row_num,
        IFF(row_num < 3, split, 'Others') as n_split
    FROM VALUES ('A','a', 100), ('A','b', 200), ('A','c',300), ('A', 'd', 400), 
                ('A', 'e',500), ('B', 'a', 1000), ('B','b', 2000), ('B','c', 3000), 
                ('B', 'd',4000), ('B','e', 5000) 
    v(name, split, value)
)   
SELECT
    name,
    n_split,
    SUM(value) AS value
FROM data
GROUP BY name, n_split;

如果超级热衷于小型sql,请将行\u号放入iff:

WITH data AS (
    SELECT name, value,
        IFF(ROW_NUMBER() OVER (PARTITION BY name ORDER BY value DESC) < 3, split, 'Others') as n_split
    FROM VALUES ('A','a', 100), ('A','b', 200), ('A','c',300), ('A', 'd', 400), 
                ('A', 'e',500), ('B', 'a', 1000), ('B','b', 2000), ('B','c', 3000), 
                ('B', 'd',4000), ('B','e', 5000) 
    v(name, split, value)
)   
SELECT
    name,
    n_split AS split,
    SUM(value) AS value
FROM data
GROUP BY name, n_split;

给予:

NAME    SPLIT   VALUE
A       e       500
A       d       400
A       Others  600
B       e       5000
B       d       4000
B       Others  6000

相关问题