How to define multiple range clauses in a WindowSpec?

knsnq2tg posted on 2021-06-26 in Hive


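In Spark SQL, a window query can be defined with two or more ORDER BY columns, but there seems to be no way to define a range clause for each of those columns.
For example: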

select
row_id,
count(*) over (
    partition by group_id
    order by filter_key1, filter_key2
    range between 12 preceding and 12 following
    range between 5 preceding and 1 preceding
) as the_count
from table

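The query above fails (although my syntax may simply be off; fingers crossed...).
Can this be done in a single statement similar to the one above?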

sql Hive apache-spark apache-spark-sql hiveql

Source: https://stackoverflow.com/questions/36411726/how-to-define-multiple-range-clauses-in-windowspec

1 answer


yks3o0rb1#

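No, only one range clause is allowed per window. But don't despair: count(*) is additive, so you can sum the counts from two separate windows: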

select row_id,
       (count(*) over (partition by group_id
                       order by filter_key1, filter_key2
                       range between 12 preceding and 12 following
                      ) +
        count(*) over (partition by group_id
                       order by filter_key1, filter_key2
                       range between 5 preceding and 1 preceding
                      )
       ) as the_count
from table

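This particular example looks odd because the two ranges overlap, so rows that fall in both are counted twice. Perhaps that is your intent.
Based on your question, I wonder whether you actually want: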

select row_id,
       (count(*) over (partition by group_id
                       order by filter_key1
                       range between 12 preceding and 12 following
                      ) +
        count(*) over (partition by group_id
                       order by filter_key2
                       range between 5 preceding and 1 preceding
                      )
       ) as the_count
from table
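To see the additive approach in action, here is a minimal sketch that runs the same kind of query on a tiny data set. It makes several assumptions that are not in the original thread: it uses SQLite (through Python's built-in sqlite3 module, which needs SQLite 3.28 or newer for value-based range frames) as a stand-in for Spark SQL, the table name `events` and the sample rows are hypothetical, and both windows order by the single column filter_key1, since engines generally accept a numeric range offset only with one ORDER BY expression.

```python
import sqlite3

# Hypothetical sample data: (row_id, group_id, filter_key1).
# Column names mirror the question; the values are made up for illustration.
rows = [(1, "a", 0), (2, "a", 3), (3, "a", 10), (4, "a", 20), (5, "b", 7)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table events (row_id integer, group_id text, filter_key1 integer)"
)
conn.executemany("insert into events values (?, ?, ?)", rows)

# Two separate windows, one range clause each, summed into the_count.
query = """
select row_id,
       count(*) over (partition by group_id
                      order by filter_key1
                      range between 12 preceding and 12 following
                     ) as wide_count,
       count(*) over (partition by group_id
                      order by filter_key1
                      range between 5 preceding and 1 preceding
                     ) as narrow_count
from events
order by row_id
"""

# Map row_id -> (wide_count, narrow_count, the_count).
result = {
    row_id: (wide, narrow, wide + narrow)
    for row_id, wide, narrow in conn.execute(query)
}

for row_id, counts in result.items():
    print(row_id, counts)
```

In this data set, row 2 (filter_key1 = 3) gets a wide count of 3 and a narrow count of 1, because row 1 (filter_key1 = 0) lies inside both frames and is therefore counted twice in the sum. That is the overlap effect mentioned above; if double counting is not wanted, the two frames would need to be disjoint.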
