SQL Server Calculate rolling average of a measure based on certain condition from another column

5vf7fwbs 于 2023-03-17 发布在其他

关注(0)|答案(1)|浏览(122)

Below is the given table.
| date | id1 | id2 | fact | boolean |
| ------------ | ------------ | ------------ | ------------ | ------------ |
| 2022-01-01 | i1 | l1 | 100 | 0 |
| 2022-01-02 | i1 | l1 | 10 | 1 |
| 2022-01-03 | i1 | l1 | 110 | 0 |
| 2022-01-04 | i1 | l1 | 70 | 0 |
| 2022-01-05 | i1 | l1 | 20 | 1 |
| 2022-01-05 | i1 | l1 | 30 | 1 |
| 2022-01-06 | i1 | l1 | 40 | 0 |
| 2022-01-07 | i1 | l1 | 50 | 0 |
| 2022-01-08 | i1 | l1 | 70 | 0 |
| 2022-01-09 | i1 | l1 | 100 | 1 |
| 2022-01-09 | i1 | l1 | 10 | 1 |
| 2022-01-10 | i1 | l1 | 200 | 0 |

Expected outcome for rolling average:

date	id1	id2	fact	boolean	rolling_avg
2022-01-01	i1	l1	100	0
2022-01-02	i1	l1	10	1	100
2022-01-03	i1	l1	110	0
2022-01-04	i1	l1	70	0
2022-01-05	i1	l1	20	1	90
2022-01-05	i1	l1	30	1	90
2022-01-06	i1	l1	40	0
2022-01-07	i1	l1	50	0
2022-01-08	i1	l1	70	0
2022-01-09	i1	l1	100	1	53.33
2022-01-09	i1	l1	10	1	53.33
2022-01-10	i1	l1	200	0

Rolling avg of 'fact' column for preceding 3 days required only for rows where boolean = 1 and while calculating the average consider only those values where boolean is 0.

Example1: for row 5, preceding 3 days are 4, 3 and 2 where boolean is 0 for 4 and 3. Hence (110+70/2) = 90

Example2: for row 9, preceding 3 days are 8, 7 and 6 where all have boolean = 0. Hence (70+50+40/3) = 53.33

I've tried doing

AVG(expression) OVER (PARTITION BY id1, id2 
                      ORDER BY date DESC 
                      ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) as rolling_avg

However the result I got didn't match the required outcome.

sql-server

来源：https://stackoverflow.com/questions/75705967/calculate-rolling-average-of-a-measure-based-on-certain-condition-from-another-c

1条答案

按热度按时间

fgw7neuy1#

If you dates may have gaps, then I would recommend a lateral join:

select t.*, a.*
from mytable t
outer apply (
    select avg(t1.fact) rolling_avg
    from mytable t1
    where t.boolean = 1
        and t1.id1 = t.id1 and t1.id2 = t.id2 
        and t1.date >= dateadd(day, -3, t.date) and t1.date < t.date
        and t1.boolean = 0
) a

The subquery selects rows that belong to the same ids over the 3 previous days and computes the average fact, with respect to the boolean flag of the current and previous rows.

If there is always one and only one fact row per date for a given tuple of ids, then window functions with a row frame are relevant (and more efficient):

select t.*, 
    case when boolean = 1 then
        avg(case when boolean = 0 then fact end) over(
            partition by id1, id2 
            order by date
            rows between 3 preceding and 1 preceding
        ) 
    end rolling_avg
from mytable t

赞(0）回复(0）举报 2023-03-17

我来回答

SQL Server Calculate rolling average of a measure based on certain condition from another column

1条答案

相关问题

热门标签

最新问答