sql—3周移动平均值,回溯时间从2周的星期日开始到当前日期

pgccezyw  于 2021-07-24  发布在  Java
关注(0)|答案(4)|浏览(718)

我想找出三周的移动平均线。但这里的渔获量是指每天(本周)的渔获量,应追溯到2周的周日(从上周日开始到上周日,不包括本周周日),然后计算到当天。
下面是示例数据。
“日期”列中每天有一个条目。我们需要计算每个lob每个项目的移动平均值。为了简单起见,我只使用了一个项目、一个lob和三周的数据。

以下是预期输出:

说明:
对于7月20日,它从7月5日(2周前的周日)到7月20日获取数据。所以它把这段时间的销售额加起来,是44000,然后除以天数,是16。7月21日也是这样,从7月5日开始到7月21日,这段时间的累计销售额是45000天,天数是17天(所以45000/17=2647.05),以此类推。
我正在努力改变典型的窗口函数查询来实现这一点:

avg(sales) over(partition by project,lob order by date rows 21 preceding)
0qx6xfy6

0qx6xfy61#

这是一个teradata解决方案,可以很容易地移植到sqlserver(仅限于 td_sunday 是teradata语法,返回上一个星期日,用于定义星期)。逻辑很简单,基于avg=sum/count:计算每周数据,使用前两周的数据,并将当前周的累计销售额相加。

WITH cte AS
 (
   SELECT t.*
     -- sum/count per week 
     ,SUM  (sales) OVER (PARTITION BY lob, project, td_sunday(dt)) AS week_sum
     ,COUNT(sales) OVER (PARTITION BY lob, project, td_sunday(dt)) AS week_cnt
     -- cumulative sum/count for current week
     ,SUM  (sales) OVER (PARTITION BY lob, project, td_sunday(dt)
                         ORDER BY dt ROWS UNBOUNDED PRECEDING)     AS cum_sum
     ,COUNT(sales) OVER (PARTITION BY lob, project, td_sunday(dt)
                         ORDER BY dt ROWS UNBOUNDED PRECEDING)     AS cum_cnt
   FROM vt as t
 )
SELECT cte.*
  ,CASE -- both weeks before exist
      WHEN     LAG(week_cnt,14) OVER (PARTITION BY lob, project ORDER BY dt) = 7
      THEN
        CAST( -- sum of the previous two weeks + current week
             ( LAG(week_sum, 7) OVER (PARTITION BY lob, project ORDER BY dt) -- previous week
              +LAG(week_sum,14) OVER (PARTITION BY lob, project ORDER BY dt) -- two weeks before
              +cum_sum
             ) as decimal(38,3)
            ) / 
        -- number of rows in previous two weeks + current week
        (cum_cnt + 14) 
      ELSE 0
    END
FROM cte
ORDER BY dt;

在teradata中相当有效,explain只显示了两个*stat函数步骤。
顺便说一句,这不完全是你的预期结果,因为它返回了7月19日的平均值。但这可以通过一些额外的逻辑来解决。

zdwk9cvp

zdwk9cvp2#

您需要在表或时间维度中添加周数,然后根据周数而不是日期定义窗口。

pdsfdshx

pdsfdshx3#

t-sql(6月7日是前一个星期日):

with DateWindows as (
 select DateColumn, dateadd(wk, datediff(wk, '2020-06-07', DateColumn)-2, '2020-06-07') as SundayStartingDate
 from MyDateTable
)
select DW.DateColumn, DW.SundayStartingDate, count(*) as DaysCount, sum(Sales) as RollingSalesAmt, avg(Sales) as RollingAvgSalesAmt
from DateWindows DW
join MyDateTable T on DW.DateColumn between DW.SundayStartingDate and DW.DateColumn
group by DW.DateColumn between DW.SundayStartingDate
order by DW.DateColumn;

2020-07-20={44000,162750}2020-07-21={45000,172647}…等等。。。

wrrgggsh

wrrgggsh4#

暴力总是存在的:

select
    "date", sales, lob, project,
    sum(sales) over (partition by lob, project order by "date" rows between 13 preceding and current row) +
    case when datepart(weekday, "date") >= 1 then lag(sales, 14) over (partition by lob, project order by "date") else 0 end +
    case when datepart(weekday, "date") >= 2 then lag(sales, 15) over (partition by lob, project order by "date") else 0 end +
    case when datepart(weekday, "date") >= 3 then lag(sales, 16) over (partition by lob, project order by "date") else 0 end +
    case when datepart(weekday, "date") >= 4 then lag(sales, 17) over (partition by lob, project order by "date") else 0 end +
    case when datepart(weekday, "date") >= 5 then lag(sales, 18) over (partition by lob, project order by "date") else 0 end +
    case when datepart(weekday, "date") >= 6 then lag(sales, 19) over (partition by lob, project order by "date") else 0 end +   
    case when datepart(weekday, "date") >= 7 then lag(sales, 20) over (partition by lob, project order by "date") else 0 end
    ) / (14.0 + datepart(weekday, "date")) as rolling_avg
from t;

这是假设 @@datefirst 设置为星期天,因此根据需要进行调整。
https://rextester.com/rasth4795

相关问题