Get active weeks from start and end dates

Posted by siotufzp on 2021-06-24 in Hive

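I have subscription data like the following. It shows when each user purchased a subscription, with the columns user_id, subscription_id, start date and end_date.
I derived wk_start and wk_end from those dates.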

+------+-----------------+------------+------------+----------+--------+
| user | subscription_id |   start    |    end     | wk_start | wk_end |
+------+-----------------+------------+------------+----------+--------+
|    1 | 1A              | 2019-06-01 | 2019-06-30 |       22 |     27 |
|    2 | 2A              | 2019-06-01 | 2019-06-21 |       22 |     25 |
|    3 | 3A              | 2019-06-01 | 2019-06-21 |       22 |     24 |
|    1 | 1B              | 2019-07-04 | 2019-08-04 |       27 |     29 |
|    2 | 2B              | 2019-07-01 | 2019-07-31 |       27 |     31 |
+------+-----------------+------------+------------+----------+--------+

Now I want to transform the data so that, for each user and subscription_id, I get the weeks in which that user was active.
The required output is as follows:

+------+-----------------+------+
| user | subscription_id | Week |
+------+-----------------+------+
|    1 | 1A              |   22 |
|    1 | 1A              |   23 |
|    1 | 1A              |   24 |
|    1 | 1A              |   25 |
|    1 | 1A              |   26 |
|    1 | 1A              |   27 |
|    2 | 2A              |   22 |
|    2 | 2A              |   23 |
|    2 | 2A              |   24 |
|    2 | 2A              |   25 |
|    3 | 3A              |   22 |
|    3 | 3A              |   23 |
|    3 | 3A              |   24 |
|    1 | 1B              |   27 |
|    1 | 1B              |   28 |
|    1 | 1B              |   29 |
|    2 | 2B              |   27 |
|    2 | 2B              |   28 |
|    2 | 2B              |   29 |
|    2 | 2B              |   30 |
|    2 | 2B              |   31 |
+------+-----------------+------+

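This essentially expands the subscription data into one row per active week for each user.
For example, user 1 with subscription_id 1A was active from week 22 through week 27.
Similarly, user 2 with subscription_id 2A was active from week 22 through week 25.
User 2 was then active again from week 27 through week 31 under subscription_id 2B. I hope this helps clarify the expected result.
I have made a DB Fiddle here (it is in MySQL, though), but it should still be of some help.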

kuhbmx9i  #1

You can create a table containing all the week numbers and do a partial Cartesian join against it, like this:

CREATE TABLE weeks (
  `week` INTEGER 
);

INSERT INTO weeks
  (`week`)
VALUES
         (1),  (2),  (3),  (4),  (5),  (6),  (7),  (8),  (9)
, (10), (11), (12), (13), (14), (15), (16), (17), (18), (19)
, (20), (21), (22), (23), (24), (25), (26), (27), (28), (29)
, (30), (31), (32), (33), (34), (35), (36), (37), (38), (39)
, (40), (41), (42), (43), (44), (45), (46), (47), (48), (49)
, (50), (51), (52), (53)
;

SELECT t1.user, t1.subscription_id, w.week
FROM table1 t1
CROSS JOIN weeks w
WHERE w.week BETWEEN t1.wk_start and t1.wk_end
ORDER BY t1.user, t1.subscription_id, w.week;

https://www.db-fiddle.com/f/mn4towxxscmlsibqjwtneh/0
The weeks table looks a bit cumbersome, but it only needs to be created once.
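As a sanity check, the join above can be run against the sample rows from the question. This is only a minimal Python sketch using an in-memory SQLite database as a stand-in for the real engine, not Hive itself; the `expand_weeks` helper and the `ROWS` list are names made up for this illustration.

```python
import sqlite3

# Sample rows from the question: (user, subscription_id, wk_start, wk_end).
ROWS = [
    (1, "1A", 22, 27),
    (2, "2A", 22, 25),
    (3, "3A", 22, 24),
    (1, "1B", 27, 29),
    (2, "2B", 27, 31),
]


def expand_weeks(rows):
    """Run the weeks-table join in SQLite (a stand-in engine, not Hive)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        'CREATE TABLE table1 ("user" INTEGER, subscription_id TEXT, '
        "wk_start INTEGER, wk_end INTEGER)"
    )
    con.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", rows)

    # The helper table holding week numbers 1..53, created once.
    con.execute("CREATE TABLE weeks (week INTEGER)")
    con.executemany("INSERT INTO weeks VALUES (?)", [(w,) for w in range(1, 54)])

    result = con.execute(
        """
        SELECT t1."user", t1.subscription_id, w.week
        FROM table1 t1
        CROSS JOIN weeks w
        WHERE w.week BETWEEN t1.wk_start AND t1.wk_end
        ORDER BY t1."user", t1.subscription_id, w.week
        """
    ).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in expand_weeks(ROWS):
        print(row)
```

With the sample data this returns 21 rows, one per user, subscription and active week, which is the same set of rows as the required output in the question (ordered by user first).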

3bygqnnd  #2

You can build the list of weeks in memory with stack() and join against it:

with weeks as(
select stack (53, 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,
                  21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,
                  41,42,43,44,45,46,47,48,49,50,51,52,53  
) as week_number
)

SELECT t1.user, t1.subscription_id, w.week_number
  FROM table1 t1
       CROSS JOIN weeks w
 WHERE w.week_number BETWEEN t1.wk_start and t1.wk_end
 ORDER BY t1.user, t1.subscription_id, w.week_number;

Alternatively, you can generate the week numbers instead of listing them, which is also fast:

with weeks as(
select i week_number 
from(select posexplode(split(space(53),' ')) as (i,x)) s  
)
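Why this produces a sequence: space(n) returns a string of n spaces, splitting that string on a single space yields n + 1 empty strings, and posexplode numbers them starting at 0. The Python sketch below only imitates that behaviour, assuming (as the trick relies on) that Hive's split keeps trailing empty strings; `posexplode_space` is a hypothetical helper name, not a Hive function.

```python
def posexplode_space(n):
    """Imitate posexplode(split(space(n), ' ')) and return the positions."""
    pieces = (" " * n).split(" ")  # n spaces split on a space -> n + 1 empty strings
    return [i for i, _ in enumerate(pieces)]


if __name__ == "__main__":
    positions = posexplode_space(53)
    print(positions[0], positions[-1], len(positions))
```

So with space(53) the generated week_number runs from 0 to 53. The extra week 0 does no harm in the join as long as wk_start is at least 1, because the BETWEEN filter never matches it.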

There is also a way to do it with array + explode:

with weeks as(
select explode(array (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,
                  21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,
                  41,42,43,44,45,46,47,48,49,50,51,52,53)) as week_number
)

5ktev3wc  #3

This should work:

select user,subscription_id, t.wk_end - pe.i as week
from (
select 1 as user, "1A" as subscription_id , 22 as wk_start, 27 as wk_end
union all  
select 2 as user, "2A" as subscription_id , 22 as wk_start, 25 as wk_end
union all  
select 3 as user, "3A" as subscription_id , 22 as wk_start, 24 as wk_end
union all  
select 1 as user, "1B" as subscription_id , 27 as wk_start, 29 as wk_end
union all  
select 2 as user, "2B" as subscription_id , 27 as wk_start, 31 as wk_end
) t
lateral view posexplode(split(space(wk_end - wk_start),' ')) pe as i,s
;

Hope this helps.
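The per-row logic of the query above can be traced in plain Python. This is just an illustrative sketch, not Hive code: `explode_row` and `explode_all` are made-up names, and the string split stands in for posexplode(split(space(wk_end - wk_start), ' ')).

```python
# Sample rows from the question: (user, subscription_id, wk_start, wk_end).
ROWS = [
    (1, "1A", 22, 27),
    (2, "2A", 22, 25),
    (3, "3A", 22, 24),
    (1, "1B", 27, 29),
    (2, "2B", 27, 31),
]


def explode_row(user, subscription_id, wk_start, wk_end):
    """One output row per position i in 0..(wk_end - wk_start), week = wk_end - i."""
    pieces = (" " * (wk_end - wk_start)).split(" ")  # wk_end - wk_start + 1 pieces
    return [(user, subscription_id, wk_end - i) for i in range(len(pieces))]


def explode_all(rows):
    """Apply the per-row expansion to every subscription, like the lateral view."""
    out = []
    for row in rows:
        out.extend(explode_row(*row))
    return out


if __name__ == "__main__":
    for row in explode_all(ROWS):
        print(row)
```

No helper weeks table is needed because each row generates exactly its own range. Note that the weeks are computed from wk_end downward; using wk_start + pe.i in the query would produce the same set of weeks counting upward.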
