行编号和子分组

5jvtdoz2  于 2021-07-29  发布在  Java
关注(0)|答案(3)|浏览(352)

我希望有人能帮忙;我把自己归类为oracle/sql的新手,但到目前为止,我已经设法得到了我需要的东西,但在如何处理我的查询方面遇到了困难。
我有一个activites的数据集,每个activity都有一个惟一的id,在其整个生命周期中是一致的;每个活动都有多个以时间表示的事件;每个事件可以有不同的状态。请参见下面的示例集。
我想要实现的是一个列表,其中包含按活动id和时间排序的数据,每个活动都有一个增量id(1、2、3、4);但是我还需要一个第二列,它从1开始,当状态与前一行不同时递增。
下面是我的数据示例:

ACTIVITY_ID | EVENT_TIMESTAMP      | EVENT_STATUS
    --------------------------------------------------------
    A001        | 01/01/2020 09:00:00  | STATUS A
    A001        | 01/01/2020 10:10:00  | STATUS B
    A001        | 01/01/2020 11:20:00  | STATUS C
    A001        | 01/01/2020 12:30:00  | STATUS C
    A002        | 01/01/2020 13:40:00  | STATUS F
    A002        | 01/01/2020 17:50:00  | STATUS F
    A002        | 01/01/2020 17:53:00  | STATUS G

利用行号和分区,我得到了一个输出,它给出了我的有序列表,如下所示:

ACTIVITY_ID | EVENT_TIMESTAMP      | EVENT_STATUS   | EVENT_NUMBER
    --------------------------------------------------------------------
    A001        | 01/01/2020 09:00:00  | STATUS A       | 1
    A001        | 01/01/2020 10:10:00  | STATUS B       | 2  
    A001        | 01/01/2020 11:20:00  | STATUS C       | 3
    A001        | 01/01/2020 12:30:00  | STATUS C       | 4
    A002        | 01/01/2020 13:40:00  | STATUS F       | 1
    A002        | 01/01/2020 17:50:00  | STATUS F       | 2
    A002        | 01/01/2020 17:53:00  | STATUS G       | 3

我正在纠结的是我要查找的子分组结果(如下),这是否应该与行号相同,但根据事件状态使用分区?我尝试过各种尝试,但是当状态更改时,分区总是重置为1,而不是从1开始,然后随着每次更改而递增?

ACTIVITY_ID | EVENT_TIMESTAMP      | EVENT_STATUS   | EVENT_NUMBER | EVENT_STATUS_GROUP
    ----------------------------------------------------------------------------------------
    A001        | 01/01/2020 09:00:00  | STATUS A       | 1            | 1
    A001        | 01/01/2020 10:10:00  | STATUS B       | 2            | 2
    A001        | 01/01/2020 11:20:00  | STATUS C       | 3            | 3
    A001        | 01/01/2020 12:30:00  | STATUS C       | 4            | 3
    A001        | 01/01/2020 12:30:00  | STATUS A       | 5            | 4

    A002        | 01/01/2020 13:40:00  | STATUS F       | 1            | 1
    A002        | 01/01/2020 17:50:00  | STATUS F       | 2            | 1
    A002        | 01/01/2020 17:53:00  | STATUS G       | 3            | 2

我希望这是足够清楚,如果没有,请不要问任何问题。

xxls0lw8

xxls0lw81#

你可以用 lag() 和一个累计总和来计算变化的次数:

SELECT t.*, 
       ROW_NUMBER() OVER (PARTITION BY ACTIVITY_ID ORDER BY EVENT_TIMESTAMP) AS EVENT_NUMBER,
       SUM(CASE WHEN PREV_EVENT_STATUS = EVENT_STATUS THEN 0 ELSE 1 END) OVER
           (PARTITION BY ACTIVITY_ID ORDER BY EVENT_TIMESTAMP) AS EVENT_STATUS_GROUP
FROM (SELECT t.*,
             LAG(EVENT_STATUS) OVER (PARTITION BY ACTIVITY_ID ORDER BY EVENT_TIMESTAMP) as PREV_EVENT_STATUS
      FROM t
     ) t
ORDER BY ACTIVITY_ID, EVENT_NUMBER ;

这是一把小提琴。

zlhcx6iw

zlhcx6iw2#

你可以用 DENSE_RANK() 分析函数:

SELECT t.*, 
       ROW_NUMBER() OVER (PARTITION BY ACTIVITY_ID ORDER BY EVENT_STATUS ) 
                                                                         AS EVENT_NUMBER,
       DENSE_RANK() OVER (PARTITION BY ACTIVITY_ID ORDER BY EVENT_STATUS ) 
                                                                   AS EVENT_STATUS_GROUP
  FROM tab t
 ORDER BY ACTIVITY_ID, EVENT_NUMBER

演示

kcwpcxri

kcwpcxri3#

您可以使用moder match\u来识别:

--main query:
select 
    ACTIVITY_ID, EVENT_TIMESTAMP, EVENT_STATUS
   ,EVENT_NUMBER
   ,EVENT_STATUS_GROUP
   ,CLS
from (select t.*
            ,row_number()over(partition by ACTIVITY_ID order by EVENT_TIMESTAMP) EVENT_NUMBER 
      from your_tab t
      )
match_recognize(
    partition by ACTIVITY_ID
    order by EVENT_TIMESTAMP
    measures
         MATCH_NUMBER() AS EVENT_STATUS_GROUP,
         case when classifier()='B' then 'DUP' end as cls
    all rows per match
    pattern(A B*)
    define
         b AS b.EVENT_STATUS =PREV(b.EVENT_STATUS)
);

结果:

ACTIVITY_ID   EVENT_TIMESTAMP     EVENT_STATUS  EVENT_NUMBER EVENT_STATUS_GROUP CLS
------------- ------------------- ------------- ------------ ------------------ ---
A001          2020-01-01 09:00:00 STATUS A                 1                  1
A001          2020-01-01 10:10:00 STATUS B                 2                  2
A001          2020-01-01 11:20:00 STATUS C                 3                  3
A001          2020-01-01 12:30:00 STATUS C                 4                  3 DUP
A001          2020-01-01 13:10:00 STATUS D                 5                  4
A002          2020-01-01 13:40:00 STATUS F                 1                  1
A002          2020-01-01 17:50:00 STATUS F                 2                  1 DUP
A002          2020-01-01 17:53:00 STATUS G                 3                  2

8 rows selected.

完整示例(我在您的示例中添加了一行):

-- your sample data:
with your_tab(ACTIVITY_ID, EVENT_TIMESTAMP, EVENT_STATUS) as (
   select 'A001', to_date('01/01/2020 09:00:00','dd/mm/yyyy hh24:mi:ss'),'STATUS A' from dual union all
   select 'A001', to_date('01/01/2020 10:10:00','dd/mm/yyyy hh24:mi:ss'),'STATUS B' from dual union all
   select 'A001', to_date('01/01/2020 11:20:00','dd/mm/yyyy hh24:mi:ss'),'STATUS C' from dual union all
   select 'A001', to_date('01/01/2020 12:30:00','dd/mm/yyyy hh24:mi:ss'),'STATUS C' from dual union all
   select 'A001', to_date('01/01/2020 13:10:00','dd/mm/yyyy hh24:mi:ss'),'STATUS D' from dual union all
   select 'A002', to_date('01/01/2020 13:40:00','dd/mm/yyyy hh24:mi:ss'),'STATUS F' from dual union all
   select 'A002', to_date('01/01/2020 17:50:00','dd/mm/yyyy hh24:mi:ss'),'STATUS F' from dual union all
   select 'A002', to_date('01/01/2020 17:53:00','dd/mm/yyyy hh24:mi:ss'),'STATUS G' from dual
)
--main query:
select 
    ACTIVITY_ID, EVENT_TIMESTAMP, EVENT_STATUS
   ,EVENT_NUMBER
   ,EVENT_STATUS_GROUP
   ,CLS
from (select t.*
            ,row_number()over(partition by ACTIVITY_ID order by EVENT_TIMESTAMP) EVENT_NUMBER 
      from your_tab t
      )
match_recognize(
    partition by ACTIVITY_ID
    order by EVENT_TIMESTAMP
    measures
         MATCH_NUMBER() AS EVENT_STATUS_GROUP,
         case when classifier()='B' then 'DUP' end as cls
    all rows per match
    pattern(A B*)
    define
         b AS b.EVENT_STATUS =PREV(b.EVENT_STATUS)
);

相关问题