oracle 员工人数超过100人的连续行

xmakbtuz  于 2023-05-06  发布在  Oracle
关注(0)|答案(3)|浏览(127)

写一个oracle sql查询来获得三个或更多个雇员数量〉100的连续id。例如,在下面的数据中,我们需要id为-5,6,7,8的行。ID为5、6、7、8的行是连续的,并且具有〉100个total_employees。但是,即使total_employees〉100,也不会选择id 10或12,因为它们不是连续的。
输入数据

create table employee(id integer, enroll_date date, total_employees integer);

insert into employee values (1,to_date('01-04-2023','DD-MM-YYYY'),10);
insert into employee values (2,to_date('02-04-2023','DD-MM-YYYY'),109);
insert into employee values (3,to_date('03-04-2023','DD-MM-YYYY'),150);
insert into employee values (4,to_date('04-04-2023','DD-MM-YYYY'),99);
insert into employee values (5,to_date('05-04-2023','DD-MM-YYYY'),145);
insert into employee values (6,to_date('06-04-2023','DD-MM-YYYY'),1455);
insert into employee values (7,to_date('07-04-2023','DD-MM-YYYY'),199);
insert into employee values (8,to_date('08-04-2023','DD-MM-YYYY'),188);
insert into employee values (10,to_date('10-04-2023','DD-MM-YYYY'),188);
insert into employee values (12,to_date('12-04-2023','DD-MM-YYYY'),10);
insert into employee values (13,to_date('13-04-2023','DD-MM-YYYY'),200);

我试图在下面尝试获得连续的组和员工计数标志,但没有弄清楚以获得预期的结果。

select id, enroll_date,total_employees,
        case when total_employees>100 then 1 else 0 end emp_flag,
        SUM(case when total_employees>100 then 1 else 0 end) OVER (ORDER BY id) AS grp,
        id - row_number() over(order by id) as diff, -- group consecutive id's
        ROW_NUMBER() OVER (PARTITION BY CASE WHEN total_employees > 100
                           THEN 1 ELSE 0 END ORDER BY enroll_date) as sal_rn,
        id - ROW_NUMBER() OVER (PARTITION BY CASE WHEN total_employees > 100
                           THEN 1 ELSE 0 END ORDER BY enroll_date) AS sal_grp
from employee
;
dzjeubhm

dzjeubhm1#

这里有一个选择读取代码中的注解。

SQL> with temp as
  2    -- eliminate rows by number of total employees and create groups (islands)
  3    (select b.*,
  4       row_number() over (order by id) - id grp
  5     from employee b
  6     where total_employees > 100
  7    ),
  8  temp2 as
  9    -- count number of members per group
 10    (select a.*,
 11       count(*) over (partition by grp) cnt
 12     from temp a
 13    )
 14  -- finally, extract rows that have 3 or more consecutive rows in the same group
 15  select id, enroll_date, total_employees
 16  from temp2
 17  where cnt >= 3
 18  order by id;

        ID ENROLL_DAT TOTAL_EMPLOYEES
---------- ---------- ---------------
         5 05-04-2023             145
         6 06-04-2023            1455
         7 07-04-2023             199
         8 08-04-2023             188

SQL>
tuwxkamq

tuwxkamq2#

在Oracle 12中,可以使用MATCH_RECOGNIZE执行逐行模式匹配:

SELECT *
FROM   employee
MATCH_RECOGNIZE(
  ORDER BY enroll_date
  ALL ROWS PER MATCH
  PATTERN ( consecutive{2,} last_row )
  DEFINE
    consecutive AS  id + 1 = NEXT(id)
                AND total_employees > 100,
    last_row    AS  total_employees > 100
)

或者,首先对total_employees进行滤波(而不是在模式匹配期间):

SELECT *
FROM   (SELECT * FROM employee WHERE total_employees > 100)
MATCH_RECOGNIZE(
  ORDER BY enroll_date
  ALL ROWS PER MATCH
  PATTERN ( consecutive{2,} last_row )
  DEFINE consecutive AS  id + 1 = NEXT(id)
)

对于样本数据,两者都输出:
| 注册日期|ID|员工总数|
| --------------|--------------|--------------|
| 2019 -04- 15 00:00:00|五|一百四十五|
| 2023年4月6日00时00分|六|一四五五|
| 2019 -04-07 00:00:00|七|一百九十九|
| 2019 -04-08 00:00:00|八|一百八十八|
fiddle

ss2ws0br

ss2ws0br3#

可以使用窗口函数来解决这个间隙和孤岛问题。下面是一个使用lag()和窗口count()的方法:
逻辑是将每个id与“前一个”id的值进行比较,并计算它改变了多少次。

with cte as (
  select id, enroll_date, TOTAL_EMPLOYEES, lag(id) over (order by id) as prev_sal
  from employee
  where TOTAL_EMPLOYEES > 100
),
cte2 as (
  select id, enroll_date, TOTAL_EMPLOYEES, id - sum(case when id - PREV_SAL = 1 then 1 else 0 end) over(order by id) as grp
  from cte
)
select id, enroll_date, TOTAL_EMPLOYEES
from cte2 t
where grp in (
  select grp
  from cte2
  group by grp 
  having count(1) >= 3
)
)

结果:

ID  ENROLL_DATE TOTAL_EMPLOYEES
5   05-APR-23   145
6   06-APR-23   1455
7   07-APR-23   199
8   08-APR-23   188

Demo here

相关问题