oracle按日期范围查询,但如果需要,可以获取此范围之外的后续记录

zvms9eto  于 2021-07-26  发布在  Java
关注(0)|答案(5)|浏览(254)

我有一个db表的员工时钟数据,我想每天查询。但是,当员工的轮班时间跨越午夜时,查询将错过确定该员工的总工时所需的事务。

ID NUMBER (PK)
EmpId VARCHAR
TransActType VARCHAR
Start TIMESTAMP
Account VARCHAR

1   EmpA ClockIn  7/7/20 8am  Account1
2   EmpB ClockIn  7/7/20 9am  Account7
3   EmpC ClockIn  7/7/20 9am  Account1
4   EmpA Switch   7/7/20 10am Account3
5   EmpA Switch   7/7/20 11am Account6
6   EmpC Switch   7/7/20 1pm  Account4
7   EmpC ClockOut 7/7/20 3pm 
8   EmpD ClockIn  7/7/20 5pm  Account5
9   EmpD Switch   7/7/20 6pm  Account6
10  EmpB ClockOut 7/7/20 6pm
11  EmpD Switch   7/7/20 7pm  Account4
12  EmpA ClockOut 7/7/20 8pm
13  EmpD Switch   7/8/20 1am  Account3
14  EmpD ClockOut 7/8/20 2am
15  EmpA ClockIn  7/8/20 8am  Account1
...

我对7/7/20的人工费的要求是

SELECT * FROM labor li where li.start between 7/7/20 12 am and 7/7/20 11:59 pm order by empId, start

只会拿到1-12的记录,但也应该拿到13和14的记录。
我的应用程序代码通过循环查询结果和区分连续员工事务的开始时间来计算每个员工在一天中收取的帐户持续时间。
没有记录13和14,我无法确定员工d的account4和account3持续时间。
也许我的数据库设计有缺陷,因为我只存储事务的开始时间,然后使用应用程序代码逻辑计算持续时间。我决定这样设计,这样如果记录被移动/插入/删除,那么如果结束时间也被持久化,我就不会有太多的机会出现个别持久化的员工人工项重叠。上面的人工数据示例显示,员工可以全天更改帐户,并且每天有多个人工事务。
我希望有一个查询可以在每个员工的基础上提前查看,如果该员工的最后一个事务不是时间戳范围内的“打卡”,则继续抓取记录,直到找到一个为止。
同样地,我不能让那个雇员的第一笔交易是前一天的打卡。
如果这些查询几乎是不可能的,数据库的设计应该改变,那么我想知道。

lztngnrs

lztngnrs1#

I am hoping for a query that could look ahead in time on a per employee basis and if the last transaction of that employee is not a "ClockOut" for the timestamp range, keep grabbing records until one is found. ```
with
dates(d1, d2) as (select date '2020-07-07', date '2020-07-08' from dual),
main as (select id, empid, transacttype, start_, account,
max(transacttype) keep (dense_rank last order by start_)
over (partition by empid) mtt
from li join dates on d1 <= start_ and start_ < d2),
miss as (select empid, max(id) mnid from main where mtt = 'ClockIn' group by empid),
cout as (select empid, min(id) mxid
from li join dates on start_ >= d2 join miss using (empid)
where transacttype = 'ClockOut'
group by empid)
select id, empid, transacttype, start_, account from main union all
select li.id, li.empid, li.transacttype, li.start_, li.account from li
join miss on li.empid = miss.empid and li.id > mnid
join cout on li.empid = cout.empid and li.id < mxid
order by empid, start_

D小提琴
这个查询就像你说的那样。我查找该时段的所有数据,还查找transactictype的最后一个值。如果不是的话 `ClockOut` 接下来我要寻找 `min(id)` 第二天失踪的员工。最后结合主要数据和缺失。
小心点,因为在你的例子中有时是这样的 `Clockout` ,有时 `ClockOut` . 你可以用 `upper()` 或者在你的真实数据中发生了什么。
如果你不想依赖 `id` ,但可以使用上述查询的日期版本,而不是日期:

with
dates(d1, d2) as (select date '2020-07-07', date '2020-07-08' from dual),
main as (select id, empid, transacttype, start_, account,
max(transacttype) keep (dense_rank last order by start_)
over (partition by empid) mtt
from li join dates on d1 <= start_ and start_ < d2),
miss as (select empid, max(start_) mnst from main where mtt = 'ClockIn' group by empid),
cout as (select empid, min(start_) mxst
from li join dates on start_ >= d2 join miss using (empid)
where transacttype = 'ClockOut'
group by empid)
select id, empid, transacttype, start_, account from main union all
select li.id, li.empid, li.transacttype, li.start_, li.account from li
join dates on start_ >= d2
join cout on li.empid = cout.empid and li.start_ <= mxst
order by empid, start_

D小提琴
mkh04yzy

mkh04yzy2#

如果您使用的是最新版本的oracle,则可以使用 match_recognize() 要跟踪时钟输入/切换/时钟输出,然后根据时钟输入时间进行过滤:

select id, empid, transacttype, start_time, account
from labor
match_recognize (
  partition by empid
  order by start_time
  measures
    first(start_time) as grp_clockin
  all rows per match
  after match skip past last row
  pattern (clockin switch* clockout*)
  define
    clockin as clockin.transacttype = 'ClockIn',
    switch as switch.transacttype = 'Switch',
    clockout as clockout.transacttype = 'ClockOut'
)
where grp_clockin >= date '2020-07-07'
and grp_clockin < date '2020-07-08'
order by empid, grp_clockin, start_time;
ID | EMPID | TRANSACTTYPE | START_TIME          | ACCOUNT 
-: | :---- | :----------- | :------------------ | :-------
 1 | EmpA  | ClockIn      | 2020-07-07 08:00:00 | Account1
 4 | EmpA  | Switch       | 2020-07-07 10:00:00 | Account3
 5 | EmpA  | Switch       | 2020-07-07 11:00:00 | Account6
12 | EmpA  | ClockOut     | 2020-07-07 20:00:00 | null    
 2 | EmpB  | ClockIn      | 2020-07-07 09:00:00 | Account7
10 | EmpB  | ClockOut     | 2020-07-07 18:00:00 | null    
 3 | EmpC  | ClockIn      | 2020-07-07 09:00:00 | Account1
 6 | EmpC  | Switch       | 2020-07-07 13:00:00 | Account4
 7 | EmpC  | ClockOut     | 2020-07-07 15:00:00 | null    
 8 | EmpD  | ClockIn      | 2020-07-07 17:00:00 | Account5
 9 | EmpD  | Switch       | 2020-07-07 18:00:00 | Account6
11 | EmpD  | Switch       | 2020-07-07 19:00:00 | Account4
13 | EmpD  | Switch       | 2020-07-08 01:00:00 | Account3
14 | EmpD  | ClockOut     | 2020-07-08 02:00:00 | null

由于过滤器发生得晚,您可以使用内联视图将其缩小,至少缩小到最小日期/时间:

select id, empid, transacttype, start_time, account
from (
  select *
  from labor
  where start_time >= date '2020-07-07'
)
match_recognize (
  partition by empid
  order by start_time
  measures
    first(start_time) as grp_clockin
  all rows per match
  after match skip past last row
  pattern (clockin switch* clockout*)
  define
    clockin as clockin.transacttype = 'ClockIn',
    switch as switch.transacttype = 'Switch',
    clockout as clockout.transacttype = 'ClockOut'
)
where grp_clockin >= date '2020-07-07'
and grp_clockin < date '2020-07-08'
order by empid, grp_clockin, start_time;

db<>小提琴
如果你能找到一个合理的范围,你也可以选择一个最大范围,比如说排除一天之后的任何东西:

select id, empid, transacttype, start_time, account
from (
  select *
  from labor
  where start_time >= date '2020-07-07'
  and start_time < date '2020-07-09'
)
...

我的应用程序代码通过循环查询结果和区分连续员工事务的开始时间来计算每个员工在一天中收取的帐户持续时间
如果我知道你在做什么,你也可以在查询中做所有这些:

select empid, grp_start_time, grp_end_time, grp_account,
  (grp_end_time - grp_start_time) * interval '1' day as elapsed
from labor
match_recognize (
  partition by empid
  order by start_time
  measures
    first(start_time) as grp_start_time,
    first(account) as grp_account,
    final last(start_time) as grp_end_time
  one row per match
  after match skip to last grp_end
  pattern (grp_start grp_end)
  define
    grp_start as grp_start.transacttype in ('ClockIn', 'Switch'),
    grp_end as grp_end.transacttype in ('Switch', 'ClockOut')
)
where grp_start_time >= date '2020-07-07'
and grp_start_time < date '2020-07-08'
order by empid, grp_start_time;
EMPID | GRP_START_TIME      | GRP_END_TIME        | GRP_ACCOUNT | ELAPSED                      
:---- | :------------------ | :------------------ | :---------- | :----------------------------
EmpA  | 2020-07-07 08:00:00 | 2020-07-07 10:00:00 | Account1    | +000000000 02:00:00.000000000
EmpA  | 2020-07-07 10:00:00 | 2020-07-07 11:00:00 | Account3    | +000000000 01:00:00.000000000
EmpA  | 2020-07-07 11:00:00 | 2020-07-07 20:00:00 | Account6    | +000000000 09:00:00.000000000
EmpB  | 2020-07-07 09:00:00 | 2020-07-07 18:00:00 | Account7    | +000000000 09:00:00.000000000
EmpC  | 2020-07-07 09:00:00 | 2020-07-07 13:00:00 | Account1    | +000000000 04:00:00.000000000
EmpC  | 2020-07-07 13:00:00 | 2020-07-07 15:00:00 | Account4    | +000000000 02:00:00.000000000
EmpD  | 2020-07-07 17:00:00 | 2020-07-07 18:00:00 | Account5    | +000000000 01:00:00.000000000
EmpD  | 2020-07-07 18:00:00 | 2020-07-07 19:00:00 | Account6    | +000000000 01:00:00.000000000
EmpD  | 2020-07-07 19:00:00 | 2020-07-08 01:00:00 | Account4    | +000000000 06:00:00.000000000

db<>小提琴

2g32fytz

2g32fytz3#

听起来你想抵消这些天,所以从凌晨2点到8点之间的某个时间点开始,不知道具体在哪里,所以我们假设是凌晨5点。
只需减去5小时的日期比较。或者:

where li.start - interval '5' hour >= date '2020-07-07' and 
      li.start - interval '5' hour < date '2020-07-08'

或者,以更方便索引的方式:

where li.start >= date '2020-07-07' + interval '5' hour 
      li.start < date '2020-07-08' + interval '5' hour
bpzcxfmw

bpzcxfmw4#

我想这正是你想要的:sqlfiddle
我不确定这是最好的解决方案,但也许你可以用它来构建你的实际解决方案。可能思考stibbons的答案会比这个更好,如果适应从评论中澄清的要求(“所以不,如果当天是7/8/20,我不想接13和14”)。

WITH
erange AS
(
 SELECT lii.EmpId, lii."Start", MIN(lio."Start") as "End"
 FROM labor lii INNER JoIN labor lio
  ON lii.EmpId = lio.EmpId
 WHERE lii.TransActType='ClockIn'
  AND lio.TransActType='ClockOut'
  AND lii."Start" < lio."Start"
 GROUP BY lii.EmpId, lii."Start"
)
SELECT li.*
FROM labor li INNER JOIN erange er
 ON li.EmpId = er.EmpId AND li."Start" between er."Start" and er."End"
WHERE er."Start" between date '2020-07-07' and date '2020-07-08'
ORDER BY li.EmpId, li."Start"
vuktfyat

vuktfyat5#

您可以使用简单的分析函数“first\u value”来表示:

select li.*
  , first_value(decode(TransActType,'ClockOut',"START") ignore nulls)
       over(partition by EmpId order by "START" rows between current row and unbounded following) ClockOut
from labor li;

如您所见,它添加了新的列clockout,其中包含下一次clockout的时间,因此您可以轻松地将所需的数据添加到查询中。
完整测试用例:

-- sample data:
with labor(ID,EmpId,TransActType,"START",Account) as (
   select 1   ,'EmpA', 'ClockIn ', to_timestamp('7/7/20 8am ','mm/dd/yy hhAM'),'Account1' from dual union all 
   select 2   ,'EmpB', 'ClockIn ', to_timestamp('7/7/20 9am ','mm/dd/yy hhAM'),'Account7' from dual union all 
   select 3   ,'EmpC', 'ClockIn ', to_timestamp('7/7/20 9am ','mm/dd/yy hhAM'),'Account1' from dual union all 
   select 4   ,'EmpA', 'Switch  ', to_timestamp('7/7/20 10am','mm/dd/yy hhAM'),'Account3' from dual union all 
   select 5   ,'EmpA', 'Switch  ', to_timestamp('7/7/20 11am','mm/dd/yy hhAM'),'Account6' from dual union all 
   select 6   ,'EmpC', 'Switch  ', to_timestamp('7/7/20 1pm ','mm/dd/yy hhAM'),'Account4' from dual union all 
   select 7   ,'EmpC', 'ClockOut', to_timestamp('7/7/20 3pm ','mm/dd/yy hhAM'),'        ' from dual union all 
   select 8   ,'EmpD', 'ClockIn ', to_timestamp('7/7/20 5pm ','mm/dd/yy hhAM'),'Account5' from dual union all 
   select 9   ,'EmpD', 'Switch  ', to_timestamp('7/7/20 6pm ','mm/dd/yy hhAM'),'Account6' from dual union all 
   select 10  ,'EmpB', 'ClockOut', to_timestamp('7/7/20 6pm ','mm/dd/yy hhAM'),'        ' from dual union all 
   select 11  ,'EmpD', 'Switch  ', to_timestamp('7/7/20 7pm ','mm/dd/yy hhAM'),'Account4' from dual union all 
   select 12  ,'EmpA', 'ClockOut', to_timestamp('7/7/20 8pm ','mm/dd/yy hhAM'),'        ' from dual union all 
   select 13  ,'EmpD', 'Switch  ', to_timestamp('7/8/20 1am ','mm/dd/yy hhAM'),'Account3' from dual union all 
   select 14  ,'EmpD', 'ClockOut', to_timestamp('7/8/20 2am ','mm/dd/yy hhAM'),'        ' from dual union all 
   select 15  ,'EmpA', 'ClockIn ', to_timestamp('7/8/20 8am ','mm/dd/yy hhAM'),'Account1' from dual 
)
--main query:
select li.*
  , first_value(decode(TransActType,'ClockOut',"START") ignore nulls)
       over(partition by EmpId order by "START" rows between current row and unbounded following) ClockOut
from labor li;

结果

|   ID | EMPI | TRANSACT | START               | ACCOUNT  | CLOCKOUT
| ---- | ---- | -------- | ------------------- | -------- | -------------------
|    1 | EmpA | ClockIn  | 2020-07-07 08:00:00 | Account1 | 2020-07-07 20:00:00
|    4 | EmpA | Switch   | 2020-07-07 10:00:00 | Account3 | 2020-07-07 20:00:00
|    5 | EmpA | Switch   | 2020-07-07 11:00:00 | Account6 | 2020-07-07 20:00:00
|   12 | EmpA | ClockOut | 2020-07-07 20:00:00 |          | 2020-07-07 20:00:00
|   15 | EmpA | ClockIn  | 2020-07-08 08:00:00 | Account1 | 
|    2 | EmpB | ClockIn  | 2020-07-07 09:00:00 | Account7 | 2020-07-07 18:00:00
|   10 | EmpB | ClockOut | 2020-07-07 18:00:00 |          | 2020-07-07 18:00:00
|    3 | EmpC | ClockIn  | 2020-07-07 09:00:00 | Account1 | 2020-07-07 15:00:00
|    6 | EmpC | Switch   | 2020-07-07 13:00:00 | Account4 | 2020-07-07 15:00:00
|    7 | EmpC | ClockOut | 2020-07-07 15:00:00 |          | 2020-07-07 15:00:00
|    8 | EmpD | ClockIn  | 2020-07-07 17:00:00 | Account5 | 2020-07-08 02:00:00
|    9 | EmpD | Switch   | 2020-07-07 18:00:00 | Account6 | 2020-07-08 02:00:00
|   11 | EmpD | Switch   | 2020-07-07 19:00:00 | Account4 | 2020-07-08 02:00:00
|   13 | EmpD | Switch   | 2020-07-08 01:00:00 | Account3 | 2020-07-08 02:00:00
|   14 | EmpD | ClockOut | 2020-07-08 02:00:00 |          | 2020-07-08 02:00:00

相关问题