sql缺口/孤岛问题-确定某人是否工作了x年而没有休息y天

acruukt9  于 2021-07-24  发布在  Java
关注(0)|答案(3)|浏览(535)

为日本一家公司解决问题。政府有一些规定,比如。。。如果您持有工作签证:
在一家公司工作三年以上不能不请30天假
你不能在同一家人事公司工作超过5年而不请6个月的假
所以我们想知道在接下来的30/60/90天里是否有人会违反这两条规则。
样本数据(合同清单):

if object_id('tempdb..#sampleDates') is not null drop table #sampleDates
create table #sampleDates (UserId int, CompanyID int, WorkPeriodStart datetime, WorkPeriodEnd datetime)
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27809, 972, '2019-10-10', '2020-10-10')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27853, 484, '2019-10-10', '2020-10-10')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27856, 172, '2019-10-10', '2020-10-10')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27857, 1234, '2015-01-01', '2015-12-31')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27857, 1234, '2016-01-01', '2017-02-28')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27857, 1234, '2017-01-01', '2017-12-31')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27857, 1234, '2018-01-01', '2018-12-31')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27857, 1234, '2019-01-01', '2020-01-31')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27857, 1234, '2020-01-01', '2020-12-31')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27897, 179, '2019-10-10', '2020-10-10')

我的第一个问题可能是日期重叠。我已经接近一个解决方案,但直到我知道如何解决x年/y天的工作休假问题,我不知道我的cte或temp表的输出应该是什么样的。
我不希望有人帮我做这项工作,但我想找一篇文章,可以告诉我:
我如何确定某人是否在这段时间内休息过,休息了多长时间(日期范围之间的间隔)?
我怎么知道他们在接下来的30/60/90天里是否已经工作了3/5年而没有30/180天的休息时间?
这看起来很简单,直到我开始编写程序。
谢谢你的帮助。
编辑:
值得一提的是,这是我在消除重叠日期方面的第二次工作尝试(第一个版本使用了密集排序方法,它一直有效,直到我搞砸了一些事情,使用了一些简单的方法):

;with CJ as (
    select UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd from #sampleDates c 
)
select 
       c.CompanyID,
       c.WorkPeriodStart,
       min(t1.WorkPeriodEnd) as EndDate
from CJ c
inner join CJ t1 on c.WorkPeriodStart <= t1.WorkPeriodEnd and c.UserId = t1.UserId and c.CompanyID = t1.CompanyID
    and not exists(select * from CJ t2 where t1.UserId = t2.UserId and t1.CompanyID = t2.CompanyID and t1.WorkPeriodEnd >= t2.WorkPeriodStart AND t1.WorkPeriodEnd < t2.WorkPeriodEnd) 
where not exists(select * from CJ c2 where c.UserId = c2.UserId and c.CompanyID = c2.CompanyID and c.WorkPeriodStart > c2.WorkPeriodStart AND c.WorkPeriodStart <= c2.WorkPeriodEnd) 

group by c.UserId, c.CompanyID, c.WorkPeriodStart 
order by c.UserId, c.WorkPeriodStart
oipij1gg

oipij1gg1#

免责声明:这是一个不完整的答案。
我可以稍后继续,但这展示了如何计算岛屿。那么识别罪犯就不那么复杂了。
参见增强示例。我添加了用户 27897 它有三个岛:0、1和2。见下表:

create table t (UserId int, CompanyID int, WorkPeriodStart date, WorkPeriodEnd date);

insert t (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values
  (27809, 972, '2019-10-10', '2020-10-10'),
  (27853, 484, '2019-10-10', '2020-10-10'),
  (27856, 172, '2019-10-10', '2020-10-10'),
  (27857, 1234, '2015-01-01', '2015-12-31'),
  (27857, 1234, '2016-01-01', '2017-02-28'),
  (27857, 1234, '2017-01-01', '2017-12-31'),
  (27857, 1234, '2018-01-01', '2018-12-31'),
  (27857, 1234, '2019-01-01', '2020-01-31'),
  (27857, 1234, '2020-01-01', '2020-12-31'),
  (27897, 179, '2015-05-28', '2015-09-30'),
  (27897, 179, '2017-03-11', '2017-04-30'),
  (27897, 188, '2017-02-20', '2017-07-07'),
  (27897, 179, '2019-10-10', '2020-10-10');

使用此数据,为每行计算孤岛的查询可以如下所示:

select *,
  sum(hop) over(partition by UserId order by WorkPeriodStart) as island
from (
  select *,
    case when WorkPeriodStart > dateadd(day, 1, max(WorkPeriodEnd) 
      over(partition by UserId 
           order by WorkPeriodStart 
           rows between unbounded preceding and 1 preceding))
         then 1 else 0 end as hop
  from t
) x
order by UserId, WorkPeriodStart

结果:

UserId  CompanyID  WorkPeriodStart  WorkPeriodEnd  hop  island
------  ---------  ---------------  -------------  ---  ------
 27809        972  2019-10-10       2020-10-10       0       0
 27853        484  2019-10-10       2020-10-10       0       0
 27856        172  2019-10-10       2020-10-10       0       0
 27857       1234  2015-01-01       2015-12-31       0       0
 27857       1234  2016-01-01       2017-02-28       0       0
 27857       1234  2017-01-01       2017-12-31       0       0
 27857       1234  2018-01-01       2018-12-31       0       0
 27857       1234  2019-01-01       2020-01-31       0       0
 27857       1234  2020-01-01       2020-12-31       0       0
 27897        179  2015-05-28       2015-09-30       0       0
 27897        188  2017-02-20       2017-07-07       1       1
 27897        179  2017-03-11       2017-04-30       0       1
 27897        179  2019-10-10       2020-10-10       1       2

现在,我们可以通过执行以下操作来扩充此查询,以获取每个岛的“工作日”和每个岛之前的“休假日”:

select *,
  datediff(day, s, e) + 1 as worked,
  datediff(day, lag(e) over(partition by UserId order by island), s) as prev_days_off
from (
  select UserId, island, min(WorkPeriodStart) as s, max(WorkPeriodEnd) as e
  from (
    select *,
      sum(hop) over(partition by UserId order by WorkPeriodStart) as island
    from (
      select *,
        case when WorkPeriodStart > dateadd(day, 1, max(WorkPeriodEnd) 
          over(partition by UserId 
               order by WorkPeriodStart 
               rows between unbounded preceding and 1 preceding))
             then 1 else 0 end as hop
      from t
    ) x
  ) y
  group by UserId, island
) x
order by UserId, island

结果:

UserId  island  s           e           worked  prev_days_off
------  ------  ----------  ----------  ------  -------------
 27809       0  2019-10-10  2020-10-10     367         <null>
 27853       0  2019-10-10  2020-10-10     367         <null>
 27856       0  2019-10-10  2020-10-10     367         <null>
 27857       0  2015-01-01  2020-12-31    2192         <null>
 27897       0  2015-05-28  2015-09-30     126         <null>
 27897       1  2017-02-20  2017-07-07     138            509
 27897       2  2019-10-10  2020-10-10     367            825

这个结果非常接近你所需要的。这些数据实际上对根据条件筛选行很有用。

qvtsj1bj

qvtsj1bj2#

此脚本合并任何重叠的工作期间,然后计算前3年和5年期间的总工作天数。然后获取该值并确定该值是否超过该期间内允许的最大工作日 UserId 以及 CompanyId 三年的期限 UserId 5年期限(这是对你问题中规则的正确解释吗?)
从这一点,然后简单地增加 30 , 60 以及 90 总天数,看看更大的价值是否会超过各自的限制。考虑到不同的分组规则,这将作为2个查询(没有重复的 UserId 但结果仍然是反对任何违规行为的标志 UserId .
在下面的示例中,您可以看到 UserId = 27857 目前只违反了5年的规定,然后又违反了3年的规定,他们应该再留任一年吗 60 天。此外, UserId = 27858 目前还可以,但将违反5年规则 60 天。
我对你如何定义一年以及你的工作是否 WorkPeriodEnd 值是否包含,因此请检查所需的逻辑是否正确应用。

脚本

if object_id('tempdb..#sampleDates') is not null drop table #sampleDates
create table #sampleDates (UserId int, CompanyId int, WorkPeriodStart datetime, WorkPeriodEnd datetime)
insert #sampleDates values
 (27809, 972, '2019-10-10', '2020-10-10')
,(27853, 484, '2019-10-10', '2020-10-10')
,(27856, 172, '2019-10-10', '2020-10-10')

,(27857, 1234, '2015-01-01', '2015-12-31')
,(27857, 1234, '2016-01-01', '2017-02-28')
,(27857, 1234, '2017-01-01', '2017-12-31')
,(27857, 1234, '2018-01-01', '2018-12-31')
,(27857, 1234, '2019-01-01', '2020-01-31')
,(27857, 1234, '2020-01-01', '2020-05-31')

,(27858, 1234, '2015-01-01', '2015-12-31')
,(27858, 1234, '2016-01-01', '2017-02-28')
,(27858, 1234, '2017-01-01', '2017-12-31')
,(27858, 1234, '2018-01-01', '2018-12-31')
,(27858, 1234, '2019-09-01', '2020-01-31')
,(27858, 1234, '2020-01-01', '2020-08-31')

,(27859, 12345, '2015-01-01', '2015-12-31')
,(27859, 12346, '2016-01-01', '2017-02-28')
,(27859, 12347, '2017-01-01', '2017-12-31')
,(27859, 12348, '2018-01-01', '2018-12-31')
,(27859, 12349, '2019-01-01', '2020-01-31')
,(27859, 12340, '2020-01-01', '2020-12-31')

,(27897, 179, '2019-10-10', '2020-10-10')
;

declare @3YearsAgo date = dateadd(year,-3,getdate());
declare @3YearWorkingDays int = (365*3)-30;

declare @5YearsAgo date = dateadd(year,-5,getdate());
declare @5YearWorkingDays int = (365*5)-(365/2);

with p as
(
    select UserId
          ,CompanyId
          ,min(WorkPeriodStart) as WorkPeriodStart
          ,max(WorkPeriodEnd) as WorkPeriodEnd
    from(select l.*,
                sum(case when dateadd(day,1,l.PrevEnd) < l.WorkPeriodStart then 1 else 0 end) over (partition by l.UserId, l.CompanyId order by l.WorkPeriodStart rows unbounded preceding) as grp
        from(select d.*,
                    lag(d.WorkPeriodEnd) over (partition by d.UserId, d.CompanyId order by d.WorkPeriodEnd) as PrevEnd
            from #sampleDates as d
            ) as l
        ) as g
    group by grp
            ,UserId
            ,CompanyId
)
,d as
(
    select UserId
          ,CompanyId
          ,sum(case when @3YearsAgo < WorkPeriodEnd
                    then datediff(day
                                 ,case when @3YearsAgo between WorkPeriodStart and WorkPeriodEnd then @3YearsAgo else WorkPeriodStart end
                                 ,WorkPeriodEnd
                                 )
                    else 0
                    end
              ) as WorkingDays3YearsToToday

          ,sum(case when @5YearsAgo < WorkPeriodEnd
                    then datediff(day
                                 ,case when @5YearsAgo between WorkPeriodStart and WorkPeriodEnd then @5YearsAgo else WorkPeriodStart end
                                 ,WorkPeriodEnd
                                 )
                    else 0
                    end
               ) as WorkingDays5YearsToToday
    from p
    group by UserId
            ,CompanyId
)
select UserId
     ,CompanyId
     ,@3YearWorkingDays as Limit3Year
     ,@5YearWorkingDays as Limit5Year
     ,WorkingDays3YearsToToday
     ,WorkingDays5YearsToToday

     ,case when WorkingDays3YearsToToday > @3YearWorkingDays then 1 else 0 end as Violation3YearNow
     ,case when sum(WorkingDays5YearsToToday) over (partition by UserId) > @5YearWorkingDays then 1 else 0 end as Violation5YearNow

     ,case when WorkingDays3YearsToToday + 30 > @3YearWorkingDays then 1 else 0 end as Violation3Year30Day
     ,case when sum(WorkingDays5YearsToToday) over (partition by UserId) + 30 > @5YearWorkingDays then 1 else 0 end as Violation5Year30Day

     ,case when WorkingDays3YearsToToday + 60 > @3YearWorkingDays then 1 else 0 end as Violation3Year60Day
     ,case when sum(WorkingDays5YearsToToday) over (partition by UserId) + 60 > @5YearWorkingDays then 1 else 0 end as Violation5Year60Day

     ,case when WorkingDays3YearsToToday + 90 > @3YearWorkingDays then 1 else 0 end as Violation3Year90Day
     ,case when sum(WorkingDays5YearsToToday) over (partition by UserId) + 90 > @5YearWorkingDays then 1 else 0 end as Violation5Year90Day
from d
order by UserId
        ,CompanyId;

输出

+--------+-----------+------------+------------+--------------------------+--------------------------+-------------------+-------------------+---------------------+---------------------+---------------------+---------------------+---------------------+---------------------+
| UserId | CompanyId | Limit3Year | Limit5Year | WorkingDays3YearsToToday | WorkingDays5YearsToToday | Violation3YearNow | Violation5YearNow | Violation3Year30Day | Violation5Year30Day | Violation3Year60Day | Violation5Year60Day | Violation3Year90Day | Violation5Year90Day |
+--------+-----------+------------+------------+--------------------------+--------------------------+-------------------+-------------------+---------------------+---------------------+---------------------+---------------------+---------------------+---------------------+
|  27809 |       972 |       1065 |       1643 |                      366 |                      366 |                 0 |                 0 |                   0 |                   0 |                   0 |                   0 |                   0 |                   0 |
|  27853 |       484 |       1065 |       1643 |                      366 |                      366 |                 0 |                 0 |                   0 |                   0 |                   0 |                   0 |                   0 |                   0 |
|  27856 |       172 |       1065 |       1643 |                      366 |                      366 |                 0 |                 0 |                   0 |                   0 |                   0 |                   0 |                   0 |                   0 |
|  27857 |      1234 |       1065 |       1643 |                     1029 |                     1760 |                 0 |                 1 |                   0 |                   1 |                   1 |                   1 |                   1 |                   1 |
|  27858 |      1234 |       1065 |       1643 |                      877 |                     1608 |                 0 |                 0 |                   0 |                   0 |                   0 |                   1 |                   0 |                   1 |
|  27859 |     12340 |       1065 |       1643 |                      365 |                      365 |                 0 |                 1 |                   0 |                   1 |                   0 |                   1 |                   0 |                   1 |
|  27859 |     12345 |       1065 |       1643 |                        0 |                      147 |                 0 |                 1 |                   0 |                   1 |                   0 |                   1 |                   0 |                   1 |
|  27859 |     12346 |       1065 |       1643 |                        0 |                      424 |                 0 |                 1 |                   0 |                   1 |                   0 |                   1 |                   0 |                   1 |
|  27859 |     12347 |       1065 |       1643 |                      147 |                      364 |                 0 |                 1 |                   0 |                   1 |                   0 |                   1 |                   0 |                   1 |
|  27859 |     12348 |       1065 |       1643 |                      364 |                      364 |                 0 |                 1 |                   0 |                   1 |                   0 |                   1 |                   0 |                   1 |
|  27859 |     12349 |       1065 |       1643 |                      395 |                      395 |                 0 |                 1 |                   0 |                   1 |                   0 |                   1 |                   0 |                   1 |
|  27897 |       179 |       1065 |       1643 |                      366 |                      366 |                 0 |                 0 |                   0 |                   0 |                   0 |                   0 |                   0 |                   0 |
+--------+-----------+------------+------------+--------------------------+--------------------------+-------------------+-------------------+---------------------+---------------------+---------------------+---------------------+---------------------+---------------------+
bvk5enib

bvk5enib3#

这就是我的结局。
<无用的计划>
我一直面临的问题是:
我如何处理任何和所有日期范围重叠,并确定合同日期范围内的天数。
客户端仍在使用SQL2008,所以我需要一些旧的(er)学校tsql。
确保准确计算中断时间(合同之间的时间)。
所以我选择了我自己的解决方案,这可能是愚蠢的,因为它需要在内存中为每个工作日/候选组合生成一个记录。我看合同表没有超出5-10公里的记录范围。我要往这个方向走的唯一原因。
我创建了一个日历表,其中包含从1980年1月1日到2050年12月31日的所有日期,然后通过candidateid将合同范围加入日历表。这些将是工作日期。日历表中与合同范围内的日期不匹配的任何日期都是休息日。

日历表

if object_id('CalendarTable') is not null drop table CalendarTable
go

create table CalendarTable (pk int identity, CalendarDate date )

declare @StartDate date = cast('1980-01-01' as date)
declare @EndDate date = cast('2050-12-31' as date)
while @StartDate <= @EndDate
begin
    insert into CalendarTable ( CalendarDate ) values ( @StartDate )
    set @StartDate = dateadd(dd, 1, @StartDate)
end
go

5年违规查询(工作5年,没有6个月的冷静期)

declare @enddate date = dateadd(dd, 30, getdate()) 
declare @beginDate date = dateadd(dd, -180, dateadd(year, -5, getdate()))

select poss.CandidateId, 
min(work.CalendarDate) as FirstWorkDate, 
count(work.CandidateId) as workedDays, 
sum(case when work.CandidateId is null then 1 else 0 end) as breakDays, 
case when count(work.CandidateId) > (365*5) and sum(case when work.CandidateId is null then 1 else 0 end) < (365/2) then 1 else 0 end as Year5Violation,
case when count(work.CandidateId) > (365*5) and sum(case when work.CandidateId is null then 1 else 0 end) < (365/2) then DATEADD(year, 5, min(work.CalendarDate)) else null end as ViolationDate
from 
(
    select cand.CandidateId, cal.CalendarDate
    from CalendarTable cal
    join (select distinct c.CandidateId from contracts c where c.WorkPeriodStart is not null and c.WorkPeriodEnd is not null and c.Deleted = 0) cand on 1 = 1
    where cal.CalendarDate between @beginDate and @enddate
) as poss 
left join 
(
    select distinct c.CandidateId, cal.CalendarDate
    from contracts c
        join CalendarTable cal on cal.CalendarDate between c.WorkPeriodStart and c.WorkPeriodEnd 
    where c.WorkPeriodStart is not null and c.WorkPeriodEnd is not null and c.Deleted = 0
) as work on work.CandidateId = poss.CandidateId and work.CalendarDate = poss.CalendarDate
group by poss.CandidateId

相关问题