sql—每月重置的增量工作日列

bzzcjhmw  于 2021-06-25  发布在  Hive
关注(0)|答案(1)|浏览(325)

我需要创建一个表,其中包含1)一年中所有365天的记录,2)一个计数器,表示一个月中的哪一个工作日。非工作日应以0表示。例如:

Date       |  Business Day
2019-10-01    1
2019-10-02    2
2019-10-03    3
2019-10-04    4
2019-10-05    0    // Saturday
2019-10-06    0    // Sunday     
2019-10-07    5   
....
2019-11-01    1
2019-11-02    0    //  Saturday
2019-11-03    0    //  Sunday
2019-11-04    2

到目前为止,我已经能够创建一个包含一年中所有日期的表。

CREATE TABLE ${TMPID}_days_of_the_year 
(
  `theDate` STRING
);

INSERT OVERWRITE TABLE ${TMPID}_days_of_the_year 
select
    dt_set.theDate
  from
  (
  -- last 0~99 months
    select date_sub('2019-12-31', a.s + 10*b.s + 100*c.s) as theDate
    from
    (
      select 0 as s union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
    ) a
    cross join
    (
      select 0 as s union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
    ) b
    cross join
    (
      select 0 as s union all select 1 union all select 2 union all select 3
    ) c
  ) dt_set
  where dt_set.theDate between '2019-01-01' and '2019-12-31'
  order by dt_set.theDate DESC;

我还有一个包含所有周末和假日的表(这个数据是从文件中加载的,日期格式是yyyy-mm-dd)

CREATE TABLE ${TMPID}_company_holiday 
(
  `holidayDate` STRING
) 
;
LOAD DATA LOCAL INPATH '${FILE}' INTO TABLE ${TMPID}_company_holiday;

我的问题是。。。。在创建上面的示例数据中所示的business day counter列时,如何将这些表连接在一起?

zhte4eai

zhte4eai1#

你可以用 row_number() 用于枚举。这有点棘手,因为它需要有条件,但您需要的信息是由 left join :

select dy.*,
       (case when ch.holiday_date is null
             then row_number() over (partition by trunc(dy.date, 'MONTH'), ch.holiday_date
                                     order by dy.date
                                    )
             else 0
        end) as business_day
from days_of_the_year dy left join
     company_holiday ch
     on dy.date = ch.holiday_date;

相关问题