SQL Server Select the first and last event per year

cbjzeqam  于 12个月前  发布在  其他
关注(0)|答案(2)|浏览(102)

I have a dataset with a bunch of event data. Some years have one event some have multiple.

I'm trying to only give the first and last event for a particular year.

This is what I have so far:

WITH yearnum (EID, ED, YR, EN, EDS, CID, COD) AS 
(
    SELECT
        eventid, MIN(eventdate) AS minimum_event_date,
        DATEPART(year, EVENTdate) AS year_in_question,
        EventName, EventDetails, CategoryID, CountryID
    FROM
        tblevent 
    GROUP BY
        eventid, EventDate, EventName, EventDetails, CategoryID, CountryID
),
rownumS (EID, ED, YR, EN, EDS, CID, COD, RN) AS
(
    SELECT
        EID, ED, YR, EN, EDS, CID, COD, 
        ROW_NUMBER() OVER (PARTITION BY YR ORDER BY YR ASC) AS rownumyear
    FROM
        yearnum
)
SELECT 
    EID, ED, YR, EN, EDS, CID, COD, RN 
FROM 
    rownums

Result:

Output

8wigbo56

8wigbo561#

Break the problem down into pieces you can process.

  • get a list of events with min date per year/id.
  • get a list of events with max date per year/id.
  • union list for inclusion.
  • Select from from base set only those in our include list.

This may not be the most efficient way alternatives include

  • Use Row_nubmer to get min/max then exlcude (example provided)
  • use table value function cross apply to get min/max per year & event (Example NOT provided)

Assumptions:

  • The PK for tblEvents is the EventID and EventDate
  • EventID alone isn't unique and expect it to be repeated as an event could occur on different dates...
  • SQL Below Untested due to lack of sample data.

Option 1) using exists and correlation to only incldue those with a min/max event date per year which we determine in advance in a common table expression.

WITH IncludeSet as (  --Get the list of records we want to incldue.
  SELECT min(EventDate) EventDate, EventID, Year(EventDate) EventYear
  FROM tblEvent
  GROUP BY EventID,Year(EventDate) 

  UNION -- We don't need union all becuase if an event only occurs once in  
        -- a year, we only need to see that event date once. union does a 
        -- distinct union all wouldn't. 

 SELECT max(EventDate) EventDate, EventID, Year(EventDate) EventYear
  FROM tblEvent
  GROUP BY EventID,Year(EventDate)
)

SELECT E.*  --only pull in the columns you need
FROM tblEvent E
WHERE EXISTS (SELECT 1 
              FROM IncludeSet InS. 
              where E.EventID = InS.EventID
                and E.EventDate = InS.EventDate)

Alternative apporach using rownumber window function...

  • I use the Sub Commmon table expression to getnerate the row_numbers for the eventID, and year to ensure we can identify the min/max event of each year.
  • note the only real difference on the rownumbers is the order by.

.

With Sub as (
SELECT E.*
   , Row_number() over (partition by Year(eventDate), EventID Order by EventDate Desc) as maxRN
   , Row_number() over (partition by Year(eventDate), EventID Order by EventDate ASC) as MinRN
FROM tblEvent)

SELECT Sub.* 
FROM Sub 
WHERE (maxRN=1 or MinRN=1)
  • Depending on indexes on the table, table record size, and other factors one may be "better" than the other in terms of performance.
wvt8vs2t

wvt8vs2t2#

I managed to get it to display what I wanted:

drop table if exist #resultset

drop table if exists #row

with yearnum (EID,ED,YR,EN,EDS,CID,COD) 
AS 
(Select eventid,
min(eventdate) as minimum_event_date,
datepart(year,Eventdate) as year_in_question,
EventName,EventDetails,CategoryID,CountryID
from tblevent
group by eventid,EventDate,EventName,EventDetails,CategoryID,CountryID),

rownumS (EID,ED,YR,EN,EDS,CID,COD,RN) AS
(select EID,ED,YR,EN,EDS,CID,COD, row_number() over (partition by YR order by YR asc) as rownumyear
from yearnum)

select EID,ED,YR,EN,EDS,CID,COD,RN into #resultset from rownums;

select YR, MIN(ED),MAX(ED),MAX(RN) FROM #resultset
Group by YR
order by yr asc

相关问题