SQL Server: Defining the first event: already present or newly occurred?

ewm0tg9j, posted on 2023-06-28 in Other

I have longitudinal data stored in a separate table for each year. For each ID, I want to determine whether an event 'T' was already present or has newly occurred, and derive the event date.

Example data

db<>fiddle

CREATE TABLE Table_2010 (
    ID INT,
    EventDate DATE,
    Event CHAR(1)
);

CREATE TABLE Table_2011 (
    ID INT,
    EventDate DATE,
    Event CHAR(1)
);

CREATE TABLE Table_2012 (
    ID INT,
    EventDate DATE,
    Event CHAR(1)
);

CREATE TABLE Table_2013 (
    ID INT,
    EventDate DATE,
    Event CHAR(1)
);

CREATE TABLE Table_2014 (
    ID INT,
    EventDate DATE,
    Event CHAR(1)
);

INSERT INTO Table_2010 (ID, EventDate, Event) VALUES
    (1, '2010-01-01', 'U'),
    (1, '2010-02-01', 'U'),
    (2, '2010-01-15', 'T'),
    (2, '2010-02-15', 'V');

INSERT INTO Table_2011 (ID, EventDate, Event) VALUES
    (1, '2011-01-01', 'T'),
    (1, '2011-02-01', 'V'),
    (2, '2011-01-15', 'X'),
    (2, '2011-02-15', 'Z'),
    (2, '2011-03-01', 'T'),
    (3, '2011-02-20', 'T'),
    (3, '2011-03-30', 'Z');

INSERT INTO Table_2012 (ID, EventDate, Event) VALUES
    (1, '2012-01-01', 'U'),
    (1, '2012-02-01', 'T'),
    (2, '2012-01-15', 'T'),
    (2, '2012-02-15', 'Z'),
    (2, '2012-03-01', 'Z');

INSERT INTO Table_2013 (ID, EventDate, Event) VALUES
    (1, '2013-01-01', 'T'),
    (1, '2013-02-01', 'Z'),
    (2, '2013-01-15', 'T'),
    (2, '2013-02-15', 'Y');

INSERT INTO Table_2014 (ID, EventDate, Event) VALUES
    (1, '2014-01-01', 'Z'),
    (1, '2014-02-01', 'T'),
    (2, '2014-01-15', 'T'),
    (2, '2014-02-15', 'X'),
    (2, '2014-03-01', 'Z');

So my go-to approach would be as follows:

SELECT ID, 
    MIN(CASE WHEN Event = 'T' THEN EventDate END) AS T_StartDate
FROM (
    SELECT ID, EventDate, Event
    FROM Table_2010
    WHERE Event IN ('T')
    UNION ALL
    SELECT ID, EventDate, Event
    FROM Table_2011
    WHERE Event IN ('T')
    UNION ALL
    SELECT ID, EventDate, Event
    FROM Table_2012
    WHERE Event IN ('T')
    UNION ALL
    SELECT ID, EventDate, Event
    FROM Table_2013
    WHERE Event IN ('T')
    UNION ALL
    SELECT ID, EventDate, Event
    FROM Table_2014
    WHERE Event IN ('T')
) AS AllEvents
GROUP BY ID;

The problem is that I do not know what happened before the first entry of an ID.

Consider IDs 1 and 2, which are both present in Table_2010:

ID 1 does not have the event in 2010 but does in 2011. Because the event did not happen before 2011, I would classify it as newly occurred. ID 2 already had the event in 2010, so it needs to be considered already present.

In my opinion, I need to verify whether there is an event-free period before the first occurrence.

How can I implement that?

Expected output (dummy coded 1: Yes; 0: No):

| ID | EventDate | Newly occurred |
| ------------ | ------------ | ------------ |
| 1 | 2011-01-01 | 1 |
| 2 | 2010-01-15 | 0 |
| 3 | 2011-02-20 | 0 |

Answer #1 (c9qzyr3d)

This might still need some work, as I still don't understand how later years affect the result, but based on the explanations so far, something in this direction may work.

I added an intermediate step to make the logic easier to follow:

WITH CTE_T_Events AS
(
    SELECT ID, EventDate, Event
    FROM Table_2010
    WHERE Event IN ('T')
    UNION ALL
    SELECT ID, EventDate, Event
    FROM Table_2011
    WHERE Event IN ('T')
    UNION ALL
    SELECT ID, EventDate, Event
    FROM Table_2012
    WHERE Event IN ('T')
    UNION ALL
    SELECT ID, EventDate, Event
    FROM Table_2013
    WHERE Event IN ('T')
    UNION ALL
    SELECT ID, EventDate, Event
    FROM Table_2014
    WHERE Event IN ('T')
), 
CTE_Group AS
(
    SELECT ID, MIN(EventDate) AS T_StartDate 
    , CASE WHEN EXISTS (SELECT * FROM Table_2010 x WHERE x.ID = t.ID) THEN 1 ELSE 0 END as Exists2010
    , CASE WHEN Exists (SELECT * FROM Table_2010 y WHERE y.ID = t.ID AND y.Event ='T') THEN 1 ELSE 0 END as T_Exists2010
    FROM CTE_T_Events t
    GROUP BY ID
)
SELECT *
, CASE WHEN Exists2010 = 1 AND T_Exists2010 = 0 THEN 1 ELSE 0 END as NewlyOccured
FROM CTE_Group

Fiddle with and without 3-X
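The query above hard-codes Table_2010 as the baseline year. As a hedged variation, the "event-free period" from the question can also be checked per ID: an event is flagged as newly occurred only if the ID has at least one row, of any event type, dated before its first 'T'. This is a sketch under that assumed definition, not a confirmed requirement; the CTE names `AllRows` and `FirstT` and the column alias `NewlyOccurred` are illustrative.

```sql
-- Sketch: flag the first 'T' as newly occurred only when the ID was
-- already observed (any event) before that date.
WITH AllRows AS (
    -- All events of all years, unfiltered, so earlier non-'T' rows are visible
    SELECT ID, EventDate, Event FROM Table_2010
    UNION ALL
    SELECT ID, EventDate, Event FROM Table_2011
    UNION ALL
    SELECT ID, EventDate, Event FROM Table_2012
    UNION ALL
    SELECT ID, EventDate, Event FROM Table_2013
    UNION ALL
    SELECT ID, EventDate, Event FROM Table_2014
),
FirstT AS (
    -- First 'T' date per ID
    SELECT ID, MIN(EventDate) AS T_StartDate
    FROM AllRows
    WHERE Event = 'T'
    GROUP BY ID
)
SELECT f.ID,
       f.T_StartDate,
       CASE WHEN EXISTS (SELECT 1
                         FROM AllRows a
                         WHERE a.ID = f.ID
                           AND a.EventDate < f.T_StartDate)
            THEN 1 ELSE 0 END AS NewlyOccurred
FROM FirstT f
ORDER BY f.ID;
```

On the sample data from the question, ID 1 has 'U' rows in 2010 before its first 'T' on 2011-01-01, so it is flagged 1. IDs 2 and 3 have no rows before their first 'T', so both are flagged 0. If a minimum length of the event-free period is required, the date comparison in the EXISTS subquery would need to be tightened accordingly.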

Answer #2 (8gsdolmq)

The WITH clause is needed in this case because I also use the minimum ID to determine whether the event is new.

That said, keeping multiple tables of identical design is not recommended unless there is a really good reason for it.

WITH AllEvents AS (
    SELECT ID, EventDate, Event
    FROM Table_2010
    WHERE Event IN ('T')
    UNION ALL
    SELECT ID, EventDate, Event
    FROM Table_2011
    WHERE Event IN ('T')
    UNION ALL
    SELECT ID, EventDate, Event
    FROM Table_2012
    WHERE Event IN ('T')
    UNION ALL
    SELECT ID, EventDate, Event
    FROM Table_2013
    WHERE Event IN ('T')
    UNION ALL
    SELECT ID, EventDate, Event
    FROM Table_2014
    WHERE Event IN ('T')
)
SELECT ID,
    MIN(event) event,
    MIN(EventDate) AS T_StartDate,
    CASE WHEN ID = (SELECT MIN(ID) FROM AllEvents) THEN 1 ELSE 0 END new_evenbnt6
FROM AllEvents
GROUP BY ID

| ID | event | T_StartDate | new_evenbnt6 |
| ------------ | ------------ | ------------ | ------------ |
| 1 | T | 2011-01-01 | 1 |
| 2 | T | 2010-01-15 | 0 |
| 3 | T | 2011-02-20 | 0 |

fiddle

For your new requirement, the additional event also has to be included in AllEvents:

WITH AllEvents AS (
    SELECT ID, EventDate, Event
    FROM Table_2010
    WHERE Event IN ('T','X')
    UNION ALL
    SELECT ID, EventDate, Event
    FROM Table_2011
    WHERE Event IN ('T','X')
    UNION ALL
    SELECT ID, EventDate, Event
    FROM Table_2012
    WHERE Event IN ('T','X')
    UNION ALL
    SELECT ID, EventDate, Event
    FROM Table_2013
    WHERE Event IN ('T','X')
    UNION ALL
    SELECT ID, EventDate, Event
    FROM Table_2014
    WHERE Event IN ('T','X')
)
SELECT ID,
    event,
    MIN(EventDate) AS T_StartDate,
    CASE WHEN Year(MIN(EventDate)) = (SELECT Year(MIN(EventDate)) FROM AllEvents) THEN 1 ELSE 0 END new_evenbnt6
FROM AllEvents
GROUP BY ID, event

| ID | event | T_StartDate | new_evenbnt6 |
| ------------ | ------------ | ------------ | ------------ |
| 1 | T | 2011-01-01 | 0 |
| 2 | T | 2010-01-15 | 1 |
| 3 | T | 2011-02-20 | 0 |
| 2 | X | 2011-01-15 | 0 |
| 3 | X | 2010-02-15 | 1 |

fiddle
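Regarding the earlier remark about identically designed tables: if the yearly tables cannot be merged, one option is to hide the repeated UNION ALL behind a view so that every query reads from a single source. This is only a sketch; the view name `AllYearEvents` is hypothetical, and `GO` is the batch separator used by client tools such as SSMS and sqlcmd (CREATE VIEW has to be the first statement in its batch).

```sql
-- Hypothetical view that presents the yearly tables as one logical table
CREATE VIEW AllYearEvents AS
    SELECT ID, EventDate, Event FROM Table_2010
    UNION ALL
    SELECT ID, EventDate, Event FROM Table_2011
    UNION ALL
    SELECT ID, EventDate, Event FROM Table_2012
    UNION ALL
    SELECT ID, EventDate, Event FROM Table_2013
    UNION ALL
    SELECT ID, EventDate, Event FROM Table_2014;
GO

-- Queries then filter once instead of once per yearly table
SELECT ID, MIN(EventDate) AS T_StartDate
FROM AllYearEvents
WHERE Event = 'T'
GROUP BY ID;
```

When a table for a new year is added, only the view definition has to change, not every query that depends on it.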
