MySQL count rows with 1hour difference comparing to other row

hfyxw5xn  于 2022-12-22  发布在  Mysql
关注(0)|答案(1)|浏览(120)

I have the following query:

DROP TABLE IF EXISTS answered;
DROP TABLE IF EXISTS unanswered;

CREATE TABLE answered (
    time2 varchar(255),
    typeOfCall varchar(255),
    calledNumber varchar(255),
    callingNumber varchar(255),
    companyId varchar(255)
   
);

CREATE TABLE unanswered (
    time2 varchar(255),
    typeOfCall varchar(255),
    calledNumber varchar(255),
    callingNumber varchar(255),
    companyId varchar(255)
   
);

INSERT INTO answered (time2, typeOfCall, calledNumber, callingNumber, companyId) VALUES 
('2022-12-19 06:23:00', 'Answered', '+343245', '+3234343', '23234'),
('2022-12-19 08:22:30', 'Answered', '+243245', '+3234344', '23234' ),
('2022-12-19 08:23:00', 'Answered', '+243245', '+3234344', '23234' );

INSERT INTO unanswered (time2, typeOfCall, calledNumber, callingNumber, companyId) VALUES 
('2022-12-19 04:13:00', 'Unanswered', '+343245', '+3234343', '23234'),
('2022-12-19 04:35:00', 'Unanswered', '+343245', '+3234343', '23234'),
('2022-12-19 04:46:00', 'Unanswered', '+343245', '+3234343', '23234'),
('2022-12-19 06:13:00', 'Unanswered', '+343245', '+3234343', '23234'),
('2022-12-19 08:16:00', 'Unanswered', '+2421231', '+3234344', '23234' ),
('2022-12-19 08:22:00', 'Unanswered', '+2433425', '+3234344', '23234' );

 WITH
        t1 AS (
            SELECT DISTINCT 
            time2,
            typeOfCall,
            calledNumber,
            callingNumber,
            companyId
            FROM answered
           
        ),
        t2 AS (
        SELECT DISTINCT 
            time2,
            typeOfCall,
            calledNumber,
            callingNumber,
            companyId
            FROM unanswered
           
           
        ),
    t3 AS (
            SELECT time2, typeOfCall, calledNumber, callingNumber, companyId
            FROM t1
            UNION ALL
            SELECT time2, typeOfCall, calledNumber, callingNumber, companyId
            FROM t2
          ),
    t4 AS (
            SELECT row_number() OVER (PARTITION BY callingNumber, companyId ORDER BY time2) as row_numb,
                time2, typeOfCall, calledNumber, callingNumber, companyId
            FROM t3
          ),
    t5 AS (
            SELECT row_numb, time2, typeOfCall, calledNumber, callingNumber, companyId,
                lag(row_numb, 1, 0) OVER (PARTITION BY callingNumber, companyId ORDER BY time2) as info1
            FROM t4
            WHERE typeOfCall = 'Answered'
          ),
    t6 AS (
            SELECT row_numb, time2, typeOfCall, calledNumber, callingNumber, companyId,
            CASE 
                WHEN (row_numb-info1) = 1 THEN '1 attempt'
                WHEN (row_numb-info1) = 2 THEN '2 attempt'
                ELSE '3 attempt or more'
            END AS typeOfCall2
            FROM t5
          )
    SELECT time2, typeOfCall, typeOfcall2, calledNumber, callingNumber, companyId
    FROM t6

Output:

time2| typeOfcall| typeOfCall2  |  calledNumber  | callingNumber | companyId

2022-12-19 06:23:00 | Answered| 3attempt or more | +343245 | +3234343 |23234

2022-12-19 08:22:30 | Answered | 3attempt or more | +243245 | +3234344 | 23234

2022-12-19 08:23:00 | Answered | 1 attempt | +243245 | +3234344 | 23234

In this query I am making an union between two tables (answered and unanswered) and than I make a partition by so I can count the number of an unanswered call before an answered call, taking into account the callingNumber and the companyId (calledNumber is not relevant). What I am trying to achieve and I could not so far is to count only the unanswered calls with a time frame of 1hour before the answered call, all the other calls shouldn´t be counted. Any idea on how to solve this?
The desired output should be the following:

time2| typeOfcall| typeOfCall2  |  calledNumber  | callingNumber | companyId

2022-12-19 06:23:00 | Answered| 2attempt | +343245 | +3234343 |23234

2022-12-19 08:22:30 | Answered | 3 attempt or more | +243245 | +3234344 | 23234

2022-12-19 08:23:00 | Answered | 1 attempt | +243245 | +3234344 | 23234

In the inital data I should only count the following rows:

'2022-12-19 06:13:00', 'Unanswered', '+343245', '+3234343', '23234'
'2022-12-19 08:16:00', 'Unanswered', '+2421231', '+3234344', '23234' 
'2022-12-19 08:22:00', 'Unanswered', '+2433425', '+3234344', '23234'

Rows not counted:
('2022-12-19 04:13:00', 'Unanswered', '+343245', '+3234343', '23234'),
('2022-12-19 04:35:00', 'Unanswered', '+343245', '+3234343', '23234'),
('2022-12-19 04:46:00', 'Unanswered', '+343245', '+3234343', '23234'),
The reason to not count this rows is because they are more than 1 hour away of an answered call.

cbeh67ev

cbeh67ev1#

As proof of concept, if all that is needed are counts then a lot of partitioning can be avoided:

SELECT a.time2, a.typeOfCall, a.calledNumber, a.callingNumber, a.companyId, a.last_answered,
  count(distinct ua.time2) as unanswered_calls
FROM 
(
  select *, 
    LAG(a.time2, 1) OVER (PARTITION BY a.callingNumber, a.companyId ORDER BY a.time2) as last_answered
  from answered a
) a left join unanswered ua on a.callingnumber = ua.callingnumber
    and a.companyid = ua.companyid
    and (ua.time2 > a.last_answered or a.last_answered is null)
GROUP BY a.time2, a.typeOfCall, a.calledNumber, a.callingNumber, a.companyId, a.last_answered

You would need to filter on the last hour's calls.
The row_number() approach is also valid but if the final details are limited this may not be necessary.

相关问题