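I'm struggling to write a GROUP BY over contiguous blocks. I've used the following as references:
- GROUP BY for continuous rows in SQL
- How can I do a contiguous GROUP BY in MySQL?
- https://gcbenison.wordpress.com/2011/09/26/queries-that-group-tables-by-contiguous-blocks/
The main idea is to capture each period during which a room holds a given state, with its start and end date. Unlike the other examples, I'm using the date per room_id as the indexing field (instead of a sequential id).
My table: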
room_id | calendar_date | state
Sample data:
1 | 2016-03-01 | 'a'
1 | 2016-03-02 | 'a'
1 | 2016-03-03 | 'a'
1 | 2016-03-04 | 'b'
1 | 2016-03-05 | 'b'
1 | 2016-03-06 | 'c'
1 | 2016-03-07 | 'c'
1 | 2016-03-08 | 'c'
1 | 2016-03-09 | 'c'
2 | 2016-04-01 | 'b'
2 | 2016-04-02 | 'a'
2 | 2016-04-03 | 'a'
2 | 2016-04-04 | 'a'
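To make the sample reproducible, the rows above can be loaded with a setup script along these lines. This is only a sketch: the table name rooms is taken from the second attempt in the question, while the column types and the primary key are assumptions, since the post does not show the actual schema.

```sql
-- Assumed schema: column types and the primary key are guesses, not from the post.
CREATE TABLE rooms (
    room_id       INT     NOT NULL,
    calendar_date DATE    NOT NULL,
    state         CHAR(1) NOT NULL,
    PRIMARY KEY (room_id, calendar_date)
);

-- Sample rows exactly as listed in the question.
INSERT INTO rooms (room_id, calendar_date, state) VALUES
    (1, '2016-03-01', 'a'),
    (1, '2016-03-02', 'a'),
    (1, '2016-03-03', 'a'),
    (1, '2016-03-04', 'b'),
    (1, '2016-03-05', 'b'),
    (1, '2016-03-06', 'c'),
    (1, '2016-03-07', 'c'),
    (1, '2016-03-08', 'c'),
    (1, '2016-03-09', 'c'),
    (2, '2016-04-01', 'b'),
    (2, '2016-04-02', 'a'),
    (2, '2016-04-03', 'a'),
    (2, '2016-04-04', 'a');
```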
Goal:
room_id | date_start | date_end | state
1 | 2016-03-01 | 2016-03-03 | a
1 | 2016-03-04 | 2016-03-05 | b
1 | 2016-03-06 | 2016-03-09 | c
2 | 2016-04-01 | 2016-04-01 | b
2 | 2016-04-02 | 2016-04-04 | a
I've made two attempts at this:
1)
SELECT
    rooms.row_new,
    rooms.state_new,
    MIN(rooms.room_id) AS room_id,
    MIN(rooms.state) AS state,
    MIN(rooms.date) AS date_start,
    MAX(rooms.date) AS date_end
FROM
    (
        SELECT @r := @r + (@state != state) AS row_new,
               @state := state AS state_new,
               rooms_vw.*
        FROM (
                 SELECT @r := 0,
                        @state := ''
             ) AS vars,
             rooms_vw
        ORDER BY room_id, date
    ) AS rooms
WHERE room_id = 1
GROUP BY row_new
ORDER BY room_id, date
;
This is very close to working, but when I print out row_new it starts to jump (1, 2, 3, 5, 7, ...).
2)
SELECT
    MIN(rooms_final.calendar_date) AS date_start,
    MAX(rooms_final.calendar_date) AS date_end,
    rooms_final.state,
    rooms_final.room_id,
    COUNT(*)
FROM (SELECT
          rooms.date,
          rooms.state,
          rooms.room_id,
          CASE
              WHEN rooms_merge.state IS NULL OR rooms_merge.state != rooms.state THEN
                  @rownum := @rownum+1
              ELSE
                  @rownum
          END AS row_num
      FROM rooms
      JOIN (SELECT @rownum := 0) AS row
      LEFT JOIN (SELECT rooms.date + INTERVAL 1 DAY AS date,
                        rooms.state,
                        rooms.room_id
                 FROM rooms) AS rooms_merge ON rooms_merge.calendar_date = rooms.calendar_date AND rooms_merge.room_id = rooms.room_id
      ORDER BY rooms.room_id, rooms.calendar_date
     ) AS rooms_final
GROUP BY rooms_final.state, rooms_final.row_num
ORDER BY room_id, calendar_date;
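For some reason this returns some results with a NULL room_id, and it is generally inaccurate.
2 Answers
Answer 1 (hgncfbus)
Thanks to @Gordon Linoff for giving me the insight to get to this answer:
Answer 2 (mzillmmw)
Working with variables is a bit tricky. I would go for: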
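The query from this answer is not preserved in this copy of the thread. What follows is a sketch of one variable-based query in that spirit, not the answerer's original text. It assumes the rooms table and calendar_date column from the question; the names @grp, @rs, params and t are made up for illustration. The comparison and the reassignment of @rs happen inside a single expression, and the rows are ordered in a subquery before the variables are applied.

```sql
SELECT room_id,
       MIN(calendar_date) AS date_start,
       MAX(calendar_date) AS date_end,
       state
FROM (
    SELECT r.*,
           -- @rs holds the previous row's "room_id:state" key.
           -- Same key: keep the current group number.
           -- Different key: store the new key and move to the next group.
           -- Both happen in one expression, so the result does not rely on
           -- the order in which MySQL evaluates separate select-list items.
           (@grp := IF(@rs = CONCAT_WS(':', room_id, state),
                       @grp,
                       IF(@rs := CONCAT_WS(':', room_id, state),
                          @grp + 1,
                          @grp + 1))
           ) AS grp
    FROM (SELECT * FROM rooms ORDER BY room_id, calendar_date) AS r
    CROSS JOIN (SELECT @grp := 0, @rs := '') AS params
) AS t
GROUP BY room_id, state, grp
ORDER BY room_id, date_start;
```

Setting user variables inside expressions is deprecated as of MySQL 8.0, so this pattern is best treated as a fallback for older servers that lack window functions.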
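Notes:
- Assigning a variable in one expression and using it in another is unsafe. MySQL does not guarantee the order in which expressions are evaluated.
- In more recent versions of MySQL, you need to do the ORDER BY in a subquery.
- In the most recent versions, you can use row_number(), which greatly simplifies the calculation.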
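As a sketch of the row_number() route mentioned in the last note (MySQL 8.0 or later), the usual difference-of-row-numbers trick can identify each contiguous block. The table and column names follow the question; the aliases t and grp are illustrative. Within one room, the per-room row number minus the per-room-and-state row number stays constant for as long as the state does not change, so that difference serves as a block identifier.

```sql
SELECT room_id,
       MIN(calendar_date) AS date_start,
       MAX(calendar_date) AS date_end,
       state
FROM (
    SELECT r.*,
           -- Position of the row within its room, minus its position within
           -- its room and state. The difference is constant inside each
           -- uninterrupted run of one state.
           ROW_NUMBER() OVER (PARTITION BY room_id ORDER BY calendar_date)
         - ROW_NUMBER() OVER (PARTITION BY room_id, state ORDER BY calendar_date) AS grp
    FROM rooms AS r
) AS t
GROUP BY room_id, state, grp
ORDER BY room_id, date_start;
```

This groups consecutive rows rather than consecutive calendar days, so it assumes each room has one row per day with no missing dates, as in the sample data. If days can be missing and a gap should end a block, the grouping key would need to account for the date itself.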