sql server—如何在sql中按连续记录分组

zbsbpyhn  于 2021-08-13  发布在  Java
关注(0)|答案(3)|浏览(953)

我的table上有这些记录

ID  Colour
------------
 1   Red
 2   Red
 3   Red
 4   Red
 5   Red
 6   Green
 7   Green
 8   Green
 9   Green
10   Red
11   Red
12   Red
13   Red
14   Green
15   Green
16   Green
17   Blue
18   Blue
19   Red
20   Blue

我可以像这样轻松地按颜色分组

SELECT Colour, MIN(ID) AS iMin, MAX(ID) AS iMax
FROM MyTable
GROUP BY Colour

这将返回此结果

Colour     iMin     iMax
-------------------------
Red        1        19
Green      6        16
Blue       17       20

但这不是我想要的,因为红色不是从1一直到19,绿色打破了顺序。
结果应该是这样的

Colour     iMin     iMax
------------------------
Red        1        5
Green      6        9
Red        10       13
Green      14       16
Blue       17       18
Red        19       19
Blue       20       20

我设法通过游标来实现这一点,但不知道是否有更有效的方法来实现这一点

oalqel3c

oalqel3c1#

这是一个缺口和孤岛问题。假设 id 是连续递增的,可以使用 row_number() 定义具有相同属性的“相邻”记录组 colour :

select 
    colour, 
    min(id) iMin,
    max(id) iMax
from (
    select t.*, row_number() over(partition by colour order by id) rn
    from mytable t
) t
group by colour, id - rn
order by min(id)

db小提琴演示:

colour | iMin | iMax
:----- | ---: | ---:
Red    |    1 |    5
Green  |    6 |    9
Red    |   10 |   13
Green  |   14 |   16
Blue   |   17 |   18
Red    |   19 |   19
Blue   |   20 |   20
gab6jxml

gab6jxml2#

这是一个缺口和岛屿问题。您可以通过行号的不同来解决这个问题:

select colour, min(id), max(id)
from (select t.*,
             row_number() over (order by id) as seqnum,
             row_number() over (partition by colour order by id) as seqnum_c
      from t
     ) t
group by colour, (seqnum - seqnum_c);

这是一把小提琴。
很难解释这是怎么回事。但是,如果您查看子查询的结果,您将看到行号的差异如何标识相邻的颜色。

xoefb8l8

xoefb8l83#

无论id列是否为整数,id列的值是否连续,该查询都有效。

;with c0 as(
select id, color,
       ROW_NUMBER() over(order by id)*
       (case when color <> LAG(color, 1, '') over(order by id) then 1 else 0 end) as color_id
from #temp
), c1 as(
select id, color, color_id, SUM(color_id) over(order by id) as color_gid
from c0
)
select color, MIN(id) as idMin, MAX(id) as idMax
from c1
group by color, color_gid

它可以扩展为按a列排序,按b列的连续值分组,并查找c列的聚合值,如下所示:

;with c0 as(
select C, B,
       ROW_NUMBER() over(order by A)*
       (case when B <> LAG(B, 1, '') over(order by A) then 1 else 0 end) as B_id
from TableName
), c1 as(
select C, B, B_id, SUM(B_id) over(order by A) as B_gid
from c0
)
select B, MIN(C) as CMin, MAX(C) as CMax
from c1
group by B, B_gid

相关问题