Query for a type breakdown based on count results

gcuhipw9 posted on 2022-10-03 in Mysql

I have the following table <STATE_TABLE>, which tracks the entries for each id per minute together with their state:

    minute id type
    ------ -- ----
    1      A  solid
    1      A  solid
    1      A  solid
    1      A  liquid
    1      B  solid
    1      B  solid
    1      B  liquid
    .... 1000+ rows ...
    1      ZZX  liquid
    1      ZZZ  liquid
    2      A  solid
    2      A  solid
    2      A  liquid

With the following query, I can get the top 1000 <minute, id> pairs by number of occurrences:

With TempIds AS (
   SELECT
      state_table.minute as minute,
      state_table.id as id,
      COUNT(*) AS count
   FROM
      state_table
   GROUP BY 1,2
) SELECT
   TempIds.minute,
   TempIds.id,
   TempIds.count
FROM
   TempIds
ORDER BY 3 DESC
LIMIT 1000
;

For example:

    minute id count
    ------ -- ----
    2      B  1002
    3      A  990
    1      C  800
    3      B  798

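How can I modify my query to also get the types for each id? For example, there are 1002 rows with <minute=2, id=B>. Is there a way to find out that 402 of them are solid and 600 are liquid?

    minute id count type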
    ------ -- ---- -----
    2      B  402  solid
    2      B  600  liquid
    3      A  330  solid
    3      A  660  liquid

The only way I can think of is a rather complicated nested query:

With TempTop AS (
    With TempIds AS (
       SELECT
          state_table.minute as minute,
          state_table.id as id,
          COUNT(*) AS count
       FROM
          state_table
       GROUP BY 1,2
    ) SELECT
       TempIds.minute as minute,
       TempIds.id as id,
       TempIds.count
    FROM
       TempIds
    ORDER BY 3 DESC
    LIMIT 1000
) SELECT
    state_table.minute,
    state_table.id,
    state_table.type,
    COUNT(*)
FROM
   state_table, TempTop
WHERE
    state_table.minute = TempTop.minute
    AND state_table.id = TempTop.id
GROUP BY 1,2,3
;

Is there a simpler way to write this query? The goal:

  • For the top 1000 most frequent pairs, get the breakdown by type.

wrrgggsh1#

Your query is essentially:

SELECT minute, id, COUNT(*)
FROM state_table
GROUP BY minute, id
ORDER BY COUNT(*) DESC
LIMIT 1000;

In doing so, you lose the type. Because you want the top 1000 minute/id pairs, you cannot simply group by minute, id, type.

If this is only about the types 'solid' and 'liquid', you can apply conditional aggregation to get separate counts:

SELECT 
  minute, id, 
  COUNT(*) AS total,
  SUM(type = 'solid') AS solid,
  SUM(type = 'liquid') AS liquid
FROM state_table
GROUP BY minute, id
ORDER BY COUNT(*) DESC
LIMIT 1000;

Summing a boolean expression works in MySQL because there TRUE equals 1 and FALSE equals 0.
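To try the conditional aggregation on a handful of rows, here is a minimal sketch that runs it through Python's built-in sqlite3 module. SQLite is only a stand-in for MySQL here (it also evaluates a comparison to 1 or 0); the sample rows, the variable names, and the extra tie-breaking sort keys are assumptions made for illustration. The CASE variant is the portable spelling for databases that do not treat booleans as integers.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE state_table (minute INTEGER, id TEXT, type TEXT)")

# Hypothetical sample rows: (minute, id, type).
sample_rows = [
    (1, "A", "solid"), (1, "A", "solid"), (1, "A", "liquid"),
    (1, "B", "solid"), (1, "B", "liquid"),
    (2, "A", "liquid"),
]
conn.executemany("INSERT INTO state_table VALUES (?, ?, ?)", sample_rows)

# Boolean sum: each comparison evaluates to 1 or 0, so SUM counts the matches.
# minute, id are added to ORDER BY only to make the output order deterministic.
boolean_sum = conn.execute("""
    SELECT minute, id,
           COUNT(*) AS total,
           SUM(type = 'solid') AS solid,
           SUM(type = 'liquid') AS liquid
    FROM state_table
    GROUP BY minute, id
    ORDER BY COUNT(*) DESC, minute, id
    LIMIT 1000
""").fetchall()

# Portable spelling of the same idea with CASE expressions.
case_sum = conn.execute("""
    SELECT minute, id,
           COUNT(*) AS total,
           SUM(CASE WHEN type = 'solid' THEN 1 ELSE 0 END) AS solid,
           SUM(CASE WHEN type = 'liquid' THEN 1 ELSE 0 END) AS liquid
    FROM state_table
    GROUP BY minute, id
    ORDER BY COUNT(*) DESC, minute, id
    LIMIT 1000
""").fetchall()

for row in boolean_sum:
    print(row)
```

Both spellings return one row per minute/id pair with the total and the per-type counts side by side.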

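By the way, the problem with the above query is ties. If two pairs share the same count at position 1000, one of them is picked arbitrarily, instead of showing only 999 or 1001 pairs so that the tied pairs are treated alike. I would therefore probably rewrite the query with DENSE_RANK so that ties are handled properly.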
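One way such a DENSE_RANK rewrite could look is sketched below, run through Python's built-in sqlite3 module as a stand-in for MySQL 8 (window functions need SQLite 3.25 or later). The sample data, the top_n cutoff of 2 in place of 1000, and all variable names are assumptions for illustration. Pairs with equal totals share a rank, so a tie at the cutoff keeps every tied pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE state_table (minute INTEGER, id TEXT, type TEXT)")

# Hypothetical data: pair (1, A) has 3 rows, (1, B) and (2, A) tie with 2 rows
# each, and (2, B) has 1 row.
sample_rows = (
    [(1, "A", "solid")] * 2 + [(1, "A", "liquid")]
    + [(1, "B", "solid"), (1, "B", "liquid")]
    + [(2, "A", "liquid")] * 2
    + [(2, "B", "solid")]
)
conn.executemany("INSERT INTO state_table VALUES (?, ?, ?)", sample_rows)

top_n = 2  # stands in for the 1000 of the original query

ranked_pairs = conn.execute("""
    SELECT minute, id, total, solid, liquid
    FROM (
        SELECT counted.*,
               DENSE_RANK() OVER (ORDER BY total DESC) AS rnk
        FROM (
            SELECT minute, id,
                   COUNT(*) AS total,
                   SUM(type = 'solid') AS solid,
                   SUM(type = 'liquid') AS liquid
            FROM state_table
            GROUP BY minute, id
        ) counted
    ) ranked
    WHERE rnk <= ?
    ORDER BY rnk, minute, id
""", (top_n,)).fetchall()

for row in ranked_pairs:
    print(row)
```

With a cutoff of 2 the sample yields three pairs, because the two pairs tied on a total of 2 both receive rank 2.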

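For the more complex case, where the types are not known when writing the query, you need rows rather than columns, as already shown in your request. In that case you do need to group by minute, id, type first. The simplest way to get the totals per pair is SUM OVER. Then rank the pairs with DENSE_RANK as mentioned above and keep the top 1000.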

SELECT minute, id, type, cnt
FROM
(
  SELECT
    minute, id, type, cnt,
    DENSE_RANK() OVER (ORDER BY total DESC) AS rnk
  FROM
  (
    SELECT
      minute, id, type,
      COUNT(*) AS cnt,
      SUM(COUNT(*)) OVER(PARTITION BY minute, id) AS total
    FROM state_table
    GROUP BY minute, id, type
  ) counted
) ranked
WHERE rnk <= 1000
ORDER BY rnk, minute, id, type;

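In case of ties, this can give you more than 1000 minute/id pairs. You can reduce this by using RANK instead of DENSE_RANK. If neither approach gives you the number of pairs you want, you may have to determine the minute/id pairs separately in a subquery: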

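SELECT s.minute, s.id, s.type, COUNT(*)
FROM state_table s
JOIN
(
  -- MySQL rejects LIMIT inside an IN (...) subquery, so join a derived table instead
  SELECT minute, id
  FROM state_table
  GROUP BY minute, id
  ORDER BY COUNT(*) DESC
  LIMIT 1000
) top_pairs ON top_pairs.minute = s.minute AND top_pairs.id = s.id
GROUP BY s.minute, s.id, s.type
ORDER BY
  SUM(COUNT(*)) OVER (PARTITION BY s.minute, s.id) DESC,
  s.minute, s.id, s.type;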

8oomwypt2#

The best approach I can think of still needs two CTEs, but they do not have to be nested.

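You have 3 tasks to complete:

1. The first CTE counts the records per minute/id pair.
2. The second CTE uses the row_number() window function to assign row numbers in descending order of that count.
3. The final SELECT joins the state table with the second CTE, uses a WHERE clause to keep row_number <= 1000, and groups by minute, id and type.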

With TempIdsCount AS (
  -- 1. count the rows per minute/id pair
  SELECT
    state_table.minute as minute,
    state_table.id as id,
    COUNT(*) rw_cnt
  FROM
    state_table
  GROUP BY
    1,
    2
), TempIdsRowNum AS (
  -- 2. number the pairs by descending count
  SELECT
    minute,
    id,
    Row_number() over (order by rw_cnt desc) rw_num
  FROM
    TempIdsCount
)
-- 3. break the top 1000 pairs down by type
SELECT
  st.minute,
  st.id,
  st.type,
  Count(*) cnt
FROM
  state_table st
  Join TempIdsRowNum trn on st.minute = trn.minute
  and st.id = trn.id
Where
  trn.rw_num <= 1000
Group by
  st.minute,
  st.id,
  st.type
;
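
To see the three steps work together on a few rows, the sketch below runs the two-CTE query through Python's built-in sqlite3 module as a stand-in for MySQL 8 (window functions need SQLite 3.25 or later). The sample data, the top_n cutoff of 2 in place of 1000, and the extra minute, id sort keys inside ROW_NUMBER are assumptions added for illustration; without such a tie-breaker, row_number() picks among tied pairs arbitrarily.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE state_table (minute INTEGER, id TEXT, type TEXT)")

# Hypothetical data: (1, A) has 3 rows, (1, B) has 2 rows, (2, A) has 1 row.
sample_rows = [
    (1, "A", "solid"), (1, "A", "solid"), (1, "A", "liquid"),
    (1, "B", "solid"), (1, "B", "liquid"),
    (2, "A", "liquid"),
]
conn.executemany("INSERT INTO state_table VALUES (?, ?, ?)", sample_rows)

top_n = 2  # stands in for the 1000 of the original query

breakdown = conn.execute("""
    WITH TempIdsCount AS (
        SELECT minute, id, COUNT(*) AS rw_cnt
        FROM state_table
        GROUP BY minute, id
    ), TempIdsRowNum AS (
        SELECT minute, id,
               ROW_NUMBER() OVER (ORDER BY rw_cnt DESC, minute, id) AS rw_num
        FROM TempIdsCount
    )
    SELECT st.minute, st.id, st.type, COUNT(*) AS cnt
    FROM state_table st
    JOIN TempIdsRowNum trn ON st.minute = trn.minute AND st.id = trn.id
    WHERE trn.rw_num <= ?
    GROUP BY st.minute, st.id, st.type
    ORDER BY st.minute, st.id, st.type
""", (top_n,)).fetchall()

for row in breakdown:
    print(row)
```

Only the two most frequent pairs survive the cutoff, and each of them is split into one row per type.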
