如何用mysql中大数据集的count()重新定义慢sql查询

xxe27gdn  于 2021-08-13  发布在  Java
关注(0)|答案(3)|浏览(626)

我有一个这样的sql查询,我需要重新定义它,或者使用索引会有所帮助,但我不知道索引中包含哪些列。 b_answers 大约有上万行 b_projects 大约有几千行 b_users 有几十行
这些 AS count_* 排序需要列。

  1. SELECT
  2. p.id,
  3. p.datetime,
  4. u.name AS u_name,
  5. p.name,
  6. p.note,
  7. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND changed != '0000-00-00 00:00:00') AS count_filled,
  8. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND started = '1') AS count_started,
  9. (SELECT COUNT(id) FROM b_answers WHERE project = t.id) AS count_sent,
  10. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '1' AND changed != '0000-00-00 00:00:00') AS count_filled_quiz1_a,
  11. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '2' AND changed != '0000-00-00 00:00:00') AS count_filled_quiz1_b,
  12. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '1' AND started = '1') AS count_started_quiz1_a,
  13. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '2' AND started = '1') AS count_started_quiz1_b,
  14. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '1') AS count_sent_quiz1_a
  15. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '2') AS count_sent_quiz1_b,
  16. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '3' AND changed != '0000-00-00 00:00:00') AS count_filled_quiz3_a,
  17. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '4' AND changed != '0000-00-00 00:00:00') AS count_filled_quiz3_b,
  18. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '3' AND started = '1') AS count_started_quiz3_a,
  19. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '4' AND started = '1') AS count_started_quiz3_b,
  20. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '3') AS count_sent_quiz3_a,
  21. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '4') AS count_sent_quiz3_b,
  22. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '5' AND changed != '0000-00-00 00:00:00') AS count_filled_quiz5_a,
  23. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '6' AND changed != '0000-00-00 00:00:00') AS count_filled_quiz5_b,
  24. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '5' AND started = '1') AS count_started_quiz5_a,
  25. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '6' AND started = '1') AS count_started_quiz5_b,
  26. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '5') AS count_sent_quiz5_a,
  27. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '6') AS count_sent_quiz5_b,
  28. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '7' AND changed != '0000-00-00 00:00:00') AS count_filled_quiz7_a,
  29. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '8' AND changed != '0000-00-00 00:00:00') AS count_filled_quiz7_b,
  30. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '7' AND started = '1') AS count_started_quiz7_a,
  31. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '8' AND started = '1') AS count_started_quiz7_b,
  32. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '7') AS count_sent_quiz7_a,
  33. (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '8') AS count_sent_quiz7_b
  34. FROM
  35. b_projects p
  36. LEFT JOIN
  37. b_answers a ON a.project = p.id
  38. LEFT JOIN
  39. b_users u ON u.id = p.admin
  40. GROUP BY
  41. p.nazev
kuhbmx9i

kuhbmx9i1#

使用条件聚合!这个想法是:

  1. SELECT p.id, p.datetime, u.name AS u_name, p.name, p.note,
  2. SUM(a.changed <> '0000-00-00 00:00:00') AS count_filled,
  3. SUM(a.started = '1') AS count_started,
  4. . . . -- and so one for the rest of the columns
  5. FROM b_projects p LEFT JOIN
  6. b_answers a
  7. ON a.project = p.id LEFT JOIN
  8. b_users u
  9. ON u.id = p.admin
  10. GROUP BY p.id, p.datetime, u.name, p.name, p.note
k0pti3hp

k0pti3hp2#

只是不要重新发明轮子!!!请看这个。哪些列通常是好的索引?
在用于比较(条件)的列中通常需要索引。所以在您的例子中,我认为可以用来提高成本的索引应该在查询的这一部分包含列。
left join b\u回答a on a.project=p.id/考虑使用索引/
left join b\u users u on u.id=p.admin/考虑使用索引/
p.nazev分组/考虑使用索引-仍不确定/
您可以用试错法检查查询的效果。
希望这有帮助。
干杯

brqmpdu1

brqmpdu13#

要获得正确的查询,需要考虑表是否处于1:manyMap中。如果是,您希望计数为“多”还是仅为“1”
我将假设您不需要膨胀值,因此我将首先获取派生表中的计数,然后连接到其他表:

  1. SELECT p.id, p.datetime, u.name AS u_name, p.name, p.note,
  2. count_filled, count_started, ...
  3. FROM
  4. ( SELECT
  5. SUM(a.changed <> '0000-00-00 00:00:00') AS count_filled,
  6. SUM(a.started = '1') AS count_started,
  7. ...
  8. FROM b_answers AS a
  9. ) AS aa
  10. JOIN b_projects p ON aa.project = p.id
  11. LEFT JOIN b_users u ON u.id = p.admin

请注意,这样可以避免 GROUP BY ,从而提供一个加速。二是避免“充气-放气”。
假设 idPRIMARY KEY 对于每个表,不需要额外的索引。

展开查看全部

相关问题