如何用mysql中大数据集的count()重新定义慢sql查询

xxe27gdn  于 2021-08-13  发布在  Java
关注(0)|答案(3)|浏览(500)

我有一个这样的sql查询,我需要重新定义它,或者使用索引会有所帮助,但我不知道索引中包含哪些列。 b_answers 大约有上万行 b_projects 大约有几千行 b_users 有几十行
这些 AS count_* 排序需要列。

SELECT
    p.id,
    p.datetime,
    u.name AS u_name,
    p.name,
    p.note,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND  changed != '0000-00-00 00:00:00') AS count_filled,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND started = '1') AS count_started,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id) AS count_sent,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '1' AND  changed != '0000-00-00 00:00:00') AS count_filled_quiz1_a,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '2' AND  changed != '0000-00-00 00:00:00') AS count_filled_quiz1_b,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '1' AND started = '1') AS count_started_quiz1_a,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '2' AND started = '1') AS count_started_quiz1_b,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '1') AS count_sent_quiz1_a
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '2') AS count_sent_quiz1_b,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '3' AND  changed != '0000-00-00 00:00:00') AS count_filled_quiz3_a,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '4' AND  changed != '0000-00-00 00:00:00') AS count_filled_quiz3_b,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '3' AND started = '1') AS count_started_quiz3_a,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '4' AND started = '1') AS count_started_quiz3_b,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '3') AS count_sent_quiz3_a,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '4') AS count_sent_quiz3_b,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '5' AND  changed != '0000-00-00 00:00:00') AS count_filled_quiz5_a,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '6' AND  changed != '0000-00-00 00:00:00') AS count_filled_quiz5_b,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '5' AND started = '1') AS count_started_quiz5_a,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '6' AND started = '1') AS count_started_quiz5_b,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '5') AS count_sent_quiz5_a,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '6') AS count_sent_quiz5_b,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '7' AND  changed != '0000-00-00 00:00:00') AS count_filled_quiz7_a,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '8' AND  changed != '0000-00-00 00:00:00') AS count_filled_quiz7_b,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '7' AND started = '1') AS count_started_quiz7_a,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '8' AND started = '1') AS count_started_quiz7_b,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '7') AS count_sent_quiz7_a,
    (SELECT COUNT(id) FROM b_answers WHERE project = t.id AND quiz = '8') AS count_sent_quiz7_b
FROM 
    b_projects p
LEFT JOIN 
    b_answers a ON a.project = p.id
LEFT JOIN 
    b_users u ON u.id = p.admin
GROUP BY
    p.nazev
kuhbmx9i

kuhbmx9i1#

使用条件聚合!这个想法是:

SELECT p.id, p.datetime, u.name AS u_name, p.name, p.note,
       SUM(a.changed <> '0000-00-00 00:00:00') AS count_filled,
       SUM(a.started = '1') AS count_started,
       . . .   -- and so one for the rest of the columns
FROM b_projects p LEFT JOIN 
     b_answers a 
     ON a.project = p.id LEFT JOIN 
     b_users u
     ON u.id = p.admin
GROUP BY p.id, p.datetime, u.name, p.name, p.note
k0pti3hp

k0pti3hp2#

只是不要重新发明轮子!!!请看这个。哪些列通常是好的索引?
在用于比较(条件)的列中通常需要索引。所以在您的例子中,我认为可以用来提高成本的索引应该在查询的这一部分包含列。
left join b\u回答a on a.project=p.id/考虑使用索引/
left join b\u users u on u.id=p.admin/考虑使用索引/
p.nazev分组/考虑使用索引-仍不确定/
您可以用试错法检查查询的效果。
希望这有帮助。
干杯

brqmpdu1

brqmpdu13#

要获得正确的查询,需要考虑表是否处于1:manyMap中。如果是,您希望计数为“多”还是仅为“1”
我将假设您不需要膨胀值,因此我将首先获取派生表中的计数,然后连接到其他表:

SELECT  p.id, p.datetime, u.name AS u_name, p.name, p.note,
        count_filled, count_started, ...
    FROM  
        ( SELECT
                SUM(a.changed <> '0000-00-00 00:00:00') AS count_filled,
                SUM(a.started = '1') AS count_started,
                ...
            FROM  b_answers AS a 
        ) AS aa
    JOIN  b_projects p  ON aa.project = p.id
    LEFT JOIN  b_users u  ON u.id = p.admin

请注意,这样可以避免 GROUP BY ,从而提供一个加速。二是避免“充气-放气”。
假设 idPRIMARY KEY 对于每个表,不需要额外的索引。

相关问题