How can I optimize my query that joins the same table three times?

bybem2ql posted on 2021-07-26 in Java

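I want to get the IDs of the customers who bought my product in every one of the last three months. Today is 2020-02-15, so I want the customers who bought in November 2019, December 2019, and January 2020.
I have only one table, orders (MySQL), shown below:
orders table (primary key = ID, auto-increment):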

-----------------------------------------------
|      ID      |    id_cust  |    buy_date    |
-----------------------------------------------
|       1      |       10    |   2019-11-01   | 
|       2      |       11    |   2019-11-10   |
|       3      |       10    |   2019-12-11   |
|       4      |       12    |   2019-12-12   |
|       5      |       10    |   2020-01-13   |
|       6      |       11    |   2020-01-14   |
|       7      |       12    |   2020-01-15   |
-----------------------------------------------

Based on my requirement, the answer is id_cust 10.
I tried the following, and it returns that result:

SELECT g1.`id_cust`
FROM `orders` g1 
    JOIN `orders` g2
    ON g2.`id_cust`   = g1.`id_cust`
      AND g2.`buy_date` >= STR_TO_DATE(CONCAT('01-', LPAD(MONTH(DATE_SUB(NOW(), INTERVAL 2 MONTH)), 2, '0'), '-', YEAR(DATE_SUB(NOW(), INTERVAL 2 MONTH))), '%d-%m-%Y')
      AND g2.`buy_date` < STR_TO_DATE(CONCAT('01-', LPAD(MONTH(DATE_SUB(NOW(), INTERVAL 1 MONTH)), 2, '0'), '-', YEAR(DATE_SUB(NOW(), INTERVAL 1 MONTH))), '%d-%m-%Y')
    JOIN `orders` g3
    ON g3.`id_cust`   = g1.`id_cust`
      AND g3.`id_cust`   = g2.`id_cust`
      AND g3.`buy_date` >= STR_TO_DATE(CONCAT('01-', LPAD(MONTH(DATE_SUB(NOW(), INTERVAL 1 MONTH)), 2, '0'), '-', YEAR(DATE_SUB(NOW(), INTERVAL 1 MONTH))), '%d-%m-%Y')
      AND g3.`buy_date` < STR_TO_DATE(CONCAT('01-', LPAD(MONTH(NOW()), 2, '0'), '-', YEAR(NOW())), '%d-%m-%Y')
WHERE g1.`buy_date` >= STR_TO_DATE(CONCAT('01-', LPAD(MONTH(DATE_SUB(NOW(), INTERVAL 3 MONTH)), 2, '0'), '-', YEAR(DATE_SUB(NOW(), INTERVAL 3 MONTH))), '%d-%m-%Y')
AND g1.`buy_date` < STR_TO_DATE(CONCAT('01-', LPAD(MONTH(DATE_SUB(NOW(), INTERVAL 2 MONTH)), 2, '0'), '-', YEAR(DATE_SUB(NOW(), INTERVAL 2 MONTH))), '%d-%m-%Y')
GROUP BY g1.`id_cust`

Please help me simplify my query, because it is very slow when it runs on a large amount of data. If the query is wrong, please correct it.


ecfsfe2w1#

How about this?

select c.id_cust
from (select o.id_cust, year(buy_date) as yyyy, month(buy_date) as mm,
             -- numbers each customer's distinct months 1, 2, 3, ...
             row_number() over (partition by o.id_cust) as month_counter
      from orders o
      -- '%Y-%m-01' snaps both bounds to the first day of the month
      where buy_date >= date_format(current_date - interval 3 month, '%Y-%m-01') and
            buy_date < date_format(current_date, '%Y-%m-01')
      group by id_cust, yyyy, mm
     ) c
where month_counter = 3;

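This filters down to just the three months you care about. It then aggregates by year and month and returns only the third row for each customer, which exists only if the customer bought in all three months.
Actually, this is more simply expressed as: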

select o.id_cust
from orders o
-- '%Y-%m-01' snaps both bounds to the first day of the month
where buy_date >= date_format(current_date - interval 3 month, '%Y-%m-01') and
      buy_date < date_format(current_date, '%Y-%m-01')
group by o.id_cust
having count(distinct year(buy_date), month(buy_date)) = 3;
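
As a quick sanity check of the `having` approach, here is a minimal Python sketch that runs the same idea against the sample rows from the question. It uses an in-memory SQLite database as a stand-in for MySQL (an assumption made only so the example is self-contained), so the month boundaries are computed in Python and passed as parameters instead of using `date_format`, and `substr(buy_date, 1, 7)` plays the role of `year(buy_date), month(buy_date)`. The helper names `month_start` and `customers_in_every_month` are hypothetical.

```python
import sqlite3
from datetime import date

# Sample rows from the question: (ID, id_cust, buy_date)
ROWS = [
    (1, 10, "2019-11-01"), (2, 11, "2019-11-10"), (3, 10, "2019-12-11"),
    (4, 12, "2019-12-12"), (5, 10, "2020-01-13"), (6, 11, "2020-01-14"),
    (7, 12, "2020-01-15"),
]

def month_start(today, months_back):
    """First day of the month that lies `months_back` months before `today`."""
    index = today.year * 12 + (today.month - 1) - months_back
    return date(index // 12, index % 12 + 1, 1)

def customers_in_every_month(today, months=3):
    """Customers with at least one order in each of the last `months` full months."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, id_cust INTEGER, buy_date TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", ROWS)
    lower = month_start(today, months).isoformat()  # inclusive lower bound
    upper = month_start(today, 0).isoformat()       # exclusive upper bound
    sql = """
        SELECT id_cust
        FROM orders
        WHERE buy_date >= ? AND buy_date < ?
        GROUP BY id_cust
        HAVING COUNT(DISTINCT substr(buy_date, 1, 7)) = ?
        ORDER BY id_cust
    """
    result = [row[0] for row in conn.execute(sql, (lower, upper, months))]
    conn.close()
    return result

print(customers_in_every_month(date(2020, 2, 15)))  # prints [10]
```

With "today" fixed at 2020-02-15, the range is 2019-11-01 (inclusive) to 2020-02-01 (exclusive), and only customer 10 has orders in all three months.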

mrfwxfqh2#

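I would use Gordon's second query. But if you want to get your own code working (as an exercise), you can improve the execution time by creating indexes on (buy_date, id_cust) and on (id_cust, buy_date). The first is for the WHERE clause, the second for the ON clauses.
With this schema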

CREATE TABLE orders (
  `ID` INTEGER primary key,
  `id_cust` INTEGER,
  `buy_date` VARCHAR(10)
);

the EXPLAIN output for your query is

| id  | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra                                              |
| --- | ----------- | ----- | ---------- | ---- | ------------- | --- | ------- | --- | ---- | -------- | -------------------------------------------------- |
| 1   | SIMPLE      | g1    |            | ALL  |               |     |         |     | 7    | 14.29    | Using where; Using temporary; Using filesort       |
| 1   | SIMPLE      | g2    |            | ALL  |               |     |         |     | 7    | 14.29    | Using where; Using join buffer (Block Nested Loop) |
| 1   | SIMPLE      | g3    |            | ALL  |               |     |         |     | 7    | 14.29    | Using where; Using join buffer (Block Nested Loop) |

No keys are used, and "Block Nested Loop" doesn't sound good.
After adding the indexes

ALTER TABLE orders ADD INDEX (buy_date, id_cust);
ALTER TABLE orders ADD INDEX (id_cust, buy_date);

| id  | select_type | table | partitions | type  | possible_keys    | key     | key_len | ref             | rows | filtered | Extra                    |
| --- | ----------- | ----- | ---------- | ----- | ---------------- | ------- | ------- | --------------- | ---- | -------- | ------------------------ |
| 1   | SIMPLE      | g1    |            | index | buy_date,id_cust | id_cust | 48      |                 | 7    | 14.29    | Using where; Using index |
| 1   | SIMPLE      | g2    |            | ref   | buy_date,id_cust | id_cust | 5       | test.g1.id_cust | 2    | 14.29    | Using where; Using index |
| 1   | SIMPLE      | g3    |            | ref   | buy_date,id_cust | id_cust | 5       | test.g1.id_cust | 2    | 14.29    | Using where; Using index |

This looks much better now, although it no longer uses my first index (possibly because of the GROUP BY).
Then I simplified the query to:

SELECT DISTINCT g1.id_cust
FROM orders g1 
    JOIN orders g2 ON g2.id_cust = g1.id_cust
    JOIN orders g3 ON g3.id_cust = g1.id_cust
    -- AND g3.id_cust  = g2.id_cust -- redundant condition
WHERE g1.buy_date >= DATE_FORMAT(NOW() - INTERVAL 3 MONTH, '%Y-%m-01')
  AND g1.buy_date <  DATE_FORMAT(NOW() - INTERVAL 2 MONTH, '%Y-%m-01')
  AND g2.buy_date >= DATE_FORMAT(NOW() - INTERVAL 2 MONTH, '%Y-%m-01')
  AND g2.buy_date <  DATE_FORMAT(NOW() - INTERVAL 1 MONTH, '%Y-%m-01')
  AND g3.buy_date >= DATE_FORMAT(NOW() - INTERVAL 1 MONTH, '%Y-%m-01')
  AND g3.buy_date <  DATE_FORMAT(NOW() - INTERVAL 0 MONTH, '%Y-%m-01')
-- GROUP BY g1.id_cust -- You can use DISTINCT instead
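
To make the date arithmetic concrete, the sketch below reproduces in Python what the `DATE_FORMAT(NOW() - INTERVAL n MONTH, '%Y-%m-01')` expressions evaluate to for each alias. The function names `month_start` and `month_windows` are hypothetical, and the code only mirrors the month-start logic; it does not talk to MySQL.

```python
from datetime import date

def month_start(today, months_back):
    """First day of the month that lies `months_back` months before `today`."""
    index = today.year * 12 + (today.month - 1) - months_back
    return date(index // 12, index % 12 + 1, 1)

def month_windows(today, months=3):
    """Half-open [start, end) ranges for the full months before `today`, oldest first."""
    return [
        (month_start(today, n), month_start(today, n - 1))
        for n in range(months, 0, -1)
    ]

# One window per table alias in the simplified query
for alias, (start, end) in zip(("g1", "g2", "g3"), month_windows(date(2020, 2, 15))):
    print(alias, start, end)
# g1 2019-11-01 2019-12-01
# g2 2019-12-01 2020-01-01
# g3 2020-01-01 2020-02-01
```

Each alias is restricted to one calendar month, so a customer only survives both joins if they have at least one order in every window.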
