Optimizing a SQL query that uses multiple SUMs?

zazmityj  posted on 2021-06-24  in  Hive

I have a table named products:

+----------+-----------+----------+---------+
|family_id |shopper_id |product_id|quantity |
+----------+-----------+----------+---------+
|A         |1          |Kit Kat   |10       |
|A         |1          |Kit Kat   |5        |
|A         |1          |Snickers  |9        |
|A         |2          |Kit Kat   |7        |
|B         |3          |Kit Kat   |2        |
+----------+-----------+----------+---------+

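For each product, I want to compute two totals:
the total quantity per shopper
the total quantity per family, i.e. the sum of the total quantities of all shoppers in the same family
The final table should look like this: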

+----------+----------+-------------------------+-----------------------+
|shopper_id|product_id|total_quantity_shopper   |total_quantity_family  |
+----------+----------+-------------------------+-----------------------+
|1         |Kit Kat   | 15                      | 22                    |
|1         |Snickers  | 9                       | 9                     |
|2         |Kit Kat   | 7                       | 22                    |
|3         |Kit Kat   | 2                       | 2                     |
+----------+----------+-------------------------+-----------------------+

My query is:

SELECT
    distinct shopper_id,
    product_id,
    sum(quantity) OVER (PARTITION BY shopper_id, product_id) as total_quantity_shopper,
    sum(quantity) OVER (PARTITION BY family_id, product_id) as total_quantity_family
FROM
    products;

But looking at the query plan, it seems very inefficient (I think). How can I improve the query above?
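For anyone who wants to reproduce the result locally, here is a minimal sketch that loads the sample rows into an in-memory SQLite database and runs the query above. SQLite is only a stand-in for Hive here (an assumption for illustration; it needs SQLite 3.25 or later for window functions), the helper name run_original is hypothetical, and the ORDER BY was added purely to make the output order deterministic. It checks the result, not the performance.

```python
import sqlite3

# Sample rows from the question: (family_id, shopper_id, product_id, quantity).
ROWS = [
    ("A", 1, "Kit Kat", 10),
    ("A", 1, "Kit Kat", 5),
    ("A", 1, "Snickers", 9),
    ("A", 2, "Kit Kat", 7),
    ("B", 3, "Kit Kat", 2),
]

# The query from the question; ORDER BY is an addition for stable output.
ORIGINAL_QUERY = """
SELECT DISTINCT
    shopper_id,
    product_id,
    sum(quantity) OVER (PARTITION BY shopper_id, product_id) AS total_quantity_shopper,
    sum(quantity) OVER (PARTITION BY family_id, product_id) AS total_quantity_family
FROM products
ORDER BY shopper_id, product_id
"""


def run_original():
    """Load the sample data into SQLite (stand-in for Hive) and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products ("
        "family_id TEXT, shopper_id INTEGER, product_id TEXT, quantity INTEGER)"
    )
    conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?)", ROWS)
    rows = conn.execute(ORIGINAL_QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_original():
        print(row)
```

Both window sums are evaluated over every input row, and DISTINCT then collapses the duplicates, so the query returns the four rows of the expected table.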

5w9g7ksd  #1

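I assume that families are a hierarchy above shoppers, i.e. each shopper belongs to exactly one family. So I would suggest combining GROUP BY with a window function: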

select family_id, shopper_id, product_id,
       sum(quantity) as total_quantity_shopper,
       sum(sum(quantity)) over (partition by family_id, product_id) as total_quantity_family
from products
group by family_id, shopper_id, product_id
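The nested sum(sum(quantity)) can be read as two steps: GROUP BY first produces one row per (family_id, shopper_id, product_id) with the shopper total, and the window function then adds those shopper totals up per (family_id, product_id). Here is a rough sketch that spells the two steps out with a subquery and checks that both forms agree on the sample data. It runs against in-memory SQLite as a stand-in for Hive (an assumption; SQLite 3.25 or later is needed), the names run and per_shopper are hypothetical, and the ORDER BY is only there for stable output.

```python
import sqlite3

# Sample rows from the question: (family_id, shopper_id, product_id, quantity).
ROWS = [
    ("A", 1, "Kit Kat", 10),
    ("A", 1, "Kit Kat", 5),
    ("A", 1, "Snickers", 9),
    ("A", 2, "Kit Kat", 7),
    ("B", 3, "Kit Kat", 2),
]

# The query from the answer: a window sum over the grouped shopper sums.
NESTED_QUERY = """
SELECT family_id, shopper_id, product_id,
       sum(quantity) AS total_quantity_shopper,
       sum(sum(quantity)) OVER (PARTITION BY family_id, product_id) AS total_quantity_family
FROM products
GROUP BY family_id, shopper_id, product_id
ORDER BY shopper_id, product_id
"""

# Same logic with the aggregation step made explicit in a subquery.
TWO_STEP_QUERY = """
SELECT family_id, shopper_id, product_id,
       total_quantity_shopper,
       sum(total_quantity_shopper) OVER (PARTITION BY family_id, product_id) AS total_quantity_family
FROM (
    SELECT family_id, shopper_id, product_id,
           sum(quantity) AS total_quantity_shopper
    FROM products
    GROUP BY family_id, shopper_id, product_id
) AS per_shopper
ORDER BY shopper_id, product_id
"""


def run(query):
    """Run a query against a fresh in-memory copy of the sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products ("
        "family_id TEXT, shopper_id INTEGER, product_id TEXT, quantity INTEGER)"
    )
    conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?)", ROWS)
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run(NESTED_QUERY):
        print(row)
    print(run(NESTED_QUERY) == run(TWO_STEP_QUERY))
```

Compared with the query in the question, only one window function remains and it runs over the already aggregated rows, so DISTINCT is no longer needed. The output carries family_id as an extra column; drop it from the select list if only the four columns from the question are wanted.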
