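In SQL GROUP BY queries, we often want to group by fairly complex expressions and also return the grouped values as output columns. For example, in the query below, the EXTRACT, CASE, and OTHER_... expressions are repeated verbatim in both the GROUP BY clause and the SELECT list.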
Boilerplate solution:
SELECT
EXTRACT(MONTH FROM date) AS "month",
CASE
WHEN price < 100 THEN 'low_price'
ELSE 'high_price'
END AS price_category,
OTHER_COMPLEX_GROUPING_FUNCTION(price) AS foo,
AVG(bar) AS avg_bar,
AVG(baz) AS avg_baz,
SUM(lorem) AS total_lorem,
SUM(ipsum) AS total_ipsum
FROM
my_table
WHERE
date > '2023-01-01'
GROUP BY
EXTRACT(MONTH FROM date),
CASE WHEN price < 100 THEN 'low_price' ELSE 'high_price' END,
OTHER_COMPLEX_GROUPING_FUNCTION(price)
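Repeating every grouping expression in the returned columns makes the code more verbose and less readable. It may also cause the expressions to be evaluated twice (please confirm).
This made me think: "Wouldn't it be great if we could define these columns on the fly in GROUP BY?" For example:
My ideal solution: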
SELECT "month"
, price_category
, foo
, AVG(bar) AS avg_bar
, AVG(baz) AS avg_baz
, SUM(lorem) AS total_lorem
, SUM(ipsum) AS total_ipsum
FROM my_table
WHERE date > '2023-01-01'
GROUP BY EXTRACT(MONTH FROM date) AS "month"
, CASE WHEN price < 100 THEN 'low_price' ELSE 'high_price' END AS price_category
, OTHER_COMPLEX_GROUPING_FUNCTION(price) AS foo
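Does any SQL dialect provide a way to avoid this repetition in GROUP BY? If not, why not?
(Personally, I am most interested in a Postgres solution, but the question is general.)
One solution is to use a subquery. I suspect a subquery like the one below can technically achieve the same result, but it seems less readable and less idiomatic, especially if you want to include several other returned columns.
Existing solution: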
SELECT "month"
, price_category
, foo
, AVG(price) AS avg_price
, AVG(bar) AS avg_bar
, AVG(baz) AS avg_baz
, SUM(lorem) AS total_lorem
, SUM(ipsum) AS total_ipsum
FROM (SELECT EXTRACT(MONTH FROM date) AS "month"
, CASE WHEN price < 100 THEN 'low_price' ELSE 'high_price' END AS price_category
, OTHER_COMPLEX_GROUPING_FUNCTION(price) AS foo
, price
, bar
, baz
, lorem
, ipsum
FROM my_table
WHERE date > '2023-01-01'
) AS sub
GROUP BY "month", price_category, foo
1 answer
Use column numbers:
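As a minimal sketch of the positional form, assuming Postgres: the query below groups by the first and second entries of the SELECT list instead of repeating their expressions. The inline sample rows are hypothetical and stand in for my_table; only a few of the question's columns are included to keep it short.

```sql
-- Hypothetical sample data standing in for my_table (not from the original post).
WITH my_table(date, price, bar) AS (
    VALUES (DATE '2023-02-10',  50, 1.0),
           (DATE '2023-02-20', 150, 3.0),
           (DATE '2023-03-05',  80, 5.0)
)
SELECT EXTRACT(MONTH FROM date) AS "month",
       CASE WHEN price < 100 THEN 'low_price' ELSE 'high_price' END AS price_category,
       AVG(bar) AS avg_bar
FROM my_table
WHERE date > '2023-01-01'
GROUP BY 1, 2   -- 1 = "month", 2 = price_category (positions in the SELECT list)
ORDER BY 1, 2;
```

One trade-off to keep in mind: the numbers are tied to the order of the SELECT list, so reordering or inserting output columns changes what the query groups by unless the numbers are updated as well.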