group by where category具有最大值

btxsgosb  于 2021-05-27  发布在  Hadoop
关注(0)|答案(1)|浏览(431)

这是我的 t

+---------------+------------------------------+---------------+--+                                                                                          
    | trading_year  |            sector            | total_volume  |                                                                                             
    +---------------+------------------------------+---------------+--+                                                                                          
    | 2010          | Consumer Discretionary       | 119621915300  |                                                                                             
    | 2012          | Consumer Discretionary       | 93805031400   |                                                                                             
    | 2014          | Consumer Discretionary       | 80018436200   |                                                                                             
    | 2016          | Consumer Discretionary       | 81834656900   |                                                                                             
    | 2010          | Consumer Staples             | 50710518800   |                                                                                             
    | 2012          | Consumer Staples             | 39340784200   |                                                                                             
    | 2014          | Consumer Staples             | 33188087300   |                                                                                             
    | 2016          | Consumer Staples             | 37538472900   |

我需要根据按扇区分组的总卷的最大值和最小值来标记上述数据
所以我的输出应该是这样的

+---------------+------------------------------+---------------+----------------+                                                                                          
    | trading_year  |            sector            | total_volume  |     value                                                                                        
    +---------------+------------------------------+---------------+--------------+                                                                                          
    | 2010          | Consumer Discretionary       | 119621915300  |   maxvalue                                                                                            
    | 2012          | Consumer Discretionary       | 93805031400   |   stable value                                                                                         
    | 2014          | Consumer Discretionary       | 80018436200   |   stable value                                                                                          
    | 2016          | Consumer Discretionary       | 81834656900   |   minvalue                                                                                          
    | 2010          | Consumer Staples             | 50710518800   |   maxvalue                                                                                          
    | 2012          | Consumer Staples             | 39340784200   |   stable value                                                                                          
    | 2014          | Consumer Staples             | 33188087300   |   minvalue                                                                                          
    | 2016          | Consumer Staples             | 37538472900   |   stable value

这是我的问题,我试过,但没有运气

select *,
case when total_volume = max(total_volume) then 'maxvalue'
     when total_volume = min(total_volume) then 'minvalue'
     else 'stable value'
end value_type
from t
group by sector,trading_year

以上查询的输出

+---------------+------------------------------+---------------+----------------+                                                                                          
    | trading_year  |            sector            | total_volume  |     value                                                                                        
    +---------------+------------------------------+---------------+--------------+                                                                                          
    | 2010          | Consumer Discretionary       | 119621915300  |   maxvalue                                                                                            
    | 2012          | Consumer Discretionary       | 93805031400   |   maxvalue                                                                                         
    | 2014          | Consumer Discretionary       | 80018436200   |   maxvalue                                                                                          
    | 2016          | Consumer Discretionary       | 81834656900   |   maxvalue                                                                                       
    | 2010          | Consumer Staples             | 50710518800   |   maxvalue                                                                                          
    | 2012          | Consumer Staples             | 39340784200   |   maxvalue                                                                                         
    | 2014          | Consumer Staples             | 33188087300   |   maxvalue                                                                                         
    | 2016          | Consumer Staples             | 37538472900   |   maxvalue
vqlkdk9b

vqlkdk9b1#

你的sql在这里不起作用。不要使用 group by 因为你想保留所有的行。我们需要的是窗口和分析。

select 
    t.*, 
    case 
        when total_volume = last_value(total_volume)  over w then 'maxvalue'
        when total_volume = first_value(total_volume) over w then 'minvalue'
        else 'stable value'
    end as value 
from t
window w as (partition by sector order by total_volume 
    rows between unbounded preceding and unbounded following)

祝你今天愉快:)
更新:另一种(更简单的)方式:

select 
    t.*, 
    case 
        when total_volume = max(total_volume) over w then 'maxvalue'
        when total_volume = min(total_volume) over w then 'minvalue'
        else 'stable value'
    end as value 
from t
window w as (partition by sector)

注意,没有 order by ,默认窗口规范为 rows between unbounded proceeding and unbounded following . 鉴于 order by ,窗口规格默认为 rows between unbounded proceeding and current row .

相关问题