我有下面的Hive表:
product | price
A | 100
B | 102
C | 220
D | 240
E | 242
F | 410
对于每一行,我想用当前价格除以较低的价格,如果结果大于0.9,我想增加一个行号。如果结果小于0.9,则此行的行号应为1,当前价格变为较低价格,然后迭代。
结果应如下所示:
product | price | row_number
A | 100 | 1
B | 102 | 2
C | 220 | 1
D | 240 | 2
E | 242 | 3
F | 410 | 1
因为:
lower price = 100: product A get 1 as row_number
100/102 >= 0.9: product B get 2 as row_number
100/220 < 0.9: product C get 1 as row_number, lower price = 220
220/240 >= 0.9: product D get 2 as row_number
220/242 >= 0.9: product E get 3 as row_number
220/410 < 0.9: product F get 1 as row_number, lower price = 410
我在考虑创建一个临时的行号,按价格排序:
product | price | temp_row_number
A | 100 | 1
B | 102 | 2
C | 220 | 3
D | 240 | 4
E | 242 | 5
F | 410 | 6
然后:
Select
product,
price,
case
when lag(price,temp_row_number-1,0)/price over() >= 0.9 then lag(price,temp_row_number-1,0)
else price
end as test
from my_table
这将检索:
product | price | test
A | 100 | 100
B | 102 | 100
C | 220 | 220
D | 240 | 240
E | 242 | 242
F | 410 | 410
但理想情况下我想找回
product | price | test
A | 100 | 100
B | 102 | 100
C | 220 | 220
D | 240 | 220
E | 242 | 220
F | 410 | 410
所以我可以使用row\u number()函数order by product and price来计算row\u number row,并得到预期的结果。
1条答案
按热度按时间kninwzqo1#
派生表
(选择产品,价格,(价格介于100和200之间时为案例,价格介于200和300之间时为1,价格介于300和400之间时为2,然后为3结束)作为rn
从cte order by product中选择product,price,row\ U number()over(partition by rn order by rn)