hivesql内部连接-如何在同一查询中获取sum和row\u num

mfuanj7w  于 2021-06-27  发布在  Hive
关注(0)|答案(2)|浏览(395)

我有两个表product和psales,表中的数据如下所示

  1. select * from psales;
  2. +-------------+---------------+--+
  3. | psales.pid | psales.sales |
  4. +-------------+---------------+--+
  5. | 1 | 100 |
  6. | 1 | 150 |
  7. | 1 | 200 |
  8. | 2 | 75 |
  9. | 2 | 45 |
  10. | 2 | 145 |
  11. | 3 | 176 |
  12. | 3 | 99 |
  13. | 1 | 27 |
  14. | 4 | 51 |
  15. +-------------+---------------+--+
  16. select * from product;
  17. +--------------+----------------+--+
  18. | product.pid | product.pname |
  19. +--------------+----------------+--+
  20. | 1 | p1 |
  21. | 2 | p2 |
  22. | 3 | p3 |
  23. | 4 | p4 |
  24. +--------------+----------------+--+

目标是获得综合销售额第二高的产品。
这是我目前用来获得最高综合销售额的产品的查询(运行良好)

  1. select p1.pname, p1.total_sales
  2. from (select p.pid as pid, p.pname as pname, s.sales as sales,
  3. sum(s.sales) over (partition by p.pid order by p.pid) as total_sales
  4. from product p
  5. inner join psales s on (p.pid = s.pid)
  6. order by total_sales desc) p1
  7. limit 1;

如何获得综合销售额第二高的产品?
当我尝试在内部查询中获取行\ num时,它给出如下错误:

  1. select p1.pname as pname, p1.total_sales as total_sales, row_num() over (partition by pname order by pname) as rownum
  2. from (select p.pid as pid, p.pname as pname, s.sales as sales,
  3. sum(s.sales) over (partition by p.pid order by p.pid) as total_sales,
  4. row_num() over (partition by p.pid) as rownum
  5. from product p
  6. inner join psales s on (p.pid = s.pid)
  7. order by total_sales desc) p1
  8. where rownum =2;

错误:编译语句时出错:失败:semanticexception无法将窗口调用拆分为组。至少有一个组只能依赖于输入列。还要检查循环依赖关系。基本错误:函数行数无效(状态=42000,代码=40000)
提前谢谢你的帮助。

3pvhb19x

3pvhb19x1#

你可以用 dense_rank 要对每个pid的和进行排序,请命名组合。

  1. select p1.pname,p1.pid,p1.total_sales
  2. from (select p.pid, p.pname,sum(s.sales) as total_sales,
  3. dense_rank() over(order by sum(s.sales) desc) as rnk
  4. from product p
  5. join psales s on p.pid = s.pid
  6. group by p.pid,p.pname
  7. ) p1
  8. where rnk=2
t8e9dugd

t8e9dugd2#

使用 row_number() 函数。看来你不需要分析 sum() ,简单 group by 将做:

  1. select p1.pname, p1.pid, p1.total_sales
  2. from
  3. (
  4. select p1.pname, p1.pid, p1.total_sales,
  5. row_number() over (order by total_sales desc) rn
  6. from
  7. (select p.pid, p.pname, sum(s.sales) as total_sales
  8. from product p
  9. inner join psales s on p.pid = s.pid
  10. group by p.pid, p.pname
  11. )p1
  12. )s
  13. where rn=2
  14. ;

使用 dense_rank() 而不是 row_number() 如果要选择具有相同销售额的所有产品。

展开查看全部

相关问题