SQL ranking: selecting the top rows within the top rows

bvpmtnay  posted on 2021-06-28  in Hive

Apologies for the poor title; I couldn't come up with a better one.
I have the following table:

    Customer_ID  Item_ID    sale_ID   sale_TS
    103293       I-0394039  S-430943  20161101

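I need to find the top 100 customers by number of sales and, for each of them, the top 100 items they purchased within a given time period. Here is what I have so far: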

    select vs.Customer_ID, vs.Item_ID, count(*) count2
    from sales.sales_import si1
    join
    (
      select Customer_ID, count(*) s_count2 from sales.sales_import where
      sale_TS between '2016-01-01' and '2016-01-31' group by Customer_ID order by sale_TS desc limit 100
    )
    si2
    on si1.Customer_ID = si2.Customer_ID
    where
    si1.sale_TS between '2016-01-01' and '2016-01-31'
    group by vs.Customer_ID, vs.Item_ID
    order by vs.Customer_ID, count2 desc limit 100

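Questions:
1. I am essentially joining the table to itself. Is there a better way?
2. How do I restrict the query to return only the top 100 items for each Customer_ID? The outer LIMIT here caps the total number of rows, not the first X rows per Customer_ID.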

ou6hu8tu #1

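Try using the row_number() function. You will need to build two derived tables (subqueries used in the FROM clause): one for the customers and one for their items. Inner join the two subqueries so that you only get items for the customers returned by the first derived table.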

    select * from
    -- get your top 100 customers, ranked by number of sales in the period
    (select * from
      (select Customer_ID, row_number() over (order by sales_count desc) as rnk
       from (select Customer_ID, count(*) as sales_count
             from sales_import
             where sale_TS between '2016-01-01' and '2016-01-31'
             group by Customer_ID) c
      ) r
     where rnk <= 100) custs
    -- now build out a derived table that picks out the top 100 items they purchased using the same method
    inner join
    (select blah blah blah from (select blah blah blah) i) items
    -- now inner join your 2 derived tables
    on custs.Customer_ID = items.Customer_ID
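
For illustration, here is one way the two derived tables might fit together. This is only a sketch under several assumptions: the table and column names are taken from the question, "top" is read as "most sale rows in the period", sale_TS is assumed to be stored in a format that compares correctly against the date literals (the sample row shows 20161101, so the literals may need adjusting), and the aliases ic, custs, ranked, item_sales, cust_sales and item_rank are made up for this example. The counts are computed in inner subqueries first so that row_number() only has to order by a plain column.

```sql
-- Sketch: top 100 items (by sale count) for each of the top 100 customers.
-- All aliases below are illustrative, not from the original post.
SELECT ranked.Customer_ID,
       ranked.Item_ID,
       ranked.item_sales,
       ranked.item_rank
FROM (
    -- Number each customer's items from most to least sold
    SELECT ic.Customer_ID,
           ic.Item_ID,
           ic.item_sales,
           ROW_NUMBER() OVER (PARTITION BY ic.Customer_ID
                              ORDER BY ic.item_sales DESC) AS item_rank
    FROM (
        -- Sales per customer and item within the period
        SELECT Customer_ID, Item_ID, COUNT(*) AS item_sales
        FROM sales.sales_import
        WHERE sale_TS BETWEEN '2016-01-01' AND '2016-01-31'
        GROUP BY Customer_ID, Item_ID
    ) ic
    JOIN (
        -- Top 100 customers by total sales within the period
        SELECT Customer_ID, COUNT(*) AS cust_sales
        FROM sales.sales_import
        WHERE sale_TS BETWEEN '2016-01-01' AND '2016-01-31'
        GROUP BY Customer_ID
        ORDER BY cust_sales DESC
        LIMIT 100
    ) custs
      ON ic.Customer_ID = custs.Customer_ID
) ranked
-- Per-customer limit: keep only the first 100 items of each partition
WHERE ranked.item_rank <= 100
ORDER BY ranked.Customer_ID, ranked.item_rank;
```

The PARTITION BY Customer_ID clause is what turns the limit into a per-customer limit: numbering restarts at 1 for every customer, so filtering on item_rank <= 100 keeps up to 100 rows per customer rather than 100 rows overall. row_number() breaks ties arbitrarily; if tied items should all be kept, rank() or dense_rank() could be used instead.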
