Create a matrix with counts - Hive SQL

sqyvllje posted on 2021-05-27 in Hadoop

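Is there a way to do this in Hive? I need to count the number of users in each segment.
I have a table:
user1, categoryA
user1, categoryB
user2, categoryC
The desired output is: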

tzxcd3kk1#

             categoryA   categoryB   categoryC
categoryA        1           1           0
categoryB        1           1           0
categoryC        0           0           1

t5fffqht2#

For a static set of categories, this is possible:

with your_data as (
  select stack(6,
    'user1', 'categoryA',
    'user1', 'categoryB',
    'user2', 'categoryC',
    'user2', 'categoryC',
    'user3', 'categoryA',
    'user4', 'categoryA'
  ) as (`user`, category)
)

select
  category,
  sum(catA) as CategoryA,
  sum(catB) as CategoryB,
  sum(catC) as CategoryC
from
(
  -- group by collapses duplicates, so each user is counted once per category;
  -- each flag is 1 if that user has the given category in any row
  select `user`, category,
         max(case when category = 'categoryA' then 1 else 0 end) over (partition by `user`) as catA,
         max(case when category = 'categoryB' then 1 else 0 end) over (partition by `user`) as catB,
         max(case when category = 'categoryC' then 1 else 0 end) over (partition by `user`) as catC
  from your_data
  group by `user`, category
) s
group by category
order by category

Result:

category    categorya   categoryb   categoryc
categoryA      3           1           0
categoryB      1           1           0
categoryC      0           0           1
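
If the category set is not known in advance, one alternative is a self-join on the user. It returns the same counts in long form (one row per non-empty cell) rather than one column per category. The following is only a minimal sketch and not the method used in the answer above. It runs the SQL through Python's sqlite3 module as a stand-in for Hive so it can be tried locally. The column name user_id and the helper cooccurrence are hypothetical names chosen for this example. The SQL is plain enough that it should carry over to Hive, but that has not been tested here.

```python
import sqlite3

# Same sample rows as in the answer above (user2/categoryC is duplicated on purpose).
ROWS = [
    ("user1", "categoryA"),
    ("user1", "categoryB"),
    ("user2", "categoryC"),
    ("user2", "categoryC"),
    ("user3", "categoryA"),
    ("user4", "categoryA"),
]

# Assumption: the column is called user_id here to avoid quoting `user`.
# DISTINCT removes duplicate (user, category) rows first; the self-join then
# pairs every category of a user with every category of the same user.
QUERY = """
with pairs as (
    select distinct user_id, category from your_data
)
select a.category as row_category,
       b.category as col_category,
       count(*)   as users
from pairs a
join pairs b on a.user_id = b.user_id
group by a.category, b.category
order by a.category, b.category
"""


def cooccurrence(rows):
    """Return [(row_category, col_category, users), ...] for non-zero cells."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table your_data (user_id text, category text)")
        conn.executemany("insert into your_data values (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in cooccurrence(ROWS):
        print(row)
```

On the sample data this gives (categoryA, categoryA, 3), (categoryA, categoryB, 1), (categoryB, categoryA, 1), (categoryB, categoryB, 1) and (categoryC, categoryC, 1), which matches the non-zero cells of the result table. Cells with zero users are absent from the output and have to be filled in by whatever pivots the rows into a matrix.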
