Hive: count the male and female students who took any course

js81xvg6  posted on 2021-06-26  in  Hive

I have two tables, Student and Training, shown below.

Student
ID      name    age     sex     salary
1213    lavanya 18      Female  8000
1208    reshma  19      Female  14000
1207    bhavya  20      Female  15000
1212    Arshad  28      Male    20000
1209    kranthi 22      Male    22000
1210    Satish  24      Male    25000
1211    Krishna 25      Male    26000
1203    khaleel 34      Male    30000
1204    prasant 30      Male    31000
1206    laxmi   25      Female  35000
1205    kiran   20      Male    40000
1201    gopal   45      Male    50000
1202    manisha 40      Female  51000

Training
1       1201    csharp
2       1205    c
3       1201    c
4       1202    java
5       1205    java
6       1203    shell
7       1204    hadoop
8       1201    hadoop

Now I want to count the number of male and female students who have taken at least one course. I tried this:

hive> select s.sex, count(*) from student s join training t on s.id=t.sid group by s.sex;

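But this query returns Female 2, Male 4, whereas the expected result is Female 1, Male 2. Please note that this is only a sample, a shortened form of the data actually in use.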

vs91vp4v  #1

I tried this in MySQL and Oracle, and the query works there.

SELECT S.sex, count(*) 
FROM student s 
INNER JOIN training T on S.id = T.sid 
GROUP BY S.sex;

Result: Female = 1, Male = 2

amrnrhlw  #2

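This looks like your query, but it returns the result you expected (1 female, 2 male). If possible, post a copy/paste of your own SQL*Plus session (like my example here) so we can see exactly what you did.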

SQL> with student (id, name, sex) as
  2    (select 1, 'alex', 'm' from dual union
  3     select 2, 'rita', 'f' from dual union
  4     select 3, 'max',  'm' from dual union
  5     select 4, 'steve', 'm' from dual
  6    ),
  7  training (id, sid, course) as
  8    (select 1, 2, 'java' from dual union
  9     select 2, 3, 'c' from dual union
 10     select 3, 1, 'java' from dual
 11    )
 12  select s.sex, count(*)
 13  from student s join training t on t.sid = s.id
 14  group by s.sex;

S   COUNT(*)
- ----------
m          2
f          1
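
A quick check that may explain the difference: in the test data above every sid appears only once in training, so the join returns exactly one row per student. A sketch of a query that lists students with more than one training row, assuming the training table and sid column from the question, is:

```sql
-- List students that appear more than once in training.
-- Each such student contributes several rows to the join,
-- so count(*) after the join counts that student several times.
select t.sid, count(*) as trainings
from training t
group by t.sid
having count(*) > 1;
```

If this returns any rows on the real data, a join-based count(*) will be higher than the number of distinct students.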

sirbozc5  #3

Here is a query I wrote for this data:

SELECT 
    final.ct_sex as sex,count(*) as num
FROM 
    (SELECT tb.sex as ct_sex
     FROM newschema.mytable AS tb
     JOIN (SELECT tr.ID, GROUP_CONCAT(tr.skill) as skills
           FROM newschema.train AS tr GROUP BY tr.ID) AS tp
       ON tb.ID = tp.ID) as final
group by
    final.ct_sex
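
As far as I know, GROUP_CONCAT is a MySQL function and is not built into Hive. The idea of the query above, reducing the training table to one row per student before joining, can be sketched in Hive with a DISTINCT subquery. This assumes the student and training tables and the sid column from the question rather than newschema.mytable and newschema.train:

```sql
-- Collapse training to one row per student, then join and count.
select s.sex, count(*) as num
from student s
join (select distinct sid from training) t
  on s.id = t.sid
group by s.sex;
```

Because the subquery has at most one row per sid, each student matches at most once and count(*) becomes a count of students.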

muk1a3rh  #4

Use exists:

select s.sex, count(*)
from student s
where exists (select 1 from training t where t.sid = s.id)
group by s.sex;

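The problem with the join is that it produces one row for every training a student attended, so a student with several trainings is counted several times.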
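
If you would rather keep the join, counting distinct student ids instead of rows is another way to avoid the duplication. A minimal sketch, assuming the student and training tables from the question:

```sql
-- The join yields one row per (student, training) pair,
-- so count the distinct student ids rather than the rows.
select s.sex, count(distinct s.id) as students
from student s
join training t on s.id = t.sid
group by s.sex;
```

On the sample rows posted in the question, student 1201 appears three times in training and 1205 twice, so count(*) over the join counts those students more than once, while count(distinct s.id) counts each of them once.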

dw1jzc5e  #5

If all you want is a simple count by sex, why not use

select sex, count(*)
    from student
    group by sex
    order by sex

dvtswwa3  #6

I'm not sure why the join fails here, but the subquery below gives the correct output.

select sex, count(*) from salary where salary.id in (select sid from training) group by salary.sex;
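
Hive also has LEFT SEMI JOIN, which is its traditional way of expressing an IN/EXISTS subquery. A sketch of the same query in that form, assuming the table is called student as in the question (the answer above uses salary):

```sql
-- LEFT SEMI JOIN keeps each student row at most once,
-- no matter how many matching training rows exist.
-- Only columns from the left table (s) may be selected.
select s.sex, count(*) as students
from student s
left semi join training t on (s.id = t.sid)
group by s.sex;
```

Unlike a regular join, this never multiplies student rows, so count(*) is already a count of students.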
