sql - Comparing counts between two Hive tables

thtygnil  posted on 2021-05-29  in  Hadoop

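I am trying to compare the row counts of two tables. The query below fails because the minus operator does not work this way in Hive. Can you suggest a simple way to compare the counts of the two tables?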

select  'Call Detail - Hive T1 to HDFS Staging - Data Compare',
case when cnt>0 then 'Fail' Else 'Pass' end
from
(select count(*) cnt from (
(select 
count(*) from students1 s1)-
(select count(*) from students2 s2)
) as tbl1
) as tbl2;

This produces an error:
FAILED: ParseException line 81:0 cannot recognize input near '(' 'select' in source

ubof19bj

#1

Take a look at the query below. It runs fine on my local system. Let me know if it helps.

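select 'Call Detail - Hive T1 to HDFS Staging - Data Compare',
       case 
       -- the counts match only when their difference is exactly zero
       when (sum(cnt1) - sum(cnt2)) = 0 
       then 'PASS' 
       else 'FAIL' 
       end as count_records
       -- one row per table: each branch fills its own count column and 0 for the other
  from (select count(*) as cnt1, 0 as cnt2 from students1 
        union all
        select 0 as cnt1, count(*) as cnt2 from students2  ) tbl;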
ztmd8pv5

#2

Use a cross join if there are no group-by columns. In that case it produces a single row containing both counts:

select s.cnt-s1.cnt diff, case when abs(s.cnt-s1.cnt) > 0 then 'Fail' Else 'Pass' end result
from
(select count(*) cnt  from students1 s1) s
cross join
(select count(*) cnt from students2 s2) s1
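
To make a failing check easier to diagnose, the same cross-join pattern can also return the two raw counts next to the verdict. This is a minimal sketch, assuming the `students1` and `students2` tables from the question. The aliases `a`, `b`, `cnt1` and `cnt2` are illustrative names, not part of the original post:

```sql
-- Sketch: one row with both counts, their difference, and a verdict.
-- Aliases a, b, cnt1, cnt2 are illustrative names.
select a.cnt1,
       b.cnt2,
       a.cnt1 - b.cnt2 as diff,
       case when a.cnt1 = b.cnt2 then 'Pass' else 'Fail' end as result
from (select count(*) as cnt1 from students1) a
cross join
     (select count(*) as cnt2 from students2) b;
```

Each subquery returns exactly one row, so the cross join also yields exactly one row.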

If you add group-by columns to compare at a finer granularity, use a FULL JOIN on the group-by columns:

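select s.col1 s_col1, s1.col1 s1_col1,
       -- coalesce: a group missing from one table counts as 0 there;
       -- without it the difference is NULL and the row would report 'Pass'
       coalesce(s.cnt, 0) - coalesce(s1.cnt, 0) diff,
       case when coalesce(s.cnt, 0) <> coalesce(s1.cnt, 0) then 'Fail' else 'Pass' end result
from
(select count(*) cnt, col1  from students1 s1 group by col1) s
full join
(select count(*) cnt, col1 from students2 s2 group by col1) s1 
on s.col1 = s1.col1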

This query returns both the joined and the unjoined rows from the two tables, with the difference calculated for each.
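
When there are many groups, it may be more practical to list only those whose counts differ. The following is a sketch under the same assumptions (the `students1` and `students2` tables and the grouping column `col1` used in the answer). The output aliases `cnt_students1` and `cnt_students2` are illustrative names:

```sql
-- Sketch: return only the groups whose counts differ between the two tables.
-- A group present in only one table is treated as a count of 0 on the other side.
select coalesce(s.col1, s1.col1) as col1,
       coalesce(s.cnt, 0)  as cnt_students1,
       coalesce(s1.cnt, 0) as cnt_students2
from (select col1, count(*) as cnt from students1 group by col1) s
full join
     (select col1, count(*) as cnt from students2 group by col1) s1
on s.col1 = s1.col1
where coalesce(s.cnt, 0) <> coalesce(s1.cnt, 0);
```

An empty result means every group has the same count in both tables. Note that groups where `col1` is NULL do not match each other under `=`, so they show up as two separate rows.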
