Calculating unmatching rows in a partitioned table in Hive

Posted by xdnvmnnf on 2021-06-27 in Hive


I have a use case where I need to count the unmatching rows (excluding matching records) between two different partitions of a partitioned Hive table.
Suppose there is a partitioned table named test, partitioned on the column as_of_date. To get the unmatching rows, I tried two options.

1.) A single query over the two partitions:

    select count(x.item_id)
    from (select coalesce(test_new.item_id, test_old.item_id) as item_id
          from (select item_id from test where as_of_date = '2019-03-10') test_new
          full outer join (select item_id from test where as_of_date = '2019-03-09') test_old
          on test_new.item_id = test_old.item_id
          where coalesce(test_new.item_id,0) != coalesce(test_old.item_id,0)) as x;

2.) I first create a view and then query that view:

    create view test_diff as
    select coalesce(test_new.item_id, test_old.item_id) as item_id,
           coalesce(test_new.as_of_date, date_add(test_old.as_of_date, 1)) as as_of_date
    from test test_new
    full outer join test test_old
    on (test_new.item_id = test_old.item_id
        and date_sub(test_new.as_of_date, 1) = test_old.as_of_date)
    where coalesce(test_new.item_id,0) != coalesce(test_old.item_id,0);

Then I run the query:

    select count(distinct item_id) from test_diff where as_of_date = '2019-03-10';

The two approaches give different results: the second option returns a lower count. Any suggestions on why the counts differ?

mysql Hive partitioning outer-join

Source: https://stackoverflow.com/questions/55265379/calculating-unmatching-rows-in-partitioned-table-in-hive

1 Answer


ugmeyewa1#

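This assumes that in the second option you handled the test_new and test_old tables correctly (filtering with as_of_date = '2019-03-10').
The first option uses count(x.item_id) in its select clause, whereas the second option uses count(distinct item_id). The distinct may be reducing the number of items in the latter option.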
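
To illustrate the count versus count(distinct) point, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for Hive. The sample rows are made up for illustration, and the unmatched rows are collected with NOT IN plus UNION ALL rather than the full outer join from the question, because older SQLite versions do not support full outer joins. The assumption being demonstrated is that item_id can repeat within a partition.

```python
import sqlite3

# In-memory SQLite database standing in for the Hive table (assumption:
# the real table can hold duplicate item_id values within one partition).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (item_id INTEGER, as_of_date TEXT);
    INSERT INTO test VALUES
        (1, '2019-03-09'), (2, '2019-03-09'),
        (1, '2019-03-10'), (3, '2019-03-10'), (3, '2019-03-10');
""")

# Rows present in one partition but not the other. This NOT IN / UNION ALL
# form is a substitute for the question's full outer join with a mismatch
# filter; it yields the same unmatched rows for non-null item_id values.
unmatched = """
    SELECT item_id FROM test
    WHERE as_of_date = '2019-03-10'
      AND item_id NOT IN (SELECT item_id FROM test WHERE as_of_date = '2019-03-09')
    UNION ALL
    SELECT item_id FROM test
    WHERE as_of_date = '2019-03-09'
      AND item_id NOT IN (SELECT item_id FROM test WHERE as_of_date = '2019-03-10')
"""

plain_count, distinct_count = conn.execute(
    f"SELECT count(item_id), count(DISTINCT item_id) FROM ({unmatched}) AS x"
).fetchone()

print(plain_count, distinct_count)  # prints: 3 2
```

Item 3 appears twice in the 2019-03-10 partition and never in 2019-03-09, so it contributes two unmatched rows but only one distinct item_id. If the real data has such duplicates, using the same aggregate (either count or count(distinct)) in both options should make the comparison fair.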

Answered 2021-06-27
