A question about the Spark and Hive SQL query optimizers

ohtdti5x  posted on 2021-06-26  in Hive

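I'm a bit confused about Spark's Catalyst SQL optimizer (it would also be useful if someone could shed some light on Hive's query optimizer). Below is a query containing two subqueries, q1 and q2. If you look closely, the two subqueries are identical except for the value of the is_true predicate. My question is whether the Spark or Hive query optimizer can recognize this redundancy/similarity and optimize the query so that the shuffle is performed only once.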

select q1.count1, q2.count2 from
(select count(q_id) as count1 from 
(select u.tbl, q_id, max(m.is_true) as is_true from
(select tbl, schema, q_id from umap where a_id=1234) u 
join 
(select distinct schema, table_name, is_true from metadata where id=1234) m 
on u.schema = m.schema and u.tbl = m.table_name 
group by tbl,q_id) p where p.is_true=1) q1,

(select count(q_id) as count2 from 
(select u.tbl, q_id, max(m.is_true) as is_true from
(select tbl, schema, q_id from umap where a_id=1234) u 
join 
(select distinct schema, table_name, is_true from metadata where id=1234) m 
on u.schema = m.schema and u.tbl = m.table_name 
group by tbl,q_id) p where p.is_true=0) q2
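
Whether or not either optimizer detects the duplication, the shared part of q1 and q2 can be written just once by hand, with both counts computed through conditional aggregation. The sketch below is only an illustration: it runs the rewritten query on SQLite through Python's sqlite3 module rather than on Spark or Hive, the table definitions and rows are made up, and `schema` is quoted defensively. It says nothing about how Catalyst or Hive would actually plan either form.

```python
import sqlite3

# In-memory database with hypothetical table layouts and sample rows;
# the real umap/metadata tables are not shown in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE umap (tbl TEXT, "schema" TEXT, q_id INTEGER, a_id INTEGER);
CREATE TABLE metadata ("schema" TEXT, table_name TEXT, is_true INTEGER, id INTEGER);
INSERT INTO umap VALUES
  ('t1', 's1', 1, 1234), ('t1', 's1', 2, 1234),
  ('t2', 's1', 3, 1234), ('t3', 's1', 4, 9999);
INSERT INTO metadata VALUES
  ('s1', 't1', 1, 1234), ('s1', 't1', 0, 1234),
  ('s1', 't2', 0, 1234), ('s1', 't3', 1, 9999);
""")

# The join + group by appears once; count(case ...) ignores NULLs, so each
# expression counts q_id only for the rows matching its is_true value,
# mirroring count(q_id) ... where p.is_true=1 / p.is_true=0 in the original.
SINGLE_PASS = """
select count(case when p.is_true = 1 then q_id end) as count1,
       count(case when p.is_true = 0 then q_id end) as count2
from
(select u.tbl, q_id, max(m.is_true) as is_true from
 (select tbl, "schema", q_id from umap where a_id=1234) u
 join
 (select distinct "schema", table_name, is_true from metadata where id=1234) m
 on u."schema" = m."schema" and u.tbl = m.table_name
 group by tbl, q_id) p
"""

count1, count2 = conn.execute(SINGLE_PASS).fetchone()
print(count1, count2)
```

On Spark or Hive, comparing the `EXPLAIN` output of the original query and of this form would show whether the join and aggregation are planned once or twice.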

Thanks
