Query regarding the Spark and Hive SQL optimizers

Posted by ohtdti5x on 2021-06-26 in Hive


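I'm a bit confused about the Spark Catalyst SQL optimizer (it would also be useful if someone could shed light on Hive's query optimizer). Below is a query containing two subqueries, q1 and q2. If you look closely, the two subqueries are identical except for the value of the predicate on is_true. My question is whether the Spark or Hive query optimizer can recognize this redundancy/similarity and optimize the query so that the shuffle is performed only once.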

select q1.count1, q2.count2 from
(select count(q_id) as count1 from 
(select u.tbl, q_id, max(m.is_true) as is_true from
(select tbl, schema, q_id from umap where a_id=1234) u 
join 
(select distinct schema, table_name, is_true from metadata where id=1234) m 
on u.schema = m.schema and u.tbl = m.table_name 
group by tbl,q_id) p where p.is_true=1) q1,

(select count(q_id) as count2 from 
(select u.tbl, q_id, max(m.is_true) as is_true from
(select tbl, schema, q_id from umap where a_id=1234) u 
join 
(select distinct schema, table_name, is_true from metadata where id=1234) m 
on u.schema = m.schema and u.tbl = m.table_name 
group by tbl,q_id) p where p.is_true=0) q2

Thanks
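
Independently of what either optimizer does (which can be checked by running EXPLAIN on the query in each engine), the redundancy can be removed by hand: compute the shared grouped subquery once and derive both counts with conditional aggregation. The following is a minimal sketch of that rewrite, not a statement about Catalyst or Hive behavior. It uses Python's built-in sqlite3 module purely as a stand-in engine to check that the two forms return the same result; the table definitions, the sample rows, and the names `P`, `ORIGINAL`, `SINGLE_PASS`, and `run` are assumptions made for illustration.

```python
import sqlite3

# Shared subquery from the question: one row per (tbl, q_id) with max(is_true).
P = """
select u.tbl, q_id, max(m.is_true) as is_true from
(select tbl, schema, q_id from umap where a_id=1234) u
join
(select distinct schema, table_name, is_true from metadata where id=1234) m
on u.schema = m.schema and u.tbl = m.table_name
group by tbl, q_id
"""

# The query as posted: the shared subquery appears twice.
ORIGINAL = f"""
select q1.count1, q2.count2 from
(select count(q_id) as count1 from ({P}) p where p.is_true=1) q1,
(select count(q_id) as count2 from ({P}) p where p.is_true=0) q2
"""

# Hand-written rewrite: the shared subquery appears once, and each count
# only sees the rows matching its own is_true value (CASE yields NULL
# otherwise, and count() skips NULLs).
SINGLE_PASS = f"""
select
  count(case when p.is_true=1 then q_id end) as count1,
  count(case when p.is_true=0 then q_id end) as count2
from ({P}) p
"""

# Hypothetical schema and sample data, assumed only for this check.
SETUP = """
create table umap (tbl text, schema text, q_id integer, a_id integer);
create table metadata (schema text, table_name text, is_true integer, id integer);
insert into umap values
  ('t1', 's1', 1, 1234), ('t1', 's1', 2, 1234),
  ('t2', 's1', 3, 1234), ('t3', 's1', 4, 9999);
insert into metadata values
  ('s1', 't1', 1, 1234), ('s1', 't1', 0, 1234),
  ('s1', 't2', 0, 1234), ('s1', 't3', 1, 1234);
"""

def run(sql):
    """Run one query against a fresh in-memory database and return its single row."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        return conn.execute(sql).fetchone()
    finally:
        conn.close()

if __name__ == "__main__":
    print(run(ORIGINAL))
    print(run(SINGLE_PASS))
```

The SQL in `SINGLE_PASS` uses only standard constructs (CASE inside count), so the same statement should be usable in Spark SQL and HiveQL, although that has not been verified here against either engine.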

Hive apache-spark apache-spark-sql hiveql

Source: https://stackoverflow.com/questions/42900852/query-regarding-spark-and-hive-sql-optimizer
