How to determine the total number of jobs needed to execute a Hive query

kb5ga3dv  posted on 2021-05-27  in  Hadoop

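Is there a way to determine the total number of jobs needed to execute a query?
For example, the two queries below have the same number of joins and subqueries, yet one of them needs 2 jobs while the other needs 3.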

select t1.item_dim_key hive, t2.item_dim_key as monet 
   from ext_dist_it_dim_key t1 
        left outer join (select distinct item_dim_key from PO_ITEM_DIM) t2 on t1.item_dim_key=t2.item_dim_key 
   where t2.item_dim_key is null;

WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = 20190208020329_258ee4c0-5819-4842-b479-d549c82a0779

**Total jobs = 3**

hive> select t1.item_dim_key hive, t2.item_dim_key as monet 
       from (select distinct item_dim_key from PO_ITEM_DIM) t1 
            left outer join ext_dist_it_dim_key t2 on t1.item_dim_key=t2.item_dim_key 
      where t2.item_dim_key is null;

WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = 20190208020624_9ea3dc20-ffc8-4461-9516-7a4770d1dd6b

**Total jobs = 2**

Is it possible to know in advance how many jobs a query will need? Which parameters determine the job count?
Thanks

ngynwnxp1#

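Use EXPLAIN; it shows the query execution plan. Only the plan can answer this question with certainty. Depending on statistics or table (file) sizes, the optimizer may convert some joins into map joins, among other rewrites.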
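
As a minimal sketch of that advice, the first query from the question can be prefixed with EXPLAIN. The STAGE DEPENDENCIES section of the output lists the stages Hive plans to run, and on the MapReduce engine the stages marked as Map Reduce correspond roughly to the jobs reported in "Total jobs". The SET statement is only an experiment added here, not part of the original question: it assumes that automatic map-join conversion is what changes the plan, which may or may not be the case for these tables.

```sql
-- Show the plan without running the query; inspect STAGE DEPENDENCIES
-- and count the Map Reduce stages.
EXPLAIN
select t1.item_dim_key hive, t2.item_dim_key as monet
   from ext_dist_it_dim_key t1
        left outer join (select distinct item_dim_key from PO_ITEM_DIM) t2 on t1.item_dim_key=t2.item_dim_key
   where t2.item_dim_key is null;

-- Assumption for experimentation: turn off automatic map-join conversion,
-- then run EXPLAIN again to see whether the stages change.
SET hive.auto.convert.join=false;

EXPLAIN
select t1.item_dim_key hive, t2.item_dim_key as monet
   from ext_dist_it_dim_key t1
        left outer join (select distinct item_dim_key from PO_ITEM_DIM) t2 on t1.item_dim_key=t2.item_dim_key
   where t2.item_dim_key is null;
```

Running the same pair of EXPLAIN statements on the second query and comparing the stage lists is one way to see where the two plans diverge.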
