Different stages performing the same operation in a Hive query execution plan

Posted by mbzjlibv on 2021-07-13 in Hive

Hive version: 1.1.0-cdh5.15.2. I recently started studying the Hive source code and how it works. Here is the problem I ran into.

    explain insert into testv1 select * from test_textfile where val > 200;

The above is a simple query; below is its execution plan:

    STAGE DEPENDENCIES:
      Stage-1 is a root stage
      Stage-7 depends on stages: Stage-1 , consists of Stage-4, Stage-3, Stage-5
      Stage-4
      Stage-0 depends on stages: Stage-4, Stage-3, Stage-6
      Stage-2 depends on stages: Stage-0
      Stage-3
      Stage-5
      Stage-6 depends on stages: Stage-5

    STAGE PLANS:
      Stage: Stage-1
        Map Reduce
          Map Operator Tree:
              TableScan
                alias: test_textfile
                Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
                Filter Operator
                  predicate: (val > 200) (type: boolean)
                  Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
                  Select Operator
                    expressions: UDFToString(val) (type: string)
                    outputColumnNames: _col0
                    Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
                    File Output Operator
                      compressed: true
                      Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
                      table:
                          input format: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
                          output format: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
                          serde: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
                          name: test.testv1

      Stage: Stage-7
        Conditional Operator

      Stage: Stage-4
        Move Operator
          files:
              hdfs directory: true
              destination: hdfs://xlclusterns1/tmp/hive-stagingdir/staging_hive_2021-04-14_15-14-30_205_4974356220876798617-1/-ext-10000

      Stage: Stage-0
        Move Operator
          tables:
              replace: false
              table:
                  input format: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
                  output format: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
                  serde: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
                  name: test.testv1

      Stage: Stage-2
        Stats-Aggr Operator

      Stage: Stage-3
        Map Reduce
          Map Operator Tree:
              TableScan
                File Output Operator
                  compressed: true
                  table:
                      input format: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
                      output format: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
                      serde: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
                      name: test.testv1

      Stage: Stage-5
        Map Reduce
          Map Operator Tree:
              TableScan
                File Output Operator
                  compressed: true
                  table:
                      input format: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
                      output format: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
                      serde: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
                      name: test.testv1

      Stage: Stage-6
        Move Operator
          files:
              hdfs directory: true
              destination: hdfs://xlclusterns1/tmp/hive-stagingdir/staging_hive_2021-04-14_15-14-30_205_4974356220876798617-1/-ext-10000

What I can't explain is why Stage-3 and Stage-5 do the same thing. Does anyone know why?
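To make the relationship between the two stages easier to see, here is a rough Python sketch that parses only the STAGE DEPENDENCIES section of the plan and prints what follows each stage listed under the conditional Stage-7. The helper names `parse_dependencies` and `chain` are made up for this illustration and are not part of Hive; the dependency text is copied verbatim from the plan.

```python
import re

# STAGE DEPENDENCIES section, copied from the EXPLAIN output in the question.
DEPENDENCIES = """\
Stage-1 is a root stage
Stage-7 depends on stages: Stage-1 , consists of Stage-4, Stage-3, Stage-5
Stage-4
Stage-0 depends on stages: Stage-4, Stage-3, Stage-6
Stage-2 depends on stages: Stage-0
Stage-3
Stage-5
Stage-6 depends on stages: Stage-5
"""


def parse_dependencies(text):
    """Hypothetical helper: return (parents, branches).

    parents[stage]  -> stages named after "depends on stages:"
    branches[stage] -> stages named after "consists of"
    """
    parents, branches = {}, {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        stage = line.split()[0]
        head, _, consists = line.partition("consists of")
        _, _, depends = head.partition("depends on stages:")
        parents[stage] = re.findall(r"Stage-\d+", depends)
        branches[stage] = re.findall(r"Stage-\d+", consists)
    return parents, branches


def chain(stage, children, target):
    """Hypothetical helper: follow dependents from `stage` until `target`."""
    path = [stage]
    while path[-1] != target and children.get(path[-1]):
        path.append(children[path[-1]][0])
    return path


parents, branches = parse_dependencies(DEPENDENCIES)

# Invert the "depends on" edges: children[p] lists the stages that wait for p.
children = {}
for stage, deps in parents.items():
    for dep in deps:
        children.setdefault(dep, []).append(stage)

# Print the path from each stage under the conditional to the final move.
for branch in branches["Stage-7"]:
    print(" -> ".join(chain(branch, children, "Stage-0")))
```

Read this way, Stage-4, Stage-3 and Stage-5 are all listed under the Conditional Operator in Stage-7. Stage-3 leads straight to Stage-0, while Stage-5 goes through the Move Operator in Stage-6 first. That suggests the two Map Reduce stages are alternative branches rather than two jobs that both run, although the plan text alone does not say how the branch is chosen.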
