Where clause can't push down into the right table of JOIN clasure in Doris, howevery Impala or Mysql can do that.
For example:
SELECT persons.pid, orders.oid, persons.lastname, persons.firstname, orders.ordernum FROM persons left join orders ON persons.pid=orders.oid where orders.oid = 2;
+------------------------------------------------------------------------------------------------------------------------+
| Explain String |
+------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT F02 |
| OUTPUT EXPRS:persons
.pid
| orders
.oid
| persons
.lastname
| persons
.firstname
| orders
.ordernum
| |
| PARTITION: UNPARTITIONED |
| |
| RESULT SINK |
| |
| 4:EXCHANGE |
| tuple ids: 0 1N |
| |
| PLAN FRAGMENT F00 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 04 |
| UNPARTITIONED |
| |
| 2:HASH JOIN |
| | join op: LEFT OUTER JOIN (BROADCAST) |
| | hash predicates: |
| | colocate: false, reason: table not in same group |
| | persons
.pid
= orders
.oid
|
| | other predicates: orders
.oid
= 2 |
| | tuple ids: 0 1N |
| | |
| |----3:EXCHANGE |
| | tuple ids: 1 |
| | |
| 0:OlapScanNode |
| TABLE: persons |
| PREAGGREGATION: OFF. Reason: No AggregateInfo |
| partitions=1/1 |
| rollup: persons |
| buckets=32/32 |
| cardinality=2 |
| avgRowSize=1172.0 |
| numNodes=1 |
| tuple ids: 0 |
| |
| PLAN FRAGMENT F01 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 03 |
| UNPARTITIONED |
| |
| 1:OlapScanNode |
| TABLE: orders |
| PREAGGREGATION: OFF. Reason: null |
| partitions=1/1 |
| rollup: orders |
| buckets=32/32 |
| cardinality=4 |
| avgRowSize=572.75 |
| numNodes=1 |
| tuple ids: 1 |
+------------------------------------------------------------------------------------------------------------------------+
3条答案
按热度按时间ttp71kqs1#
The where predicate could not be pushed down when the predicate is on the outer join table.
For example:
select * from a left join b on a.id=b.id where b.id=1;
In some cases, the predicate could be pushed down.
In impala, for example:
Query: explain select * from customer_address a left join customer_address b on a.ca_address_sk=b.ca_address_sk where b.ca_address_sk=1
+------------------------------------------------------------------------------------+
| Explain String |
+------------------------------------------------------------------------------------+
| Estimated Per-Host Requirements: Memory=0B VCores=2 |
| WARNING: The following tables are missing relevant table and/or column statistics. |
| tpcds_hive.customer_address |
| |
| PLAN-ROOT SINK |
| | |
| 04:EXCHANGE [UNPARTITIONED] |
| | |
| 02:HASH JOIN [LEFT OUTER JOIN, BROADCAST] |
| | hash predicates: a.ca_address_sk = b.ca_address_sk |
| | other predicates: b.ca_address_sk = 1 |
| | |
| |--03:EXCHANGE [BROADCAST] |
| | | |
| | 01:SCAN HDFS [tpcds_hive.customer_address b] |
| | partitions=1/1 files=0 size=0B |
| | predicates: b.ca_address_sk = 1 |
| | |
| 00:SCAN HDFS [tpcds_hive.customer_address a] |
| partitions=1/1 files=0 size=0B |
+------------------------------------------------------------------------------------+
nnt7mjpx2#
This https://www.ibm.com/developerworks/data/library/techarticle/purcell/0201purcell.html provides a theoretical support for a WHERE clause predicate applied to the NULL-supplying table.
epfja78i3#
In Hive, this behavior is just expected: During Join predicates (ON) cannot be pushed past Preserved Row tables.
After Join predicates (WHERE) cannot be pushed past Null Supplying tables.
https://cwiki.apache.org/confluence/display/Hive/OuterJoinBehavior#OuterJoinBehavior-PredicatePushdownRules