hive内部连接错误的结果

0g0grzrc  于 2021-06-02  发布在  Hadoop
关注(0)|答案(1)|浏览(288)

两张表表1和表2

hive> select * from  table1 where  dt=20171020;
OK
a   1   1   p   10  20171020
b   2   2   q   10  20171020
c   3   3   r   10  20171020
d   4   4   r   10  20171020

hive> select * from  table2 where  dt=20171020;
OK
a   1   1   p   10  20171020
b   2   2   t   10  20171020
c   3   3   r   10  20171020

hive> select * from table1 t1
    > join table2 t2
    > on t1.c1=t2.c1
    > where
    > t1.dt=20171020 and t2.dt=20171020 and  
    > t1.c2 <> t2.c2 or t1.c3 <> t2.c3 or t1.c4 <> t2.c4  or t1.c5 <> t2.c5;

Result:
a   1   1   p   20  20171016    a   1   1   p   10  20171015
a   1   1   p   20  20171016    a   1   1   p   10  20171020
b   2   2   q   20  20171016    b   2   2   t   10  20171015
b   2   2   q   20  20171016    b   2   2   t   10  20171020
c   3   3   r   20  20171016    c   3   3   r   10  20171015
c   3   3   r   20  20171016    c   3   3   r   10  20171020
b   2   2   q   10  20171020    b   2   2   t   10  20171015
b   2   2   q   10  20171020    b   2   2   t   10  20171020
a   19  19  p   20  20171019    a   1   1   p   10  20171015
a   19  19  p   20  20171019    a   1   1   p   10  20171020

我想跟在后面一排,因为这一排变了,怎么变的 hive 加入上述代码?

b  2   2   q   10  20171020
mgdq6dx1

mgdq6dx11#

试试这个。你的加入也应该是约会。

SELECT *
FROM   table1 t1
       JOIN table2 t2
         ON t1.c1 = t2.c1
            AND t1.dt = t2.dt
WHERE  t1.dt = 20171020
       AND ( t1.c2 <> t2.c2
              OR t1.c3 <> t2.c3
              OR t1.c4 <> t2.c4
              OR t1.c5 <> t2.c5 );

相关问题