相当于hiveql!>在sql中

v64noz0r  于 2021-06-28  发布在  Hive
关注(0)|答案(1)|浏览(464)

我目前正在尝试从另一个表中不存在的表中提取这些值。但是,由于连接值包含空值,所以notin、notexists和left join选项似乎不起作用。
因此,是否有方法在hiveql中应用“不大于”条件?
作为参考,这是我运行的查询,与not exists和left join类似。。

with date_prob as 
(
    select distinct visit 
    from t1
    where dt=20161124
    and dt1!=orig_ts
),

ev_data as
(
    select distinct visit 
    from t1
    where dt=20161124
    and visit is not null
    and origts is not null 
    and uid is not null
), 

fin_data as 
(
    select x.visit 
    from ev_data x
    where x.visit not in 
    (
      select distinct visit 
      from date_prob
      and visit is not null
    ) 
)

我为左联接运行的查询-

with date_prob as 
(
    select distinct id
    from t1
    where dt1='2016-11-24'
    and dt1!=orig_ts
    and (datediff(dt1,orig_ts) not in ('1','-1'))
),

ev_data as
(
    select distinct id
    from t1
    where dt1='2016-11-24'
    and id is not null
)

select x.id

from ev_data x
left join date_prob y

where y.id is null
;

数据示例-

id        dt1           orig_ts
1     2016-11-24       2016-11-10
2     2016-11-24       2016-11-24 
3     2016-11-24       2010-01-01
4     2016-11-24       2017-01-01
5     2016-11-24       2016-11-24
6     2016-11-24       2016-11-25
7     2016-11-23       2016-11-23

因此,我想从这个表中删除那些大于一天差异的id。因此,查询应该只返回id等于2、5和6的值。

wj8zmpe1

wj8zmpe11#

如果要从不存在于其他表中的表中提取这些值,则可以使用 left join 和过滤器 where second_table_key is null . 即使键中有空值,这也会起作用:

--this query will return records from table a that do not exist in b
select a.id
  from a left join b on a.id=b.id
 where b.id is null; --only not joined

我已经修正了你的例子。它的工作原理是:

drop table if exists t1;
create table t1 (id int,dt1 string,           orig_ts string );
insert overwrite table t1
select 1 id,    '2016-11-24' dt1,       '2016-11-10' orig_ts union all
select 2 id,    '2016-11-24' dt1,       '2016-11-24' orig_ts union all 
select 3 id,    '2016-11-24' dt1,       '2010-01-01' orig_ts union all
select 4 id,    '2016-11-24' dt1,       '2017-01-01' orig_ts union all
select 5 id,    '2016-11-24' dt1,       '2016-11-24' orig_ts union all
select 6 id,    '2016-11-24' dt1,       '2016-11-25' orig_ts union all
select 7 id,    '2016-11-23' dt1,       '2016-11-23' orig_ts;

with date_prob as 
(
    select distinct id
    from t1
    where dt1='2016-11-24'
    and dt1!=orig_ts
    and (datediff(dt1,orig_ts) not in ('1','-1'))
),

ev_data as
(
    select distinct id
    from t1
    where dt1='2016-11-24'
    and id is not null
)

select x.id
from ev_data x
left join date_prob y on x.id=y.id
where y.id is null
;

OK
2
5
6
Time taken: 14.166 seconds, Fetched: 3 row(s)
hive>

按预期工作

相关问题