我目前正在尝试从另一个表中不存在的表中提取这些值。但是,由于连接值包含空值,所以notin、notexists和left join选项似乎不起作用。
因此,是否有方法在hiveql中应用“不大于”条件?
作为参考,这是我运行的查询,与not exists和left join类似。。
with date_prob as
(
select distinct visit
from t1
where dt=20161124
and dt1!=orig_ts
),
ev_data as
(
select distinct visit
from t1
where dt=20161124
and visit is not null
and origts is not null
and uid is not null
),
fin_data as
(
select x.visit
from ev_data x
where x.visit not in
(
select distinct visit
from date_prob
and visit is not null
)
)
我为左联接运行的查询-
with date_prob as
(
select distinct id
from t1
where dt1='2016-11-24'
and dt1!=orig_ts
and (datediff(dt1,orig_ts) not in ('1','-1'))
),
ev_data as
(
select distinct id
from t1
where dt1='2016-11-24'
and id is not null
)
select x.id
from ev_data x
left join date_prob y
where y.id is null
;
数据示例-
id dt1 orig_ts
1 2016-11-24 2016-11-10
2 2016-11-24 2016-11-24
3 2016-11-24 2010-01-01
4 2016-11-24 2017-01-01
5 2016-11-24 2016-11-24
6 2016-11-24 2016-11-25
7 2016-11-23 2016-11-23
因此,我想从这个表中删除那些大于一天差异的id。因此,查询应该只返回id等于2、5和6的值。
1条答案
按热度按时间wj8zmpe11#
如果要从不存在于其他表中的表中提取这些值,则可以使用
left join
和过滤器where second_table_key is null
. 即使键中有空值,这也会起作用:我已经修正了你的例子。它的工作原理是:
按预期工作