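I am trying to write a Hive query that copies a field's value from the previous row in the same column when the value in the current row is null. If the current value is not null, it should be kept. For example, given the following input: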
company empId first_name last_name job_code department start_date
110 500400 ABC XYZ 300 101 01/20/2015
110 500400 Null Null 305 105 04/02/2015
110 500400 ABC1 Null Null Null 15/02/1015
110 500400 Null XYZ1 307 Null 01/03/2015
The output should look like this:
company empId first_name last_name job_code department start_date
110 500400 ABC XYZ 300 101 01/20/2015
110 500400 ABC XYZ 305 105 04/02/2015
110 500400 ABC1 XYZ 305 105 15/02/1015
110 500400 ABC1 XYZ1 307 105 01/03/2015
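I tried queries using the last_value and lag functions, but neither seems to work. The last_value query only works when the number of rows is limited. When I run it on a large dataset, it fails (the MapReduce job does not complete). This is the query I am trying: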
select
company, empId, start_date,
last_value(last_name, true) over (partition by company, empId order by start_date) as last_name,
last_value(first_name, true) over (partition by company, empId order by start_date) as first_name,
last_value(department, true) over (partition by company, empId order by start_date) as department,
last_value(job_code, true) over (partition by company, empId order by start_date) as job_code
from samples.z_sample_test
order by start_date;
With lag, only one record gets updated; the records after it are not. This is the query I am using:
select
c.company,
c.empId,
c.start_date,
if(c.first_name is null, lag(c.first_name, 1) over (order by c.start_date), c.first_name) as first_name,
if(c.last_name is null, lag(c.last_name, 1) over (order by c.start_date), c.last_name) as last_name,
if(c.job_code is null, lag(c.job_code, 1) over (order by c.start_date), c.job_code) as job_code,
if(c.department is null, lag(c.department, 1) over (order by c.start_date), c.department) as department
from samples.z_sample_test c
left join samples.z_sample_test p
on (c.company = p.company and c.empId = p.empId)
group by c.company, c.empId, c.start_date, c.last_name, c.first_name, c.job_code, c.department
order by c.start_date;
I would appreciate any help with this.
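
A possible direction, offered as an untested sketch rather than a confirmed fix: lag(col, 1) returns the previous row's original value, so when two or more consecutive rows are null only the first of them can be filled, which matches the behavior described above. last_value(col, true) skips nulls and is therefore the better fit for a forward fill. The sketch below keeps the table and column names from the question, spells out the window frame explicitly, and drops the final global order by, which in Hive forces a total ordering through a single reducer and may contribute to the job not finishing on a large dataset. It assumes start_date sorts chronologically (for example a DATE or a yyyy-MM-dd string); the mixed date strings in the sample would need to be normalized first.

```sql
-- Sketch only: forward-fill nullable columns within each employee's history.
-- Assumption: start_date sorts chronologically; normalize mixed formats first.
select
  company,
  empId,
  -- last_value(col, true) ignores nulls, so each expression returns the most
  -- recent non-null value seen up to and including the current row.
  last_value(first_name, true) over (
    partition by company, empId
    order by start_date
    rows between unbounded preceding and current row
  ) as first_name,
  last_value(last_name, true) over (
    partition by company, empId
    order by start_date
    rows between unbounded preceding and current row
  ) as last_name,
  last_value(job_code, true) over (
    partition by company, empId
    order by start_date
    rows between unbounded preceding and current row
  ) as job_code,
  last_value(department, true) over (
    partition by company, empId
    order by start_date
    rows between unbounded preceding and current row
  ) as department,
  start_date
from samples.z_sample_test;
```

If sorted output is still needed, sorting only the final result downstream, or using distribute by company, empId with sort by start_date, avoids funneling every row through one reducer. Whether that is enough for the large dataset cannot be confirmed from the information in the question.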