oracle 要在值之间插值的SQL查询

xxls0lw8  于 2023-05-06  发布在  Oracle
关注(0)|答案(3)|浏览(156)

我打算在列中的值之间进行插值(线性插值),并使用SQL查询将其插入新列。根据我在网上的搜索,我怀疑铅分析功能可能是有用的。我是SQL查询的新手。因此,任何关于如何实现这一目标的见解都将非常有帮助。
样本数据集如下所述:

Emp  Test_date  Value
---  ---------  -----
A    1/1/2001   null
A    1/2/2001   100
A    1/3/2001   null
A    1/4/2001   80
A    1/5/2001   null
A    1/6/2001   null
A    1/7/2001   75

我们的想法是获得第四列,其值为:

null
100
interpolatedValue1
80
interpolatedValue2
interpolatedValue3
75
  • Interpolatedvalue 1 * 将是10080之间的内插值,
  • Interpolatedvalue 2 * 将是8075之间的线性插值。
  • InterpolatedValue 3 * 将是InterpolatedValue 275之间的线性插值
    以下是简单的线性插值的工作原理:

给定两个点(V1atD1),(V3atD3)。* * D2处的值V2**是多少?

(V3-V1)/(D3-D1)*(D2-D1)+ V1

w1e3prcc

w1e3prcc1#

这可能会被简化一点,但得到你想要的答案,我相信。稍微有点棘手的是获得非空值之间的天数(即你正在填充差距的大小),然后是在该间隙内的位置:

-- CTE for sample data
with your_table (emp, test_date, value) as (
            select 'A', date '2001-01-01', null from dual
  union all select 'A', date '2001-01-02', 100 from dual
  union all select 'A', date '2001-01-03', null from dual
  union all select 'A', date '2001-01-04', 80 from dual
  union all select 'A', date '2001-01-05', null from dual
  union all select 'A', date '2001-01-06', null from dual
  union all select 'A', date '2001-01-07', 75 from dual
)
-- actual query
select emp, test_date, value,
  coalesce(value,
    (next_value - prev_value) -- v3-v1
    / (count(*) over (partition by grp) + 1) -- d3-d1
    * row_number() over (partition by grp order by test_date desc) -- d2-d1, indirectly
    + prev_value -- v1
  ) as interpolated
from (
  select emp, test_date, value,
    last_value(value ignore nulls)
      over (partition by emp order by test_date) as prev_value,
    first_value(value ignore nulls)
      over (partition by emp order by test_date range between current row and unbounded following) as next_value,
    row_number() over (partition by emp order by test_date) -
      row_number() over (partition by emp order by case when value is null then 1 else 0 end, test_date) as grp
  from your_table
)
order by test_date;
E TEST_DATE       VALUE INTERPOLATED
- ---------- ---------- ------------
A 2001-01-01                        
A 2001-01-02        100          100
A 2001-01-03                      90
A 2001-01-04         80           80
A 2001-01-05              76.6666667
A 2001-01-06              78.3333333
A 2001-01-07         75           75

我使用了last_valuefirst_value,而不是leadlag,但这两种方法都可以工作。(我想,在大数据集上,超前/滞后可能会更快)。grp的计算是Tabibitosan。

50pmv0ei

50pmv0ei2#

可以使用lag(ignore nulls)。您不指定如何进行插值,但线性插值将是:

select emp, test_date,
       coalesce(test_value,
                ( next_tv * (next_td - test_date) +
                  prev_tv * (test_date - prev_td)
                ) / (next_td - prev_td)
               ) as imputed_value
from (select t.*,
             lag(test_value ignore nulls) over (partition by emp order by test_date) as prev_tv,
             lag(case when test_value is not null then test_date end ignore nulls) over (partition by emp order by test_date) as prev_td,
             lead(test_value ignore nulls) over (partition by emp order by test_date) as next_tv,
             lead(case when test_value is not null then test_date end ignore nulls) over (partition by emp order by test_date) as next_td
      from t
     ) t
u4vypkhs

u4vypkhs3#

下面是使用ASC插值对SQLImpala进行修复

select emp, test_date, value,
  coalesce(value,
    (next_value - prev_value) -- v3-v1
    / (count(*) over (partition by grp) + 1) -- d3-d1
    * row_number() over (partition by grp order by test_date desc) -- d2-d1, indirectly
    + prev_value -- v1
  ) as interpolated
from (
  select emp, test_date, value,
    last_value(value ignore nulls)
      over (partition by emp order by test_date) as prev_value,
    first_value(value ignore nulls)
      over (partition by emp order by test_date range between current row and unbounded following) as next_value,
    row_number() over (partition by emp order by test_date) -
      row_number() over (partition by emp order by case when value is null then 1 else 0 end, test_date) as grp
  from test.table
) t

order by test_date

相关问题