sql如果缺少日期,如何填写时间序列的最后一个值

r55awzrz  于 2021-07-26  发布在  Java
关注(0)|答案(1)|浏览(322)

我得到了下表。

  1. WITH
  2. -- your input ....
  3. input(t,grp,value) AS (
  4. SELECT TIMESTAMP '2020-05-28 00:00:00','A',55
  5. UNION ALL SELECT TIMESTAMP '2020-05-28 00:00:00','B',1.09
  6. UNION ALL SELECT TIMESTAMP '2020-05-28 00:00:00','C',1.8
  7. UNION ALL SELECT TIMESTAMP '2020-05-29 00:00:00','A',68
  8. UNION ALL SELECT TIMESTAMP '2020-05-29 00:00:00','B',1.9
  9. UNION ALL SELECT TIMESTAMP '2020-05-29 00:00:00','C',1.19
  10. UNION ALL SELECT TIMESTAMP '2020-06-01 00:00:00','A',10
  11. UNION ALL SELECT TIMESTAMP '2020-06-01 00:00:00','B',15
  12. UNION ALL SELECT TIMESTAMP '2020-06-01 00:00:00','C',0.88
  13. UNION ALL SELECT TIMESTAMP '2020-06-02 00:00:00','A',22
  14. UNION ALL SELECT TIMESTAMP '2020-06-02 00:00:00','B',15
  15. UNION ALL SELECT TIMESTAMP '2020-06-02 00:00:00','C',13
  16. UNION ALL SELECT TIMESTAMP '2020-06-03 00:00:00','A',66
  17. UNION ALL SELECT TIMESTAMP '2020-06-03 00:00:00','B',88
  18. UNION ALL SELECT TIMESTAMP '2020-06-03 00:00:00','C',99
  19. )

如您所见,本表中缺少2020-30-05和2020-31-05之间的日期。因此,有必要在这些日期填写2020-29-05信息分组。此外,今天的日期比数据中的日期(06-03 vs 06-08)要大(因此,在本月,这些观察值丢失了)。因此,最终输出应如下所示:

  1. date2 Group number
  2. 2020-28-05 00:00:00 A 55
  3. 2020-28-05 00:00:00 B 1.09
  4. 2020-28-05 00:00:00 C 1.8
  5. 2020-29-05 00:00:00 A 68
  6. 2020-29-05 00:00:00 B 1.9
  7. 2020-29-05 00:00:00 C 1.19
  8. 2020-30-05 00:00:00 A 68
  9. 2020-30-05 00:00:00 B 1.9
  10. 2020-30-05 00:00:00 C 1.19
  11. 2020-31-05 00:00:00 A 68
  12. 2020-31-05 00:00:00 B 1.9
  13. 2020-31-05 00:00:00 C 1.19
  14. 2020-01-06 00:00:00 A 10
  15. 2020-01-06 00:00:00 B 15
  16. 2020-01-06 00:00:00 C 0.88
  17. 2020-02-06 00:00:00 A 22
  18. 2020-02-06 00:00:00 B 15
  19. 2020-02-06 00:00:00 C 13
  20. 2020-03-06 00:00:00 A 66
  21. 2020-03-06 00:00:00 B 88
  22. 2020-03-06 00:00:00 C 99
  23. And for periods 03-06 till 08-06 the same values
  24. 2020-08-06 00:00:00 A 66
  25. 2020-08-06 00:00:00 B 88
  26. 2020-08-06 00:00:00 C 99

下面的代码有助于查找日期中缺少的值,但是这些间隙不会在今天的日期中被填补。如何修复?

  1. SELECT ts AS t, grp, TS_FIRST_VALUE(value,'const') AS value
  2. FROM input
  3. TIMESERIES ts AS '1 DAY' OVER(PARTITION BY grp ORDER BY t)
  4. ORDER BY 1,2
kqhtkvqz

kqhtkvqz1#

它叫 INTERPOLATE 而不是外推,这就是挑战。
您需要将每组的最后一行添加到输入表中,但要使用今天的日期而不是实际/原始日期。注意 padding 以及 padded 我在下面使用的常用表表达式。vertica有我在这里使用的分析极限子句: LIMIT 1 OVER(PARTITION BY grp ORDER BY tmstmp DESC) ..

  1. WITH
  2. input(tmstmp,grp,nbr) AS (
  3. SELECT TIMESTAMP '2020-05-28 00:00:00','A',55
  4. UNION ALL SELECT TIMESTAMP '2020-05-28 00:00:00','B',1.09
  5. UNION ALL SELECT TIMESTAMP '2020-05-28 00:00:00','C',1.8
  6. UNION ALL SELECT TIMESTAMP '2020-05-29 00:00:00','A',68
  7. UNION ALL SELECT TIMESTAMP '2020-05-29 00:00:00','B',1.9
  8. UNION ALL SELECT TIMESTAMP '2020-05-29 00:00:00','C',1.19
  9. UNION ALL SELECT TIMESTAMP '2020-06-01 00:00:00','A',10
  10. UNION ALL SELECT TIMESTAMP '2020-06-01 00:00:00','B',15
  11. UNION ALL SELECT TIMESTAMP '2020-06-01 00:00:00','C',0.88
  12. UNION ALL SELECT TIMESTAMP '2020-06-02 00:00:00','A',22
  13. UNION ALL SELECT TIMESTAMP '2020-06-02 00:00:00','B',15
  14. UNION ALL SELECT TIMESTAMP '2020-06-02 00:00:00','C',13
  15. UNION ALL SELECT TIMESTAMP '2020-06-03 00:00:00','A',66
  16. UNION ALL SELECT TIMESTAMP '2020-06-03 00:00:00','B',88
  17. UNION ALL SELECT TIMESTAMP '2020-06-03 00:00:00','C',99
  18. )
  19. ,
  20. padding AS (
  21. SELECT
  22. CURRENT_DATE::timestamp
  23. , grp
  24. , nbr
  25. FROM input
  26. LIMIT 1 OVER(PARTITION BY grp ORDER BY tmstmp DESC)
  27. )
  28. ,
  29. padded AS (
  30. SELECT * FROM input
  31. UNION ALL
  32. SELECT * FROM padding
  33. )
  34. SELECT
  35. ts AS tmstmp
  36. , grp
  37. , TS_FIRST_VALUE(nbr,'const') AS nbr
  38. FROM padded
  39. TIMESERIES ts AS '1 DAY' OVER(PARTITION BY grp ORDER BY tmstmp)
  40. ORDER BY 1,2
  41. ;
  42. -- out tmstmp | grp | nbr
  43. -- out ---------------------+-----+-------
  44. -- out 2020-05-28 00:00:00 | A | 55.00
  45. -- out 2020-05-28 00:00:00 | B | 1.09
  46. -- out 2020-05-28 00:00:00 | C | 1.80
  47. -- out 2020-05-29 00:00:00 | A | 68.00
  48. -- out 2020-05-29 00:00:00 | B | 1.90
  49. -- out 2020-05-29 00:00:00 | C | 1.19
  50. -- out 2020-05-30 00:00:00 | A | 68.00
  51. -- out 2020-05-30 00:00:00 | B | 1.90
  52. -- out 2020-05-30 00:00:00 | C | 1.19
  53. -- out 2020-05-31 00:00:00 | A | 68.00
  54. -- out 2020-05-31 00:00:00 | B | 1.90
  55. -- out 2020-05-31 00:00:00 | C | 1.19
  56. -- out 2020-06-01 00:00:00 | A | 10.00
  57. -- out 2020-06-01 00:00:00 | B | 15.00
  58. -- out 2020-06-01 00:00:00 | C | 0.88
  59. -- out 2020-06-02 00:00:00 | A | 22.00
  60. -- out 2020-06-02 00:00:00 | B | 15.00
  61. -- out 2020-06-02 00:00:00 | C | 13.00
  62. -- out 2020-06-03 00:00:00 | A | 66.00
  63. -- out 2020-06-03 00:00:00 | B | 88.00
  64. -- out 2020-06-03 00:00:00 | C | 99.00
  65. -- out 2020-06-04 00:00:00 | A | 66.00
  66. -- out 2020-06-04 00:00:00 | B | 88.00
  67. -- out 2020-06-04 00:00:00 | C | 99.00
  68. -- out 2020-06-05 00:00:00 | A | 66.00
  69. -- out 2020-06-05 00:00:00 | B | 88.00
  70. -- out 2020-06-05 00:00:00 | C | 99.00
  71. -- out 2020-06-06 00:00:00 | A | 66.00
  72. -- out 2020-06-06 00:00:00 | B | 88.00
  73. -- out 2020-06-06 00:00:00 | C | 99.00
  74. -- out 2020-06-07 00:00:00 | A | 66.00
  75. -- out 2020-06-07 00:00:00 | B | 88.00
  76. -- out 2020-06-07 00:00:00 | C | 99.00
  77. -- out 2020-06-08 00:00:00 | A | 66.00
  78. -- out 2020-06-08 00:00:00 | B | 88.00
  79. -- out 2020-06-08 00:00:00 | C | 99.00
  80. -- out 2020-06-09 00:00:00 | A | 66.00
  81. -- out 2020-06-09 00:00:00 | B | 88.00
  82. -- out 2020-06-09 00:00:00 | C | 99.00
展开查看全部

相关问题