oracle 如何生成日期时间值以填补数据空白

ibps3vxo  于 2023-10-16  发布在  Oracle
关注(0)|答案(2)|浏览(128)

我试图填补气象站数据表中的空白,问题是我的表没有空单元格用于数据丢失的地方,它只是从2023-04-07 10:30跳到2023-04-07 16:30。我原来有Excel文件,但我已导入此数据到Oracle SQL数据库,我正试图使用SQL开发人员填写的值。
这是我运行的SQL代码,

WITH DateSequence AS (
    SELECT
        TRUNC(MIN(Time), 'HH24') + (LEVEL - 1) / 48 AS generated_datetime
    FROM
        WeatherData
    CONNECT BY
        LEVEL <= (TRUNC(MAX(Time), 'HH24') - TRUNC(MIN(Time), 'HH24')) * 48 + 1
)
SELECT
    ds.generated_datetime
FROM
    DateSequence ds
LEFT JOIN
    WeatherData wd  -- Corrected alias here
ON
    ds.generated_datetime = wd.Time  -- Corrected alias here
WHERE
    wd.Time IS NULL
ORDER BY
    ds.generated_datetime;

有没有更简单的方法在Excel中使用PowerBI?或者是一个简单的SQL查询?

gv8xihay

gv8xihay1#

此查询将首先生成一个日期和时间序列,从数据集中的最小时间开始,到最大时间结束。然后,它将使用时间列作为连接条件,将此序列连接到数据集。对于序列中在数据集中没有对应行的任何行,温度和湿度列将设置为0。查询的结果将按时间列排序。

SELECT
  generated_datetime AS time,
  CASE
    WHEN wd.temperature IS NULL THEN 0
    ELSE wd.temperature
  END AS temperature,
  CASE
    WHEN wd.humidity IS NULL THEN 0
    ELSE wd.humidity
  END AS humidity,
  ...
FROM (
  SELECT
    TRUNC(MIN(time), 'HH24') + (LEVEL - 1) / 48 AS generated_datetime
  FROM weather_data
  CONNECT BY LEVEL <= (TRUNC(MAX(time), 'HH24') - TRUNC(MIN(time), 'HH24')) * 48 + 1
) ds
LEFT JOIN weather_data wd
ON ds.generated_datetime = wd.time
ORDER BY ds.generated_datetime;
  • Excel方法 *

您还可以使用Power Query(在Excel或Power BI中)来填补数据中的空白。为此,您可以使用以下步骤:
在Power BI中通过单击“度量”窗格中的“新建度量”按钮创建新度量。在“公式编辑器”窗口中,输入以下公式:

= IF(ISBLANK([Time]), 0, [Time])

1.单击“OK”按钮以保存测量。
1.在Power BI中创建一个新的表格或图表,并添加以下字段:
1.时间温度湿度
1.新度量(在步骤2中创建)
将度量聚合更改为“Average”。这将创建一个表格或图表,显示每个时间段的平均温度和湿度,即使数据中存在差距。

gev0vcfq

gev0vcfq2#

在Oracle中,使用PARTITION艾德OUTER JOIN返回所有值,包括缺失值:

WITH DateSequence (time) AS (
  SELECT min_time + (LEVEL - 1) * INTERVAL '30' MINUTE
  FROM   (
    SELECT TRUNC(MIN(Time), 'HH24') AS min_time,
           TRUNC(MAX(Time), 'HH24') As max_time
    FROM   WeatherData
  )
  CONNECT BY min_time + (LEVEL - 1) * INTERVAL '30' MINUTE <= max_time
)
SELECT wd.station_id,
       ds.time,
       wd.temperature,
       wd.windspeed,
       wd.rainfall
FROM   DateSequence ds
       LEFT JOIN WeatherData wd
       PARTITION BY (wd.station_id)
       ON ds.time = wd.Time;

其中,对于样本数据:

CREATE TABLE WeatherData (station_id, time, temperature, windspeed, rainfall) AS
  SELECT 1, DATE '1970-01-01' + INTERVAL '7:00' HOUR TO MINUTE, 15,  5,  3 FROM DUAL UNION ALL
  SELECT 1, DATE '1970-01-01' + INTERVAL '7:30' HOUR TO MINUTE, 16,  4,  0 FROM DUAL UNION ALL
  SELECT 1, DATE '1970-01-01' + INTERVAL '9:00' HOUR TO MINUTE, 17,  3,  0 FROM DUAL UNION ALL
  SELECT 2, DATE '1970-01-01' + INTERVAL '7:00' HOUR TO MINUTE, 14, 10, 10 FROM DUAL UNION ALL
  SELECT 2, DATE '1970-01-01' + INTERVAL '8:30' HOUR TO MINUTE, 19,  6, 20 FROM DUAL UNION ALL
  SELECT 3, DATE '1970-01-01' + INTERVAL '8:00' HOUR TO MINUTE, 12, 30,  8 FROM DUAL;

输出:
| 站点ID|时间|温度|风速|降雨|
| --|--|--|--|--|
| 1 |1970-01-01 07:00:00| 15 | 5 | 3 |
| 1 |1970-01-01 07:30:00| 16 | 4 | 0 |
| 1 |1970-01-01 08:00:00| * 空 || 空 *|
| 1 |1970-01-01 08:30:00| * 空 || 空 *|
| 1 |1970-01-01 09:00:00| 17 | 3 | 0 |
| 2 |1970-01-01 07:00:00| 14 | 10 | 10 |
| 2 |1970-01-01 07:30:00| * 空 || 空 *|
| 2 |1970-01-01 08:00:00| * 空 || 空 *|
| 2 |1970-01-01 08:30:00| 19 | 6 | 20 |
| 2 |1970-01-01 09:00:00| * 空 || 空 *|
| 3 |1970-01-01 07:00:00| * 空 || 空 *|
| 3 |1970-01-01 07:30:00| * 空 || 空 *|
| 3 |1970-01-01 08:00:00| 12 | 30 | 8 |
| 3 |1970-01-01 08:30:00| * 空 || 空 *|
| 3 |1970-01-01 09:00:00| * 空 || 空 *|
如果你想用0替换缺失的值,那么就用COALESCE

WITH DateSequence (time) AS (
  SELECT min_time + (LEVEL - 1) * INTERVAL '30' MINUTE
  FROM   (
    SELECT TRUNC(MIN(Time), 'HH24') AS min_time,
           TRUNC(MAX(Time), 'HH24') As max_time
    FROM   WeatherData
  )
  CONNECT BY min_time + (LEVEL - 1) * INTERVAL '30' MINUTE <= max_time
)
SELECT wd.station_id,
       ds.time,
       COALESCE(wd.temperature, 0) AS temperature,
       COALESCE(wd.windspeed, 0) AS windspeed,
       COALESCE(wd.rainfall, 0) AS rainfall
FROM   DateSequence ds
       LEFT JOIN WeatherData wd
       PARTITION BY (wd.station_id)
       ON ds.time = wd.Time;

其输出:
| 站点ID|时间|温度|风速|降雨|
| --|--|--|--|--|
| 1 |1970-01-01 07:00:00| 15 | 5 | 3 |
| 1 |1970-01-01 07:30:00| 16 | 4 | 0 |
| 1 |1970-01-01 08:00:00| 0 | 0 | 0 |
| 1 |1970-01-01 08:30:00| 0 | 0 | 0 |
| 1 |1970-01-01 09:00:00| 17 | 3 | 0 |
| 2 |1970-01-01 07:00:00| 14 | 10 | 10 |
| 2 |1970-01-01 07:30:00| 0 | 0 | 0 |
| 2 |1970-01-01 08:00:00| 0 | 0 | 0 |
| 2 |1970-01-01 08:30:00| 19 | 6 | 20 |
| 2 |1970-01-01 09:00:00| 0 | 0 | 0 |
| 3 |1970-01-01 07:00:00| 0 | 0 | 0 |
| 3 |1970-01-01 07:30:00| 0 | 0 | 0 |
| 3 |1970-01-01 08:00:00| 12 | 30 | 8 |
| 3 |1970-01-01 08:30:00| 0 | 0 | 0 |
| 3 |1970-01-01 09:00:00| 0 | 0 | 0 |
fiddle

相关问题