Snowflake SQL query: LEFT JOIN problem

Posted by c9x0cxw0 on 2021-07-26 in Java

In one of our queries, a LEFT JOIN is not behaving as expected in Snowflake. Any help finding a fix would be appreciated.
We have a sample data setup for a basic table join, shown below.

CREATE TABLE patient_test(pid INT);
INSERT INTO patient_test (pid) VALUES (100);

CREATE TABLE pateint_entry_test (pid INT,DateAdded DATETIME);
INSERT INTO pateint_entry_test (pid, DateAdded) VALUES (100, '2020-07-13');

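Now look at the code below. It is only a sample subquery; in practice we use it together with other queries. Our goal is to generate one date entry per patient for each day between the given start and end dates.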

WITH patient_cte AS (
          SELECT * FROM patient_test
      )
      ,
      dates AS(
       SELECT  DATEDIFF(day, CONVERT_TIMEZONE('EST', 'UTC', CAST(TO_TIMESTAMP('2020-07-06') AS TIMESTAMP_NTZ)),
                            CONVERT_TIMEZONE('EST', 'UTC', CAST(TO_TIMESTAMP('2020-07-12') AS TIMESTAMP_NTZ))) AS Total_Days,
                            CONVERT_TIMEZONE('EST', 'UTC', CAST(TO_TIMESTAMP('2020-07-06') AS TIMESTAMP_NTZ)) AS Start_Date,
                            CONVERT_TIMEZONE('EST', 'UTC', CAST(TO_TIMESTAMP('2020-07-12') AS TIMESTAMP_NTZ)) AS end_date
      )
      ,
      cte2 (date) as (
      SELECT TO_DATE(START_DATE) FROM dates
      UNION ALL
      SELECT TO_DATE(DATEADD(day, 1, date)) FROM cte2 WHERE date < (SELECT TOP 1 END_DATE FROM dates)
      ),
      cte3 AS (
          select * from patient_cte
              cross join cte2
      )

      SELECT cte3.pid as p_pid,
        pateint_entry_test.pid as p_entry_pid,
        pateint_entry_test.DateAdded,
        cte3."DATE" ,
        IFNULL( pateint_entry_test.DateAdded, cte3."DATE") AS CALCULATEDDATEMEASURED
     FROM cte3
        LEFT JOIN pateint_entry_test ON
            cte3.pid = pateint_entry_test.pid AND
            cte3."DATE" = TO_DATE(pateint_entry_test.DateAdded)

The query produces the following output.

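Notice that CALCULATEDDATEMEASURED is 2020-07-06 00:00:00 in rows 2 through 7. Since DateAdded is NULL in those rows, the value should instead come from the DATE column, according to the expression IFNULL( pateint_entry_test.DateAdded, cte3."DATE").
The query should produce the following output: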

I'm not sure what is going wrong, but it does not behave as expected. Thanks for any help with this.

cnh2zyt3 #1

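I'm not sure whether this is a bug, but it comes from type coercion caused by the way the query is written. Below is your query with the to_date logic applied inside the IFNULL expression the same way it is applied in the join (plus a COALESCE to show that it produces the same result as IFNULL):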

WITH patient_cte AS (
          SELECT * FROM patient_test
      )
      ,
      dates AS(
       SELECT  DATEDIFF(day, CONVERT_TIMEZONE('EST', 'UTC', CAST(TO_TIMESTAMP('2020-07-06') AS TIMESTAMP_NTZ)),
                            CONVERT_TIMEZONE('EST', 'UTC', CAST(TO_TIMESTAMP('2020-07-12') AS TIMESTAMP_NTZ))) AS Total_Days,
                            CONVERT_TIMEZONE('EST', 'UTC', CAST(TO_TIMESTAMP('2020-07-06') AS TIMESTAMP_NTZ)) AS Start_Date,
                            CONVERT_TIMEZONE('EST', 'UTC', CAST(TO_TIMESTAMP('2020-07-12') AS TIMESTAMP_NTZ)) AS end_date
      )
      ,
      cte2 (date) as (
      SELECT TO_DATE(START_DATE) FROM dates
      UNION ALL
      SELECT TO_DATE(DATEADD(day, 1, date)) FROM cte2 WHERE date < (SELECT TOP 1 END_DATE FROM dates)
      ),
      cte3 AS (
          select * from patient_cte 
              cross join cte2 
      )
      SELECT cte3.pid as p_pid,
        pateint_entry_test.pid as p_entry_pid,
        pateint_entry_test.DateAdded,
        cte3."DATE",
        IFNULL( pateint_entry_test.DateAdded, cte3."DATE") AS ORIGINAL_ERROR,
        IFNULL( to_date(pateint_entry_test.DateAdded), cte3."DATE") AS CALCULATEDDATEMEASURED,
        coalesce(to_date(pateint_entry_test.DateAdded), cte3."DATE") as from_coalesce
     FROM cte3 
        LEFT JOIN pateint_entry_test 
            ON cte3.pid = pateint_entry_test.pid 
            AND cte3."DATE" = to_date(pateint_entry_test.DateAdded);

Running it in Snowflake produces the following results:
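If the result column needs to stay a timestamp rather than a date, the same idea can be applied in the other direction by casting the DATE side. The snippet below is a minimal sketch, assuming the mismatch between the DATETIME column and the DATE column is indeed the cause. It has not been checked against the output in the question. The inline subquery `d` and its column `cal_date` are hypothetical stand-ins for cte3 and its "DATE" column; `pateint_entry_test` is the table from the question.

```sql
-- Sketch: give both IFNULL arguments the same type explicitly,
-- instead of relying on implicit DATE/TIMESTAMP coercion.
SELECT
    e.pid,
    e.DateAdded,
    d.cal_date,
    -- Option 1: work in DATE, as the rewritten query above does
    IFNULL(TO_DATE(e.DateAdded), d.cal_date) AS as_date,
    -- Option 2: keep TIMESTAMP_NTZ by casting the DATE side instead
    IFNULL(e.DateAdded, CAST(d.cal_date AS TIMESTAMP_NTZ)) AS as_timestamp
FROM (SELECT TO_DATE('2020-07-07') AS cal_date) AS d  -- hypothetical stand-in for cte3
LEFT JOIN pateint_entry_test AS e
    ON d.cal_date = TO_DATE(e.DateAdded);
```

Either option makes the intended type visible in the query text, so the result no longer depends on which argument Snowflake chooses to convert.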
