在postgresql中优雅地处理数据时间

ctzwtxfj  于 2023-06-05  发布在  PostgreSQL
关注(0)|答案(1)|浏览(396)

我有下面的代码,给我这个错误
错误:“YYYY”的值“test”无效
代码:

select * from
     (select i.uuid as t1uid,
             m3.short_id as short_id_start,
             m.text_value as original_date,
             TO_TIMESTAMP( trim ( substring(m.text_value, 1, 19)), 'YYYY-MM-DD"T"HH24:MI:SS' ) as tms 
      from item i
      inner join metadatavalue m on i.uuid = m.dspace_object_id
      inner join metadatafieldregistry m2 on m.metadata_field_id = m2.metadata_field_id
      inner join metadataschemaregistry m3 on m2.metadata_schema_id = m3.metadata_schema_id
      and m3.short_id ='interaction'
      and m2."element" = 'startts') t1
   join
     (select i.uuid as t2uid,
             m3.short_id as short_id_fail,
             m.text_value as success_status
      from item i
      inner join metadatavalue m on i.uuid = m.dspace_object_id
      inner join metadatafieldregistry m2 on m.metadata_field_id = m2.metadata_field_id
      inner join metadataschemaregistry m3 on m2.metadata_schema_id = m3.metadata_schema_id
      and m3.short_id ='interaction'
      and m2."element" = 'status'
      and m.text_value = 'SUCCESS') t2 on t1.t1uid = t2.t2uid
   join
     (select i.uuid as t3uid,
             m3.short_id as short_id_start_endt,
             TO_TIMESTAMP( trim ( substring(m.text_value, 1, 19)), 'YYYY-MM-DD"T"HH24:MI:SS' )as enddate
      from item i
      inner join metadatavalue m on i.uuid = m.dspace_object_id
      inner join metadatafieldregistry m2 on m.metadata_field_id = m2.metadata_field_id
      inner join metadataschemaregistry m3 on m2.metadata_schema_id = m3.metadata_schema_id
      and m3.short_id ='interaction'
      and m2."element" = 'endts') t3 on t3.t3uid = t2.t2uid
   left join
     (select m.text_value as "Protocollo",
             i.uuid as t4uid
      from item i
      inner join metadatavalue m on i.uuid = m.dspace_object_id
      inner join metadatafieldregistry m2 on m.metadata_field_id = m2.metadata_field_id
      inner join metadataschemaregistry m3 on m2.metadata_schema_id = m3.metadata_schema_id
      where m3.short_id ='interaction'
        and m2."element" = 'protocol' ) t4 on t4.t4uid = t3.t3uid
   left join
     (select i.uuid as t5uid,
             m2.qualifier as qualifier_sender,
             m.text_value as "Sender"
      from item i
      inner join metadatavalue m on i.uuid = m.dspace_object_id
      inner join metadatafieldregistry m2 on m.metadata_field_id = m2.metadata_field_id
      inner join metadataschemaregistry m3 on m2.metadata_schema_id = m3.metadata_schema_id
      where m3.short_id ='interaction'
        and m2."element" = 'application'
        and m2.qualifier in ('sender') ) as t5 on t1.t1uid = t5.t5uid
   left join
     (select i.uuid as t6uid,
             m2.qualifier as qualifier_receiver,
             m.text_value as "Receiver"
      from item i
      inner join metadatavalue m on i.uuid = m.dspace_object_id
      inner join metadatafieldregistry m2 on m.metadata_field_id = m2.metadata_field_id
      inner join metadataschemaregistry m3 on m2.metadata_schema_id = m3.metadata_schema_id
      where m3.short_id ='interaction'
        and m2."element" = 'application'
        and m2.qualifier in ('receiver') ) 
        as t6 on t1.t1uid = t6.t6uid
        where tms > TO_TIMESTAMP('2023-05-18T00:00:00', 'YYYY-MM-DD"T"HH24:MI:SS')

最后一行:where tms > TO_TIMESTAMP('2023-05-18T00:00:00', 'YYYY-MM-DD"T"HH24:MI:SS')是问题所在(没有它代码也能正常工作)
但当我这样做:

create table public.timeline_success as 
... 
same code of before without the last where condition.
select * from public.timeline_fail t1 
where t1.tms > TO_TIMESTAMP('2023-05-18T00:00:00', 'YYYY-MM-DD"T"HH24:MI:SS')

密码有效如何解决这一问题?

w7t8yxp5

w7t8yxp51#

最后一行where tms > TO_TIMESTAMP('2023-05-18T00:00:00', 'YYYY-MM-DD"T"HH24:MI:SS')不是问题的原因,而是暴露问题的触发条件。如果没有这一行,从t1引用的唯一列是t1uid,并且只需要连接到其他表。PostgreSQL跳过对t1的选择列表的其余部分进行求值,因为这些值不会对最终结果集产生影响。当包含最后一个条件时,则必须计算tms
通过修改有问题的数据或修改查询以妥善处理列中的无效数据来解决问题。
下面演示了所描述的行为:

CREATE TABLE table1 (ID integer PRIMARY KEY, test_value text);

CREATE TABLE table2 (ID integer PRIMARY KEY, test_label text);

INSERT INTO table1(id, test_value)
VALUES (1, 'FAIL-05-26T01:00:00'),
       (2, '2023-05-27T02:00:00');

INSERT INTO table2(id, test_label)
VALUES (2, 'Pass');

SELECT *
  FROM (SELECT id AS t1id,
               TO_TIMESTAMP(test_value, 'YYYY-MM-DD"T"HH24:MI:SS') AS tms
          FROM table1) t1
  JOIN (SELECT id AS t2id,
               test_label
          FROM table2) t2 ON t1.t1id = t2.t2id
-- WHERE t1.tms > '2023-05-27'::timestamp;

运行上述返回:
| t1id| TMS| t2id|试验标号|
| - -----|- -----|- -----|- -----|
| 2| 2019 -05-27 02:00:00-04| 2|通过|
在未注解WHERE子句的情况下运行会导致以下异常:

ERROR:  Value must be an integer.invalid value "FAIL" for "YYYY" 

ERROR:  invalid value "FAIL" for "YYYY"
SQL state: 22007
Detail: Value must be an integer.

以下来自EXPLAIN (ANALYZE, VERBOSE)的输出提供了一些关于正在发生的事情的见解。请注意,原始列值是从顺序扫描返回的。对这些值的操作只有在需要时才会进行计算。

Hash Join  (cost=38.58..67.79 rows=1270 width=48) (actual time=0.027..0.029 rows=1 loops=1)
  Output: table1.id, to_timestamp(table1.test_value, 'YYYY-MM-DD"T"HH24:MI:SS'::text), table2.id, table2.test_label
  Inner Unique: true
  Hash Cond: (table1.id = table2.id)
  ->  Seq Scan on public.table1  (cost=0.00..22.70 rows=1270 width=36) (actual time=0.008..0.009 rows=2 loops=1)
        Output: table1.id, table1.test_value
  ->  Hash  (cost=22.70..22.70 rows=1270 width=36) (actual time=0.006..0.007 rows=1 loops=1)
        Output: table2.id, table2.test_label
        Buckets: 2048  Batches: 1  Memory Usage: 17kB
        ->  Seq Scan on public.table2  (cost=0.00..22.70 rows=1270 width=36) (actual time=0.003..0.003 rows=1 loops=1)
              Output: table2.id, table2.test_label
Planning Time: 0.110 ms
Execution Time: 0.050 ms

优雅地处理具有无效时间戳格式的文本的一种方法是创建一个函数,该函数返回成功转换的值,如果发生异常则返回NULL。下面的函数支持这种方法。

CREATE OR REPLACE FUNCTION try_to_timestamp(ts_string text, ts_format text) RETURNS timestamptz
LANGUAGE plpgsql
STABLE LEAKPROOF STRICT PARALLEL SAFE COST 1 AS
$FUNC$
BEGIN
  RETURN TO_TIMESTAMP(ts_string, ts_format);
EXCEPTION
  WHEN OTHERS THEN
    RETURN NULL;
END
$FUNC$;

将对TO_TIMESTAMP的调用替换为对try_to_timestamp的调用后,当遇到无效的时间戳字符串时,查询不再失败。

SELECT *
  FROM (SELECT id AS t1id,
               try_to_timestamp(test_value, 'YYYY-MM-DD"T"HH24:MI:SS') AS tms
          FROM table1) t1
  JOIN (SELECT id AS t2id,
               test_label
          FROM table2) t2 ON t1.t1id = t2.t2id
WHERE t1.tms > '2023-05-27'::timestamp;

| t1id| TMS| t2id|试验标号|
| - -----|- -----|- -----|- -----|
| 2| 2019 -05-27 02:00:00-04| 2|通过|

相关问题