postgresql Insert large amount of data in table with index in Postgres

Posted by 4smxwvx5 on 2023-11-18 in PostgreSQL


I have a table tbl in Postgres with 50 million rows. tbl has an index on column_1, and there are many queries against this table, such as

select * from tbl
where column_1 = 'value'

Each query returns 0-30 rows, 10 on average.
Once a day, I completely replace the data in the table:

delete from tbl;
insert into tbl
select * from tbl_2;
commit;

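The problem is that this update takes too long to run: about 2-3 hours, probably because of the index. Is there a way to speed up the data update and still let users query tbl while it is running? In case it matters, the update process runs in Python Airflow and the queries come from a Python web application.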

postgresql

Source: https://stackoverflow.com/questions/77438149/insert-large-amount-of-data-in-table-with-index-in-postgres

2 Answers

vulvrdjw1#

Build a copy of the table:

CREATE TABLE tbl_new (LIKE tbl);

INSERT INTO tbl_new
SELECT * FROM tbl_2;

Then create all the indexes you need on that table. When that is done, switch the tables:

BEGIN;

DROP TABLE tbl;

ALTER TABLE tbl_new RENAME TO tbl;

COMMIT;

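The switch itself is fast: DROP TABLE and RENAME only change catalog entries, so users are blocked only for the moment it takes to acquire the lock on tbl and commit.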
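Since the question mentions that the reload runs from Python Airflow, the steps above could be driven from Python roughly as follows. This is a minimal sketch, not the answerer's code: swap_plan and reload_table are hypothetical helper names, the index names and the ANALYZE step are additions for illustration, and conn is assumed to be any DB-API connection in its default non-autocommit mode (for example one opened with psycopg2).

```python
from typing import List, Tuple


def swap_plan(target: str = "tbl", source: str = "tbl_2",
              column: str = "column_1") -> Tuple[List[str], List[str]]:
    """Return (build, swap) statement lists for a build-then-swap reload.

    Identifiers are interpolated directly, so pass trusted names only.
    """
    new = f"{target}_new"
    build = [
        # Clear leftovers from a previously failed run.
        f"DROP TABLE IF EXISTS {new}",
        f"CREATE TABLE {new} (LIKE {target})",
        # Load into a table that has no indexes yet.
        f"INSERT INTO {new} SELECT * FROM {source}",
        # Build the index once, after the bulk load.
        f"CREATE INDEX {new}_{column}_idx ON {new} ({column})",
        # Assumption: collect planner statistics before the table goes live.
        f"ANALYZE {new}",
    ]
    swap = [
        f"DROP TABLE {target}",
        f"ALTER TABLE {new} RENAME TO {target}",
        # Free the "_new" index name so the next run can reuse it.
        f"ALTER INDEX {new}_{column}_idx RENAME TO {target}_{column}_idx",
    ]
    return build, swap


def reload_table(conn, **names) -> None:
    """Run the build phase and the swap phase, each in its own transaction."""
    build, swap = swap_plan(**names)
    cur = conn.cursor()
    for phase in (build, swap):
        try:
            for statement in phase:
                cur.execute(statement)
            conn.commit()
        except Exception:
            conn.rollback()
            raise
```

Only the short swap phase touches tbl with an exclusive lock; the long load and index build happen on tbl_new while users keep querying the old table.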

Posted 2023-11-18

q0qdq0h22#

If nothing references this table, Laurenz's Indiana Jones-style golden idol swap should be enough:

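create table tbl_tmp (like tbl);
-- load the fresh data from tbl_2, not the old contents of tbl
insert into tbl_tmp table tbl_2;
-- save the index definitions before the old table is dropped
create temp table index_definitions as select indexdef 
  from pg_indexes where schemaname='public' and tablename='tbl';
begin; drop table tbl; alter table tbl_tmp rename to tbl; commit;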

do $p$ declare indexdef_ text; 
begin for indexdef_ in select indexdef from index_definitions loop
          execute indexdef_;
      end loop;
      drop table index_definitions;
end $p$;

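This also carries over the index definitions from the previous version of the table.
If you are a superuser and are willing to use a hack, you can disable the indexes for the duration of the insert. This can be useful if the table is referenced by other objects, because you keep the original table. The swap, by contrast, drops the previous table and replaces it with a new one, so all references break and have to be re-established: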

begin work;

update pg_index set indisvalid=false, indisready=false, indislive=false
where indrelid = 'tbl'::regclass;

delete from tbl;
insert into tbl table tbl_2;

update pg_index set indisvalid=true, indisready=true, indislive=true
where indrelid = 'tbl'::regclass;
reindex table tbl; 
commit work;

As @JonSG pointed out, you may want to consider a materialized view that you simply refresh, or even pg_ivm.
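The materialized view idea could look roughly like this when driven from Python. This is only a sketch under assumptions: matview_statements is a hypothetical helper, tbl_mv is an illustrative view name, and id stands in for some unique column, which the question does not mention. REFRESH MATERIALIZED VIEW CONCURRENTLY lets readers keep querying during the refresh, but PostgreSQL requires a unique index on the view for it, and column_1 cannot serve because one value matches up to 30 rows.

```python
from typing import List, Tuple


def matview_statements(view: str = "tbl_mv", source: str = "tbl_2",
                       lookup_column: str = "column_1",
                       unique_column: str = "id") -> Tuple[List[str], str]:
    """Return (one-time setup statements, daily refresh statement).

    Identifiers are interpolated directly, so pass trusted names only.
    unique_column is an assumption: the question names no unique column.
    """
    setup = [
        # Populated on creation, which CONCURRENTLY refreshes need.
        f"CREATE MATERIALIZED VIEW {view} AS SELECT * FROM {source}",
        # Required by REFRESH ... CONCURRENTLY.
        f"CREATE UNIQUE INDEX {view}_{unique_column}_key "
        f"ON {view} ({unique_column})",
        # Serves the web application's lookups.
        f"CREATE INDEX {view}_{lookup_column}_idx "
        f"ON {view} ({lookup_column})",
    ]
    refresh = f"REFRESH MATERIALIZED VIEW CONCURRENTLY {view}"
    return setup, refresh
```

When nearly every row changes each day, a concurrent refresh may well be slower than the table swap, since it has to compare the old and new contents; it trades speed for uninterrupted reads.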

Posted 2023-11-18
