postgresql 对事务使用索引扫描而不是seq扫描

2ekbmq32  于 2023-11-18  发布在  PostgreSQL
关注(0)|答案(1)|浏览(186)

我有两个表,我想做一个全文搜索。交易和交易行。搜索总是由从和到日期,我可以过滤交易行。
然后,我想使用pkey索引将筛选的事务行与事务连接起来。然后对结果连接应用全文搜索。
问题是我的查询对事务使用顺序扫描。
鉴于此查询

explain analyse with
    t as (select transaction.id, transaction.document as t_doc, transaction_line.document as tl_doc                          from transaction_line
left join transaction on transaction_id = transaction.id
where '2023-12-30' <= date and date <= '2023-12-31')
select id from t where t.t_doc @@ to_tsquery('norwegian', '2400') or
t.tl_doc @@ to_tsquery('norwegian', '2400');

字符串
为什么它会导致对要与事务行连接的事务进行seq扫描。

+-------------------------------------------------------------------------------------------------------------------------------+
|QUERY PLAN                                                                                                                     |
+-------------------------------------------------------------------------------------------------------------------------------+
|Hash Left Join  (cost=1301.31..2576.17 rows=36 width=8) (actual time=26.248..28.903 rows=23 loops=1)                           |
|  Hash Cond: (transaction_line.transaction_id = transaction.id)                                                                |
|  Filter: ((transaction.document @@ '''2400'''::tsquery) OR (transaction_line.document @@ '''2400'''::tsquery))                |
|  Rows Removed by Filter: 305                                                                                                  |
|  ->  Bitmap Heap Scan on transaction_line  (cost=12.79..1286.50 rows=439 width=55) (actual time=0.193..2.598 rows=328 loops=1)|
|        Recheck Cond: (('2023-12-30'::date <= date) AND (date <= '2023-12-31'::date))                                          |
|        Heap Blocks: exact=93                                                                                                  |
|        ->  Bitmap Index Scan on date_idx  (cost=0.00..12.68 rows=439 width=0) (actual time=0.135..0.135 rows=328 loops=1)     |
|              Index Cond: ((date >= '2023-12-30'::date) AND (date <= '2023-12-31'::date))                                      |
|  ->  Hash  (cost=1101.01..1101.01 rows=15001 width=63) (actual time=25.853..25.854 rows=15001 loops=1)                        |
|        Buckets: 16384  Batches: 1  Memory Usage: 1531kB                                                                       |
|        ->  Seq Scan on transaction  (cost=0.00..1101.01 rows=15001 width=63) (actual time=0.025..19.925 rows=15001 loops=1)   |
|Planning Time: 5.917 ms                                                                                                        |
|Execution Time: 30.389 ms                                                                                                      |
+-------------------------------------------------------------------------------------------------------------------------------+


我假设它会产生这个执行计划:

+-------------------------------------------------------------------------------------------------------------------------------------+
|QUERY PLAN                                                                                                                           |
+-------------------------------------------------------------------------------------------------------------------------------------+
|Nested Loop Left Join  (cost=13.07..3293.88 rows=36 width=8) (actual time=0.399..6.457 rows=23 loops=1)                              |
|  Filter: ((transaction.document @@ '''2400'''::tsquery) OR (transaction_line.document @@ '''2400'''::tsquery))                      |
|  Rows Removed by Filter: 305                                                                                                        |
|  ->  Bitmap Heap Scan on transaction_line  (cost=12.79..1286.50 rows=439 width=55) (actual time=0.145..2.198 rows=328 loops=1)      |
|        Recheck Cond: (('2023-12-30'::date <= date) AND (date <= '2023-12-31'::date))                                                |
|        Heap Blocks: exact=93                                                                                                        |
|        ->  Bitmap Index Scan on date_idx  (cost=0.00..12.68 rows=439 width=0) (actual time=0.094..0.094 rows=328 loops=1)           |
|              Index Cond: ((date >= '2023-12-30'::date) AND (date <= '2023-12-31'::date))                                            |
|  ->  Index Scan using transaction_pkey on transaction  (cost=0.29..4.56 rows=1 width=63) (actual time=0.012..0.012 rows=1 loops=328)|
|        Index Cond: (id = transaction_line.transaction_id)                                                                           |
|Planning Time: 1.905 ms                                                                                                              |
|Execution Time: 6.625 ms                                                                                                             |
+-------------------------------------------------------------------------------------------------------------------------------------+


Edit 09.11.23-10:20 ->我们还没有生产数据,所以这15 k个事务估计每个模式每年大约有1000万个事务。大约有20- 3000万个事务行。当我们需要用一个没有漏洞的序列填充每个事务时,产生这些数量需要一点时间。
因此,10 M seq扫描找到不到10%的交易对我来说似乎有点昂贵。

vsdwdz23

vsdwdz231#

优化器似乎高估了索引扫描。考虑将random_page_cost减小或增大到与您的硬件匹配的值。

相关问题