我有两个表,我想做一个全文搜索。交易和交易行。搜索总是由从和到日期,我可以过滤交易行。
然后,我想使用pkey索引将筛选的事务行与事务连接起来。然后对结果连接应用全文搜索。
问题是我的查询对事务使用顺序扫描。
鉴于此查询
explain analyse with
t as (select transaction.id, transaction.document as t_doc, transaction_line.document as tl_doc from transaction_line
left join transaction on transaction_id = transaction.id
where '2023-12-30' <= date and date <= '2023-12-31')
select id from t where t.t_doc @@ to_tsquery('norwegian', '2400') or
t.tl_doc @@ to_tsquery('norwegian', '2400');
字符串
为什么它会导致对要与事务行连接的事务进行seq扫描。
+-------------------------------------------------------------------------------------------------------------------------------+
|QUERY PLAN |
+-------------------------------------------------------------------------------------------------------------------------------+
|Hash Left Join (cost=1301.31..2576.17 rows=36 width=8) (actual time=26.248..28.903 rows=23 loops=1) |
| Hash Cond: (transaction_line.transaction_id = transaction.id) |
| Filter: ((transaction.document @@ '''2400'''::tsquery) OR (transaction_line.document @@ '''2400'''::tsquery)) |
| Rows Removed by Filter: 305 |
| -> Bitmap Heap Scan on transaction_line (cost=12.79..1286.50 rows=439 width=55) (actual time=0.193..2.598 rows=328 loops=1)|
| Recheck Cond: (('2023-12-30'::date <= date) AND (date <= '2023-12-31'::date)) |
| Heap Blocks: exact=93 |
| -> Bitmap Index Scan on date_idx (cost=0.00..12.68 rows=439 width=0) (actual time=0.135..0.135 rows=328 loops=1) |
| Index Cond: ((date >= '2023-12-30'::date) AND (date <= '2023-12-31'::date)) |
| -> Hash (cost=1101.01..1101.01 rows=15001 width=63) (actual time=25.853..25.854 rows=15001 loops=1) |
| Buckets: 16384 Batches: 1 Memory Usage: 1531kB |
| -> Seq Scan on transaction (cost=0.00..1101.01 rows=15001 width=63) (actual time=0.025..19.925 rows=15001 loops=1) |
|Planning Time: 5.917 ms |
|Execution Time: 30.389 ms |
+-------------------------------------------------------------------------------------------------------------------------------+
型
我假设它会产生这个执行计划:
+-------------------------------------------------------------------------------------------------------------------------------------+
|QUERY PLAN |
+-------------------------------------------------------------------------------------------------------------------------------------+
|Nested Loop Left Join (cost=13.07..3293.88 rows=36 width=8) (actual time=0.399..6.457 rows=23 loops=1) |
| Filter: ((transaction.document @@ '''2400'''::tsquery) OR (transaction_line.document @@ '''2400'''::tsquery)) |
| Rows Removed by Filter: 305 |
| -> Bitmap Heap Scan on transaction_line (cost=12.79..1286.50 rows=439 width=55) (actual time=0.145..2.198 rows=328 loops=1) |
| Recheck Cond: (('2023-12-30'::date <= date) AND (date <= '2023-12-31'::date)) |
| Heap Blocks: exact=93 |
| -> Bitmap Index Scan on date_idx (cost=0.00..12.68 rows=439 width=0) (actual time=0.094..0.094 rows=328 loops=1) |
| Index Cond: ((date >= '2023-12-30'::date) AND (date <= '2023-12-31'::date)) |
| -> Index Scan using transaction_pkey on transaction (cost=0.29..4.56 rows=1 width=63) (actual time=0.012..0.012 rows=1 loops=328)|
| Index Cond: (id = transaction_line.transaction_id) |
|Planning Time: 1.905 ms |
|Execution Time: 6.625 ms |
+-------------------------------------------------------------------------------------------------------------------------------------+
型
Edit 09.11.23-10:20 ->我们还没有生产数据,所以这15 k个事务估计每个模式每年大约有1000万个事务。大约有20- 3000万个事务行。当我们需要用一个没有漏洞的序列填充每个事务时,产生这些数量需要一点时间。
因此,10 M seq扫描找到不到10%的交易对我来说似乎有点昂贵。
1条答案
按热度按时间vsdwdz231#
优化器似乎高估了索引扫描。考虑将
random_page_cost
减小或增大到与您的硬件匹配的值。