Postgres在不改变结果的参数上的行为完全不同。
-- This query is fast (~ 40 ms)
WITH filtered_scans AS (
SELECT id FROM scans
WHERE scans."createdAt" BETWEEN NOW() - INTERVAL '26 day' AND NOW() AND
scans."applicationId" = '2ce67bbf-d740-4f4e-aaf8-33552d54e482'
)
SELECT "scanId", percentile_disc(0.5) WITHIN GROUP (ORDER BY duration) AS p50
FROM scan_exchanges
WHERE "scanId" IN (SELECT id FROM filtered_scans)
GROUP BY "scanId";
-- This one is slow (~ 2 s)
EXPLAIN ANALYZE WITH filtered_scans AS (
SELECT id FROM scans
WHERE scans."createdAt" BETWEEN NOW() - INTERVAL '27 day' AND NOW() AND
-- ⬆️ the only difference is here
scans."applicationId" = '2ce67bbf-d740-4f4e-aaf8-33552d54e482'
)
SELECT "scanId", percentile_disc(0.5) WITHIN GROUP (ORDER BY duration) AS p50
FROM scan_exchanges
WHERE "scanId" IN (SELECT id FROM filtered_scans)
GROUP BY "scanId";
在这两种情况下,filtered_scans
表是相同的:
| 身份证|
| --------------|
| 2a823963-d019-4066-aab5-fed94b5f05fd|
| 4684e3c7-bbba-4fb4-909c-fc6d512f5555|
| abf56697-2400-452c-a68a-e243d53bafb0|
结果表也是一样的:
| 扫描ID|p50蛋白|
| --------------|--------------|
| 2a823963-d019-4066-aab5-fed94b5f05fd|0,0979999999813735|
| 4684e3c7-bbba-4fb4-909c-fc6d512f5555|0,07500000001164153|
| abf56697-2400-452c-a68a-e243d53bafb0|0,1159999999217689|
表scan_exchanges
在scanId
上有一个名为scan_exchanges_scanId_idx
的索引。表列:
CREATE TABLE "scans" (
"id" UUID NOT NULL,
"createdAt" TIMESTAMP NOT NULL DEFAULT 'CURRENT_TIMESTAMP',
"applicationId" UUID NOT NULL,
PRIMARY KEY ("id"),
INDEX "scans_applicationId_idx" ("applicationId")
)
CREATE TABLE "scan_exchanges" (
"id" UUID NOT NULL,
"duration" DOUBLE PRECISION NOT NULL,
"scanId" UUID NOT NULL,
PRIMARY KEY ("id"),
INDEX "scan_exchanges_scanId_idx" ("scanId"),
CONSTRAINT "scan_exchanges_scanId_fkey" FOREIGN KEY ("scanId") REFERENCES "scans" ("id") ON UPDATE CASCADE ON DELETE CASCADE
)
-- I removed some fields irrelevant to the question, tell me if I removed too much
我尝试在查询上运行ANALYZE EXPLAIN
,Postgres 13.9.0返回了截然不同的结果:
快速查询:
QUERY PLAN
GroupAggregate (cost=26.03..308595.21 rows=818 width=24) (actual time=24.336..26.808 rows=3 loops=1)
Group Key: scan_exchanges."scanId"
-> Nested Loop (cost=26.03..308330.41 rows=50915 width=24) (actual time=23.299..26.374 rows=1414 loops=1)
-> Index Scan using scans_pkey on scans (cost=0.28..1488.47 rows=71 width=16) (actual time=23.211..24.283 rows=3 loops=1)
Filter: (("applicationId" = '2ce67bbf-d740-4f4e-aaf8-33552d54e482'::uuid) AND ("createdAt" <= now()) AND ("createdAt" >= (now() - '26 days'::interval)))
Rows Removed by Filter: 1482
-> Bitmap Heap Scan on scan_exchanges (cost=25.75..4308.74 rows=1298 width=24) (actual time=0.054..0.623 rows=471 loops=3)
Recheck Cond: ("scanId" = scans.id)
Heap Blocks: exact=492
-> Bitmap Index Scan on "scan_exchanges_scanId_idx" (cost=0.00..25.43 rows=1298 width=0) (actual time=0.034..0.034 rows=509 loops=3)
Index Cond: ("scanId" = scans.id)
Planning Time: 0.393 ms
JIT:
Functions: 14
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 2.845 ms, Inlining 0.000 ms, Optimization 1.138 ms, Emission 21.245 ms, Total 25.228 ms
Execution Time: 29.853 ms
查询速度慢:
QUERY PLAN
GroupAggregate (cost=306783.39..307261.53 rows=818 width=24) (actual time=2028.021..2028.437 rows=3 loops=1)
Group Key: scan_exchanges."scanId"
-> Sort (cost=306783.39..306939.36 rows=62389 width=24) (actual time=2027.787..2027.933 rows=1414 loops=1)
Sort Key: scan_exchanges."scanId"
Sort Method: quicksort Memory: 159kB
-> Hash Join (cost=370.11..301814.42 rows=62389 width=24) (actual time=179.534..2027.085 rows=1414 loops=1)
Hash Cond: (scan_exchanges."scanId" = scans.id)
-> Seq Scan on scan_exchanges (cost=0.00..298648.46 rows=1062046 width=24) (actual time=25.341..1843.927 rows=1067166 loops=1)
-> Hash (cost=369.02..369.02 rows=87 width=16) (actual time=0.538..0.540 rows=3 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Bitmap Heap Scan on scans (cost=10.71..369.02 rows=87 width=16) (actual time=0.100..0.534 rows=3 loops=1)
Recheck Cond: ("applicationId" = '2ce67bbf-d740-4f4e-aaf8-33552d54e482'::uuid)
Filter: (("createdAt" <= now()) AND ("createdAt" >= (now() - '27 days'::interval)))
Rows Removed by Filter: 319
Heap Blocks: exact=156
-> Bitmap Index Scan on "scans_applicationId_idx" (cost=0.00..10.69 rows=322 width=0) (actual time=0.042..0.042 rows=322 loops=1)
Index Cond: ("applicationId" = '2ce67bbf-d740-4f4e-aaf8-33552d54e482'::uuid)
Planning Time: 0.689 ms
JIT:
Functions: 19
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 6.831 ms, Inlining 0.000 ms, Optimization 1.793 ms, Emission 23.113 ms, Total 31.737 ms
Execution Time: 2035.574 ms
我没有足够的经验来理解是什么让Postgres创建了两个不同的查询计划。
我该怎么做才能让Postgres使用快速查询计划,而不管scans."createdAt"
上的过滤器是什么?
1条答案
按热度按时间mftmpeh81#
询问关于查询计划器选择的“为什么?”问题可能对解决您的生产问题没有多大帮助:你不希望异常缓慢的查询,但只是有时。这是一种不可预测性,使您的应用程序似乎古怪,侵 eclipse 你的用户的信任。
查询规划器是复杂的野兽,很难解释。
如果你的表已经有了很多变化,那么对表进行真空或分析可能会有帮助。查询规划器使用的统计信息可能会不正常,特别是在一个忙碌的表上。
这里有一个建议,可以使这个查询的计划更可预测。像这样创建一个compound covering BTREE索引。
对索引进行干净的索引范围扫描就可以满足CTE的要求,这样可以使查询计划更简单。