postgresql 有没有可能加快这个postgress / postgis声明?

lawou6xi  于 2023-01-13  发布在  PostgreSQL
关注(0)|答案(1)|浏览(121)

我有一个安装了postgis扩展的postgres数据库,并填充了开放的街道Map数据。
使用以下SQL语句:

SELECT                                                                
    l.osm_id,
    sum(
        st_area(st_intersection(ST_Buffer(l.way, 30), p.way))
        /
        st_area(ST_Buffer(l.way, 30))
    ) as green_fraction
FROM planet_osm_line AS l
INNER JOIN planet_osm_polygon AS p ON ST_Intersects(l.way, ST_Buffer(p.way,30))
WHERE p.natural in ('water') or p.landuse in ('forest') GROUP BY l.osm_id;

我计算"绿色"分数。
我的目标是为每个osm_id创建一个"绿色"分数。
意思是一条路有多少部分靠近水源、森林或类似的地方。
为此:
我在每条路周围创建了一个30米的缓冲区,并计算该缓冲区与附近任何绿色要素之间的交点。
我使用"绿色要素"来引用OpenStreetMap数据库中的多边形,比如公园。
有可能加快计算吗?
有一件事我id是创建2个指数,希望加快计算:

CREATE INDEX way_index_2 on planet_osm_polygon USING gist(way) WHERE "natural" IN ('water','wood','forest','hill','valley');
CREATE INDEX way_index_3 on planet_osm_polygon USING gist(way) WHERE "landuse" IN ('forest');

以下是对此声明的"解释":

EXPLAIN (ANALYZE, BUFFERS) SELECT                                                                
    l.osm_id,
    sum(
        st_area(st_intersection(ST_Buffer(l.way, 30), p.way))
        /
        st_area(ST_Buffer(l.way, 30))
    ) as green_fraction
FROM planet_osm_line AS l
INNER JOIN planet_osm_polygon AS p ON ST_Intersects(l.way, ST_Buffer(p.way,30))
WHERE p.natural in ('water') or p.landuse in ('forest') GROUP BY l.osm_id limit 1;
                                                                                       QUERY PLAN                                                                                       
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=17816.83..133789235.22 rows=1 width=16) (actual time=1575643.737..1575651.862 rows=1 loops=1)
   Buffers: shared hit=816315 read=221878, temp read=313661 written=73938
   ->  GroupAggregate  (cost=17816.83..1435958589062981.00 rows=10734420 width=16) (actual time=1575643.723..1575651.847 rows=1 loops=1)
         Group Key: l.osm_id
         Buffers: shared hit=816315 read=221878, temp read=313661 written=73938
         ->  Nested Loop  (cost=17816.83..1433802261939271.50 rows=28652652777 width=448) (actual time=978502.788..1575648.857 rows=8 loops=1)
               Join Filter: st_intersects(l.way, st_buffer(p.way, '30'::double precision, ''::text))
               Rows Removed by Join Filter: 6528525
               Buffers: shared hit=816315 read=221878, temp read=313661 written=73938
               ->  Index Scan using osm_id_idx on planet_osm_line l  (cost=0.44..1242021.57 rows=22671610 width=247) (actual time=5.963..6.181 rows=6 loops=1)
                     Buffers: shared hit=5 read=3
               ->  Materialize  (cost=17816.39..1445364.98 rows=1263812 width=201) (actual time=85.181..4605.942 rows=1088089 loops=6)
                     Buffers: shared hit=380236 read=220348, temp read=313661 written=73938
                     ->  Gather  (cost=17816.39..1403253.92 rows=1263812 width=201) (actual time=510.609..1066.182 rows=1250378 loops=1)
                           Workers Planned: 4
                           Workers Launched: 4
                           Buffers: shared hit=380236 read=220348
                           ->  Parallel Bitmap Heap Scan on planet_osm_polygon p  (cost=16816.39..1275872.72 rows=315953 width=201) (actual time=447.168..9410.838 rows=250076 loops=5)
                                 Recheck Cond: (("natural" = ANY ('{water,wood,forest,hill,valley}'::text[])) OR (landuse = 'forest'::text))
                                 Rows Removed by Index Recheck: 2554266
                                 Filter: (("natural" = 'water'::text) OR (landuse = 'forest'::text))
                                 Rows Removed by Filter: 53217
                                 Heap Blocks: lossy=1
                                 Buffers: shared hit=380236 read=220348
                                 ->  BitmapOr  (cost=16816.39..16816.39 rows=1554297 width=0) (actual time=491.891..491.893 rows=0 loops=1)
                                       Buffers: shared hit=7797
                                       ->  Bitmap Index Scan on way_index_2  (cost=0.00..7822.79 rows=750359 width=0) (actual time=413.690..413.690 rows=737741 loops=1)
                                             Buffers: shared hit=3758
                                       ->  Bitmap Index Scan on way_index_3  (cost=0.00..8361.69 rows=803938 width=0) (actual time=78.198..78.198 rows=783702 loops=1)
                                             Buffers: shared hit=4039
 Planning Time: 0.315 ms
 Execution Time: 1575673.609 ms
(32 rows)
    • 更新日期:**

下面是这两个表的简短模式;

Table 

"public.planet_osm_line"
       Column       |           Type            | Collation | Nullable | Default | Storage  | Compression | Stats target | Description 
--------------------+---------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
 osm_id             | bigint                    |           |          |         | plain    |             |              | 
 access             | text                      |           |          |         | extended |             |              | 
 addr:housename     | text                      |           |          |         | extended |             |              | 
 addr:housenumber   | text                      |           |          |         | extended |             |              | 
 addr:interpolation | text                      |           |          |         | extended |             |              | 
 admin_level        | text                      |           |          |         | extended |             |              | 
....
natural            | text                      |           |          |         | extended |             |              | 


Indexes:
    "highway_idx" btree (highway)
    "motorway_idx" gist (way) WHERE highway = 'motorway'::text
    "motorway_trunk_primary_secondary_tertiary_unclassified_idx" gist (way) WHERE highway = ANY (ARRAY['motorway'::text, 'trunk'::text, 'primary'::text, 'secondary'::text, 'tertiary'::text, 'unclassified'::text])
    "name_idx" btree (name)
    "osm_id_idx" btree (osm_id)
    "planet_osm_line_osm_id_idx" btree (osm_id)
    "planet_osm_line_way_idx" gist (way)
    "primary_idx" gist (way) WHERE highway = 'primary'::text
    "primary_secondary_idx" gist (way) WHERE highway = ANY (ARRAY['primary'::text, 'secondary'::text])
    "primary_secondary_tertiary_idx" gist (way) WHERE highway = ANY (ARRAY['primary'::text, 'secondary'::text, 'tertiary'::text])
    "primary_secondary_tertiary_unclassified_idx" gist (way) WHERE highway = ANY (ARRAY['primary'::text, 'secondary'::text, 'tertiary'::text, 'unclassified'::text])
    "secondary_idx" gist (way) WHERE highway = 'secondary'::text
    "secondary_tertiary_idx" gist (way) WHERE highway = ANY (ARRAY['secondary'::text, 'tertiary'::text])
    "tertiary_idx" gist (way) WHERE highway = 'tertiary'::text
    "tertiary_secondary_idx" gist (way) WHERE highway = ANY (ARRAY['tertiary'::text, 'unclassified'::text])
    "trunk_idx" gist (way) WHERE highway = 'trunk'::text
    "trunk_primary_secondary_tertiary_unclassified_idx" gist (way) WHERE highway = ANY (ARRAY['trunk'::text, 'primary'::text, 'secondary'::text, 'tertiary'::text, 'unclassified'::text])
    "unclassified_idx" gist (way) WHERE highway = 'unclassified'::text
    "way_idx" gist (way)
    "way_index_1" gist (way)
    "way_index_4" gist (way) WHERE "natural" = ANY (ARRAY['water'::text, 'wood'::text, 'forest'::text, 'hill'::text, 'valley'::text])
    "way_index_5" gist (way) WHERE landuse = 'forest'::text
Triggers:
    planet_osm_line_osm2pgsql_valid BEFORE INSERT OR UPDATE ON planet_osm_line FOR EACH ROW EXECUTE FUNCTION planet_osm_line_osm2pgsql_valid()
Access method: heap

以及:

Table "public.planet_osm_polygon"
       Column       |          Type           | Collation | Nullable | Default | Storage  | Compression | Stats target | Description 
--------------------+-------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
 osm_id             | bigint                  |           |          |         | plain    |             |              | 
 access             | text                    |           |          |         | extended |             |              | 
 addr:housename     | text                    |           |          |         | extended |             |              | 
 addr:housenumber   | text                    |           |          |         | extended |             |              | 
 addr:interpolation | text                    |           |          |         | extended |             |              | 
 admin_level        | text                    |           |          |         | extended |             |              | 
 aerialway          | text                    |           |          |         | extended |             |              | 
 aeroway            | text                    |           |          |         | extended |             |              | 
 amenity            | text                    |           |          |         | extended |             |              | 
 area               | text                    |           |          |         | extended |             |              | 
 barrier            | text                    |           |          |         | extended |             |              | 
landuse            | text                    |           |          |         | extended |             |              | 

Indexes:
    "fuel_toilet_parking_restaurant_idex" gist (way) WHERE amenity = ANY (ARRAY['fuel'::text, 'toilets'::text, 'parking'::text, 'restaurant'::text, 'cafe'::text, 'pub'::text, 'ice_cream'::text, 'biergarten'::text])
    "planet_osm_polygon_osm_id_idx" btree (osm_id)
    "planet_osm_polygon_way_idx" gist (way)
    "viewpoint_attraction_guest_house_idex" gist (way) WHERE tourism = ANY (ARRAY['viewpoint '::text, 'attraction'::text, 'guest_house'::text])
    "way_index_2" gist (way) WHERE "natural" = ANY (ARRAY['water'::text, 'wood'::text, 'forest'::text, 'hill'::text, 'valley'::text])
    "way_index_3" gist (way) WHERE landuse = 'forest'::text
Triggers:
    planet_osm_polygon_osm2pgsql_valid BEFORE INSERT OR UPDATE ON planet_osm_polygon FOR EACH ROW EXECUTE FUNCTION planet_osm_polygon_osm2pgsql_valid()
Access method: heap
    • 更新**:
    • 行星_osm_多边形**
wood               | text                    |           |          |                                                     | extended |             |              | 
 z_order            | integer                 |           |          |                                                     | plain    |             |              | 
 way_area           | real                    |           |          |                                                     | plain    |             |              | 
 way                | geometry(Geometry,3857) |           |          |                                                     | main     |             |              | 
 way_buffer_30      | geometry(Polygon)       |           |          |                                                     | external |             |              | 
 way_buffer_30_area | numeric                 |           |          | generated always as (st_area(way_buffer_30)) stored | main     |             |              |

以及

    • 行星osm线**
way_area           | real                      |           |          |         | plain    |             |              | 
 way                | geometry(LineString,3857) |           |          |         | main     |             |              |
3qpi33ja

3qpi33ja1#

不是一个真正的答案,因为一堆评论太长的评论格式:

alter system set default_toast_compression=lz4;

alter table planet_osm_polygon 
    add column way_buffer_30 geometry, 
    alter column way_buffer_30 set storage external,
    add column way_buffer_30_area numeric 
        generated always as (st_area(way_buffer_30)) stored;
update planet_osm_polygon 
    set way_buffer_30=st_buffer(way,30,'quad_segs=1');
create index osmp_way_buffer_30_gix on planet_osm_polygon 
    using gist(way_buffer_30) with (fillfactor=100);
cluster verbose planet_osm_polygon using osmp_way_buffer_30_gix;

alter table planet_osm_line
    add column way_buffer_30 geometry, 
    alter column way_buffer_30 set storage external,
    add column way_buffer_30_area numeric 
        generated always as (st_area(way_buffer_30)) stored; 
update planet_osm_line
    set way_buffer_30=st_buffer(way,30,'quad_segs=1');
create index osml_way_buffer_30_gix on planet_osm_line
    using gist(way_buffer_30) with (fillfactor=100);
cluster verbose planet_osm_polygon using osml_way_buffer_30_gix;

SELECT                                                                
    l.osm_id,
    sum(st_area(st_intersection(l.way_buffer_30, p.way))
        / l.way_buffer_30_area
    ) as green_fraction
FROM planet_osm_line AS l
INNER JOIN planet_osm_polygon AS p ON ST_Intersects(l.way, p.way_buffer_30)
WHERE p.natural in ('water') or p.landuse in ('forest') GROUP BY l.osm_id;
  1. default_toast_compression=lz4您正在处理多边形,这些多边形可能会被压缩和烤面包。默认的default_toast_compression=pglz通常是slower而不是lz4。注意,您需要在更改设置后强制重写这些表/列,否则它将仅从该点起作为默认值应用,不会影响任何东西。
    1.如果形状足够大/复杂,storage external在检索数据时跳过一个步骤。根据您的PostGIS版本,这可能是值得的。
  2. st_buffer(way,30,'quad_segs=1')你可以调整第三个参数来得到一个更简单的形状,这样更容易比较。默认quad_segs=8会导致你的缓冲区比输入多8倍的顶点。
  3. with (fillfactor=100)我假设它是一个静态源表,当新版本发布时,您会替换它,而不是自己维护当前的表,因此,索引也可以设置为静态的(默认的fillfactor=90是为了考虑插入到表中的新行)。
  4. cluster将表页与索引对齐,从而加快堆读取速度。
    1.您要使用的最终缓冲区形状上的索引将优于您计划使用ST_Buffer()创建缓冲区形状时使用的列上的索引。ST_Transform()也是如此-在索引之前将其应用于列,而不是在查询中。
  5. ST_Subdivide all the Things .
    1.您可以测试ST_DWithin()与当前代码和上面建议的代码相比是否加快了连接速度。请确保使用合适的索引进行测试。请记住,距离单位取决于列SRID -如果使用公制,则会从英制/航海/度的系统中得到异常。
INNER JOIN planet_osm_polygon AS p ON ST_DWithin(l.way,p.way,30)

1.您可以检查partial index是否值得,并查看如果为了查询

create index way_buffer_30_gix on planet_osm_polygon 
using gist(way_buffer_30) with (fillfactor=100)
where natural='water' or landuse='forest';

而缩小范围,它的性能是否会更好
1.只有当这是一个一次性的操作,或者如果您可以克隆目标表,这样您就不必处理其他索引和设置,而这些索引和设置应该可以提高您在该表上更频繁地执行的操作的性能,您还可以使用where子句过滤掉您在此阶段不感兴趣的内容。
1.如果它真的很慢,从两个表中提取行的子集并运行测试,为了得到更可靠的结果,使用pgbench来运行基准测试。

相关问题