如何创建(使用R的sf)一个空间数据框架,其几何列是其他两个数据框架的几何列之间的差异?

nle07wnf  于 2023-11-14  发布在  其他
关注(0)|答案(1)|浏览(106)

我有两个空间数据框,我已经使用R中的sf包加载。在第一个数据框中,我有(61)个美国县,而在第二个数据框中,我有这些(61)个县的一个子部分。两个数据框具有相同的标识列和这些标识列的相同值,因此对于第一个数据框中的每个县,有一个,而且只有一个,我的目标是:我想获得第三个空间数据框架,其几何列是县与其子部分之间的差异。
我尝试使用以下代码实现上述想法:

library(tidyverse)
library(magrittr)
library(sf)

# Loading shapefiles
dfc <- st_read('data/counties.shp') %>%      # Shapefile with (61) US counties
  arrange(., st, cty)                        # Columns that uniquely identify observations
dfp <- st_read('data/county_parts.shp') %>%  # Shapefile with (61) US county parts
  arrange(., st, cty)                        # Columns that uniquely identify observations

# Geometry difference between each corresponding pair of observations
geom <- map(seq(nrow(dfc)), function(r) 
            st_difference(dfc[r, 'geo_cty'], dfp[r, 'geo_sub'])

字符串
(In在上面的代码中,geo_cty是县几何的列,geo_sub是县几何的子部分。一旦按stcty排序,dfcdfp中的每行r都具有相同的(st, cty)值,因此行geo_cty覆盖了geo_sub。)
上面的代码运行得很顺利,但是我没有成功地创建一个空间数据框架,其几何列是我创建的geom列表。例如,我尝试运行以下代码:

df <- bind_cols(dfc, select(dfp, geo_sub), geom)


我得到了以下错误消息:

Error in `stop_vctrs()`:
! Can't recycle `..1` (size 61) to match `..58` (size 0).
Run `rlang::last_error()` to see where the error occurred.


如何创建所需的空间数据框架?实现我所解释的想法的最佳方法是什么?

kupeojn6

kupeojn61#

我更愿意在这里使用join来确保几何体正确对齐,这也可以处理数据集长度因任何原因而不同的情况(以下示例生成了这样的输入)。Join创建了一个sf对象,其中包含2个几何体列,可以方便地使用map2() + st_difference()进行变异:

library(sf)
#> Linking to GEOS 3.11.2, GDAL 3.6.2, PROJ 9.2.0; sf_use_s2() is TRUE
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)
library(patchwork)

### prepare reprex
# sf example dataset
nc <- 
  read_sf(system.file("shape/nc.shp", package="sf")) |> 
  select(CNTY_ID, NAME)

# reduce the sample to Johnston and neighbouring polygons
nc <- st_filter(nc, st_buffer(nc[nc$NAME == "Johnston",], 10))

# generate "subregions", buffers around centroids;
# drop few records (20%) to simulte a partially matching datasest
set.seed(123)
nc_sub <- nc |>
  select(-NAME) |>
  st_centroid() |>
  st_buffer(10000) |>
  slice_sample(prop = .8)

# -------------------------------------------------------------------------
# join datasets, the other sf object (left_join y arg) must be converted to 
# tibble/data.frame first; result is an sf object with 2 geometry columns 
nc <- nc_sub |>
  st_set_geometry("geo_sub") |>
  as_tibble() |>
  left_join(x = nc, y = _, by = "CNTY_ID")

# find difference between geom columns, 
# store results in 3rd geomerty column (geom_diff)
nc <- nc |>
  mutate(geo_diff = purrr::map2(geometry, geo_sub, st_difference) |> 
                    st_sfc(crs = st_crs(geometry)))

# we can change active geometry and drop other geometry columns to  get a 
# regular single-geometry sf object, if needed:
nc_diff <- st_set_geometry(nc, "geo_diff") |> select(!starts_with("geo"))

字符串
生成的sf对象:

nc_diff
#> Simple feature collection with 8 features and 2 fields
#> Geometry type: GEOMETRY
#> Dimension:     XY
#> Bounding box:  xmin: -79.21665 ymin: 34.55375 xmax: -77.67122 ymax: 36.26004
#> Geodetic CRS:  NAD27
#> # A tibble: 8 × 3
#>   CNTY_ID NAME                                                          geo_diff
#>     <dbl> <chr>                                                   <GEOMETRY [°]>
#> 1    1897 Franklin MULTIPOLYGON (((-78.25455 35.81553, -78.26685 35.84838, -78.…
#> 2    1913 Nash     POLYGON ((-78.20562 35.7254, -78.21155 35.73819, -78.23405 3…
#> 3    1938 Wake     POLYGON ((-78.99881 35.60132, -78.93889 35.76144, -78.94444 …
#> 4    1979 Wilson   POLYGON ((-78.1249 35.59752, -78.16245 35.68298, -78.1618 35…
#> 5    1989 Johnston MULTIPOLYGON (((-78.53874 35.31512, -78.53947 35.33646, -78.…
#> 6    2029 Wayne    POLYGON ((-78.16518 35.19322, -78.2574 35.22095, -78.31013 3…
#> 7    2030 Harnett  POLYGON ((-78.71606 35.25998, -78.81239 35.25872, -78.87457 …
#> 8    2083 Sampson  POLYGON ((-78.11374 34.69918, -78.15676 34.67715, -78.25681 …


可视化:

g1 <- ggplot(nc) + 
  geom_sf() +
  ggtitle("regions") +
  theme_bw()

g2 <- ggplot() + 
  geom_sf(data = st_union(nc), fill = NA) +
  geom_sf(data = nc_sub) +
  ggtitle("sub-regions") +
  theme_bw() +
  theme(axis.text.y = element_blank())

g3 <- ggplot(nc_diff) + 
  geom_sf() +
  ggtitle("st_difference()") +
  theme_bw() +
  theme(axis.text.y = element_blank())

wrap_plots(g1,g2,g3)


x1c 0d1x的数据
输入数据集,nc.shp的一个小子集:

nc
#> Simple feature collection with 8 features and 2 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -79.21665 ymin: 34.55375 xmax: -77.67122 ymax: 36.26004
#> Geodetic CRS:  NAD27
#> # A tibble: 8 × 3
#>   CNTY_ID NAME                                                          geometry
#> *   <dbl> <chr>                                               <MULTIPOLYGON [°]>
#> 1    1897 Franklin (((-78.25455 35.81553, -78.26685 35.84838, -78.30841 35.8934…
#> 2    1913 Nash     (((-78.18693 35.72511, -78.20562 35.7254, -78.21155 35.73819…
#> 3    1938 Wake     (((-78.92107 35.57886, -78.99881 35.60132, -78.93889 35.7614…
#> 4    1979 Wilson   (((-78.06533 35.58204, -78.1249 35.59752, -78.16245 35.68298…
#> 5    1989 Johnston (((-78.53874 35.31512, -78.53947 35.33646, -78.60082 35.4030…
#> 6    2029 Wayne    (((-78.16319 35.18229, -78.16518 35.19322, -78.2574 35.22095…
#> 7    2030 Harnett  (((-78.61274 35.24383, -78.71606 35.25998, -78.81239 35.2587…
#> 8    2083 Sampson  (((-78.11377 34.72099, -78.11374 34.69918, -78.15676 34.6771…


连接后的nc(注意缺少geo_sub记录)和st_difference(),3个几何列:

nc
#> Simple feature collection with 8 features and 2 fields
#> Active geometry column: geometry
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -79.21665 ymin: 34.55375 xmax: -77.67122 ymax: 36.26004
#> Geodetic CRS:  NAD27
#> # A tibble: 8 × 5
#>   CNTY_ID NAME                                geometry                   geo_sub
#> *   <dbl> <chr>                     <MULTIPOLYGON [°]>             <POLYGON [°]>
#> 1    1897 Franklin (((-78.25455 35.81553, -78.26685 3…                     EMPTY
#> 2    1913 Nash     (((-78.18693 35.72511, -78.20562 3… ((-77.92674 35.88657, -7…
#> 3    1938 Wake     (((-78.92107 35.57886, -78.99881 3… ((-78.65973 35.87538, -7…
#> 4    1979 Wilson   (((-78.06533 35.58204, -78.1249 35… ((-77.83458 35.6446, -77…
#> 5    1989 Johnston (((-78.53874 35.31512, -78.53947 3…                     EMPTY
#> 6    2029 Wayne    (((-78.16319 35.18229, -78.16518 3… ((-77.89675 35.34607, -7…
#> 7    2030 Harnett  (((-78.61274 35.24383, -78.71606 3… ((-78.93731 35.44001, -7…
#> 8    2083 Sampson  (((-78.11377 34.72099, -78.11374 3… ((-78.29317 35.05105, -7…
#> # ℹ 1 more variable: geo_diff <GEOMETRY [°]>


创建于2023-11-01使用reprex v2.0.2

相关问题