R: How to geocode a simple address using the Data Science Toolbox

Posted by f8rj6qna on 2022-12-06 in Other

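I got tired of Google's geocoding and decided to try an alternative. The Data Science Toolkit (http://www.datasciencetoolkit.org) lets you geocode an unlimited number of addresses. R has a nice package that wraps its functions (CRAN: RDSTK). The package has a function called street2coordinates() that interfaces with the Data Science Toolkit's geocoding utility.
However, street2coordinates() does not work if you try to geocode something as simple as City, Country. In the following example, I try to use the function to get the latitude and longitude of the city of Phoenix: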

> require("RDSTK")
> street2coordinates("Phoenix+Arizona+United+States")
[1] full.address
<0 rows> (or 0-length row.names)

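The utility on the Data Science Toolkit site itself works perfectly. Here is the URL request that returns the answer: http://www.datasciencetoolkit.org/maps/api/geocode/json?sensor=false&address=Phoenix+Arizona+United+States
I am interested in geocoding multiple addresses, some of them full street addresses and some just city names. I know the Data Science Toolkit URL handles both well.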
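For reference, a request URL of the form shown above can be assembled from an address string in base R. This is a minimal sketch: `dstk_geocode_url` is a hypothetical helper name, and it percent-encodes spaces and commas rather than joining words with `+` as in the hand-written URL. That the service treats both encodings the same way is an assumption.

```r
# Hypothetical helper: build a "Google-style geocoder" request URL for one address.
dstk_geocode_url <- function(address,
                             base = "http://www.datasciencetoolkit.org/maps/api/geocode/json") {
  # URLencode(reserved = TRUE) percent-encodes spaces and commas in the address
  paste0(base, "?sensor=false&address=", utils::URLencode(address, reserved = TRUE))
}

addresses <- c("Phoenix, Arizona, United States",
               "Tucson, Arizona, United States")

# One URL per address; vapply guarantees a character vector of the same length
urls <- vapply(addresses, dstk_geocode_url, character(1), USE.NAMES = FALSE)
print(urls)
```

Nothing is sent over the network here; the sketch only constructs the strings that a GET request would use.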

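How can I build the URL for each address and collect the resulting latitudes and longitudes, together with the addresses, in a data frame?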

Here is a sample dataset:

dff <- data.frame(address=c(
  "Birmingham, Alabama, United States",
  "Mobile, Alabama, United States",
  "Phoenix, Arizona, United States",
  "Tucson, Arizona, United States",
  "Little Rock, Arkansas, United States",
  "Berkeley, California, United States",
  "Duarte, California, United States",
  "Encinitas, California, United States",
  "La Jolla, California, United States",
  "Los Angeles, California, United States",
  "Orange, California, United States",
  "Redwood City, California, United States",
  "Sacramento, California, United States",
  "San Francisco, California, United States",
  "Stanford, California, United States",
  "Hartford, Connecticut, United States",
  "New Haven, Connecticut, United States"
  ))

Answer #1 (by r6hnlfcb)

Like this:

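library(httr)
library(rjson)

# Build a JSON array of address strings: ["addr1","addr2",...]
data <- paste0("[",paste(paste0("\"",dff$address,"\""),collapse=","),"]")
url  <- "http://www.datasciencetoolkit.org/street2coordinates"
response <- POST(url,body=data)
# Read the response body as text (`as` selects the return form), then parse it
json     <- fromJSON(content(response,as="text"))
# One row per address that returned a location. lapply always yields a list,
# so rbind works whether or not some entries are NULL.
geocode  <- do.call(rbind,lapply(json,
                                 function(x) c(long=x$longitude,lat=x$latitude)))
geocode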
#                                                long      lat
# San Francisco, California, United States -117.88536 35.18713
# Mobile, Alabama, United States            -88.10318 30.70114
# La Jolla, California, United States      -117.87645 33.85751
# Duarte, California, United States        -118.29866 33.78659
# Little Rock, Arkansas, United States      -91.20736 33.60892
# Tucson, Arizona, United States           -110.97087 32.21798
# Redwood City, California, United States  -117.88536 35.18713
# New Haven, Connecticut, United States     -72.92751 41.36571
# Berkeley, California, United States      -122.29673 37.86058
# Hartford, Connecticut, United States      -72.76356 41.78516
# Sacramento, California, United States    -121.55541 38.38046
# Encinitas, California, United States     -116.84605 33.01693
# Birmingham, Alabama, United States        -86.80190 33.45641
# Stanford, California, United States      -122.16750 37.42509
# Orange, California, United States        -117.85311 33.78780
# Los Angeles, California, United States   -117.88536 35.18713

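This uses the POST interface of the street2coordinates API (described in the API documentation), which returns all results in a single request instead of requiring multiple GET requests.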
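To make the shape of that single response concrete, here is a small offline sketch that parses a mock reply into a data frame. The `mock` string is an assumption about the response layout (an object keyed by address, with `latitude` and `longitude` fields and `null` for an address with no match), not captured output. Looking results up by the requested address also keeps the rows in input order and leaves an `NA` row for anything that did not geocode.

```r
library(rjson)

# Assumed response layout, for illustration only (not real server output)
mock <- paste0(
  '{"Mobile, Alabama, United States": {"latitude": 30.701142, "longitude": -88.103184},',
  ' "Phoenix, Arizona, United States": null}'
)
parsed <- fromJSON(mock)

requested <- c("Mobile, Alabama, United States",
               "Phoenix, Arizona, United States")

# Build one row per requested address; missing or null entries become NA
to_row <- function(addr) {
  x <- parsed[[addr]]   # NULL if the address is absent or was returned as null
  data.frame(address = addr,
             long    = if (is.null(x)) NA_real_ else x$longitude,
             lat     = if (is.null(x)) NA_real_ else x$latitude,
             stringsAsFactors = FALSE)
}
geocoded <- do.call(rbind, lapply(requested, to_row))
print(geocoded)
```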
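The missing Phoenix row appears to be a bug in the street2coordinates API. If you go to the API demo page and try "Phoenix, Arizona, United States", you get an empty response. However, as your example shows, their "Google-style Geocoder" does return a result for Phoenix. So here is a solution that uses repeated GET requests. Note that this runs much more slowly.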

library(httr)
library(rjson)

geo.dsk <- function(addr){ # geocode a single address with the Data Science Toolkit
  url      <- "http://www.datasciencetoolkit.org/maps/api/geocode/json"
  response <- GET(url,query=list(sensor="false",address=addr))
  json <- fromJSON(content(response,as="text"))
  loc  <- json$results[[1]]$geometry$location  # first match only
  # c() coerces everything to character; apply as.numeric() to long/lat if needed
  return(c(address=addr,long=loc$lng, lat= loc$lat))
}
# One GET request per address, stacked row-wise
result <- do.call(rbind,lapply(as.character(dff$address),geo.dsk))
result <- data.frame(result, stringsAsFactors=FALSE)
result
#                                     address         long        lat
# 1        Birmingham, Alabama, United States   -86.801904  33.456412
# 2            Mobile, Alabama, United States   -88.103184  30.701142
# 3           Phoenix, Arizona, United States -112.0733333 33.4483333
# 4            Tucson, Arizona, United States  -110.970869  32.217975
# 5      Little Rock, Arkansas, United States   -91.207356  33.608922
# 6       Berkeley, California, United States   -122.29673  37.860576
# 7         Duarte, California, United States  -118.298662  33.786594
# 8      Encinitas, California, United States  -116.846046  33.016928
# 9       La Jolla, California, United States  -117.876447  33.857515
# 10   Los Angeles, California, United States  -117.885359  35.187133
# 11        Orange, California, United States  -117.853112  33.787795
# 12  Redwood City, California, United States  -117.885359  35.187133
# 13    Sacramento, California, United States  -121.555406  38.380456
# 14 San Francisco, California, United States  -117.885359  35.187133
# 15      Stanford, California, United States    -122.1675   37.42509
# 16     Hartford, Connecticut, United States   -72.763564   41.78516
# 17    New Haven, Connecticut, United States   -72.927507  41.365709
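
Because the loop above issues one request per address, a single failed lookup would abort the whole run. A hedged sketch of one way around that follows: `safe_geocode` is a hypothetical wrapper, and `fake_geocoder` is a stand-in that performs no network request and only mimics a geocoder returning `lng` and `lat`. The wrapper also returns numeric columns instead of character ones.

```r
# Hypothetical wrapper: call `geocoder(addr)` and fall back to NA on any error.
safe_geocode <- function(addr, geocoder) {
  loc <- tryCatch(geocoder(addr), error = function(e) NULL)
  if (is.null(loc)) loc <- list(lng = NA_real_, lat = NA_real_)
  data.frame(address = addr,
             long    = as.numeric(loc$lng),
             lat     = as.numeric(loc$lat),
             stringsAsFactors = FALSE)
}

# Stand-in geocoder for demonstration only; it never touches the network.
fake_geocoder <- function(addr) {
  if (grepl("Unknown", addr)) stop("no result")
  list(lng = -110.970869, lat = 32.217975)
}

addrs <- c("Tucson, Arizona, United States", "Unknown place")
result_safe <- do.call(rbind, lapply(addrs, safe_geocode, geocoder = fake_geocoder))
print(result_safe)
```

In real use, the `geocoder` argument would be a function that performs the GET request and returns the `location` element of the first result.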

Answer #2 (by 8hhllhi2)

The ggmap package supports geocoding with either Google or the Data Science Toolkit, the latter through its "Google-style geocoder".

library(ggmap)
result <- geocode(as.character(dff[[1]]), source = "dsk")
print(cbind(dff, result))
#                                     address        lon      lat
# 1        Birmingham, Alabama, United States  -86.80190 33.45641
# 2            Mobile, Alabama, United States  -88.10318 30.70114
# 3           Phoenix, Arizona, United States -112.07404 33.44838
# 4            Tucson, Arizona, United States -110.97087 32.21798
# 5      Little Rock, Arkansas, United States  -91.20736 33.60892
# 6       Berkeley, California, United States -122.29673 37.86058
# 7         Duarte, California, United States -118.29866 33.78659
# 8      Encinitas, California, United States -116.84605 33.01693
# 9       La Jolla, California, United States -117.87645 33.85751
# 10   Los Angeles, California, United States -117.88536 35.18713
# 11        Orange, California, United States -117.85311 33.78780
# 12  Redwood City, California, United States -117.88536 35.18713
# 13    Sacramento, California, United States -121.55541 38.38046
# 14 San Francisco, California, United States -117.88536 35.18713
# 15      Stanford, California, United States -122.16750 37.42509
# 16     Hartford, Connecticut, United States  -72.76356 41.78516
# 17    New Haven, Connecticut, United States  -72.92751 41.36571
