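I am trying to find the distances between multiple cities using the distHaversine function from the geosphere package. The function takes several arguments:
The longitude and latitude of the first place. The longitude and latitude of the second place. The radius of the Earth in whatever unit you like (I am using r = 3961 for miles).
When I pass these in as vectors, it works without any trouble: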
HongKong <- c(114.17, 22.31)
GrandCanyon <- c(-112.11, 36.11)
library(geosphere)
distHaversine(HongKong, GrandCanyon, r=3961)
#[1] 7399.113 distance in miles
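For context, the number above can be checked by hand with the standard haversine great-circle formula. This is only a minimal base-R sketch: the helper name haversine_miles is hypothetical (it is not part of geosphere), and it assumes points are given as c(lon, lat) in degrees, as in the call above.

```r
# Hypothetical helper (not part of geosphere): haversine distance between
# two points given as c(lon, lat) in degrees, with radius r in miles.
haversine_miles <- function(p1, p2, r = 3961) {
  to_rad <- pi / 180
  dlon <- (p2[1] - p1[1]) * to_rad
  dlat <- (p2[2] - p1[2]) * to_rad
  lat1 <- p1[2] * to_rad
  lat2 <- p2[2] * to_rad
  # Haversine of the central angle between the two points
  a <- sin(dlat / 2)^2 + cos(lat1) * cos(lat2) * sin(dlon / 2)^2
  # min() guards against rounding pushing sqrt(a) slightly above 1
  2 * r * asin(min(1, sqrt(a)))
}

HongKong <- c(114.17, 22.31)
GrandCanyon <- c(-112.11, 36.11)
d <- haversine_miles(HongKong, GrandCanyon)
d  # should be close to the distHaversine() result above
```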
However, my actual dataset looks like this:
library(dplyr)
location1 <- tibble(person = c("Sally", "Jane", "Lisa"),
current_loc = c("Bogota Colombia", "Paris France", "Hong Kong China"),
lon = c(-74.072, 2.352, 114.169),
lat = c(4.710, 48.857, 22.319))
location2 <- tibble(destination = c("Atlanta United States", "Rome Italy", "Bangkok Thailand", "Grand Canyon United States"),
lon = c(-84.388, 12.496, 100.501, -112.113),
lat = c(33.748, 41.903, 13.756, 36.107))
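What I would like is a row for each person showing how far each destination is from that person's current location.
I know there must be a way to do this with purrr's pmap_dbl(), but I can't figure it out.
Bonus points if your code uses the tidyverse and there is a simple way to add a column identifying the nearest destination. Thanks!
In an ideal world, I would get this: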
solution <- tibble(person = c("Sally", "Jane", "Lisa"),
current_loc = c("Bogota Colombia", "Paris France", "Hong Kong China"),
lon = c(-74.072, 2.352, 114.169),
lat = c(4.710, 48.857, 22.319),
dist_Atlanta = c(1000, 2000, 7000),
dist_Rome = c(2000, 500, 3000),
dist_Bangkok = c(7000, 5000, 1000),
dist_Grand = c(1500, 4000, 7500),
nearest = c("Atlanta United States", "Rome Italy", "Bangkok Thailand"))
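Note: the numbers in the dist columns are made up; in the real result they would be the output of distHaversine(). The column names are arbitrary and do not have to be called that. Also, if the nearest column is beyond the scope of this question, I think I can figure that one out myself.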
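distHaversine matches its two point arguments up pairwise rather than computing every combination, so we need to send each combination of location1 rows and location2 rows to the function one at a time. One way to do that with sapply is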
library(geosphere)
location1[paste0("dist_", stringr::word(location2$destination))] <-
t(sapply(seq_len(nrow(location1)), function(i)
sapply(seq_len(nrow(location2)), function(j) {
distHaversine(location1[i, c("lon", "lat")], location2[j, c("lon", "lat")], r=3961)
})))
location1$nearest <- location2$destination[apply(location1[5:8], 1, which.min)]
location1
# A tibble: 3 x 9
# person current_loc lon lat dist_Atlanta dist_Rome dist_Bangkok dist_Grand nearest
# <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
#1 Sally Bogota Colombia -74.1 4.71 2114. 5828. 11114. 3246. Atlanta United States
#2 Jane Paris France 2.35 48.9 4375. 687. 5871. 5329. Rome Italy
#3 Lisa Hong Kong China 114. 22.3 8380. 5768. 1075. 7399. Bangkok Thailand
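As an alternative to the nested sapply, the same distance matrix can be built with a single distHaversine call, relying on the fact that it accepts two-column matrices and pairs their rows up. This is only a sketch: the names from, to, destinations and idx are made up for illustration, and the coordinates are copied from the question rather than read from location1 and location2.

```r
library(geosphere)

# Coordinates as plain two-column matrices (longitude first, then latitude)
from <- cbind(lon = c(-74.072, 2.352, 114.169),
              lat = c(4.710, 48.857, 22.319))
to <- cbind(lon = c(-84.388, 12.496, 100.501, -112.113),
            lat = c(33.748, 41.903, 13.756, 36.107))
destinations <- c("Atlanta United States", "Rome Italy",
                  "Bangkok Thailand", "Grand Canyon United States")

# Every (person row, destination row) combination; i varies fastest
idx <- expand.grid(i = seq_len(nrow(from)), j = seq_len(nrow(to)))

# One vectorised call: row k of the first matrix is paired with row k of the second
d <- distHaversine(from[idx$i, ], to[idx$j, ], r = 3961)

# Column-major fill puts people in rows and destinations in columns
dist_mat <- matrix(d, nrow = nrow(from))
nearest <- destinations[apply(dist_mat, 1, which.min)]
```

Here dist_mat has the same layout as the dist_ columns above, so it could be assigned to location1 in the same way.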
As you asked, I found a solution that uses the tidyverse map functions from purrr, all in a single pipeline.
library(tidyverse)
library(geosphere)
# rename the lon and lat variables in each data frame so they don't clash
location1 <- location1 %>%
rename(lon.act = lon, lat.act = lat)
location2 <- location2 %>%
rename(lon.dest = lon, lat.dest = lat)
# cross join every person with every destination, then compute the distances
merge(location1, location2, all = TRUE) %>%
group_by(person,current_loc, destination) %>%
nest() %>%
mutate( act = map(data, `[`, c("lon.act", "lat.act")) %>%
map(as.numeric),
dest = map(data, `[`, c("lon.dest", "lat.dest")) %>%
map(as.numeric),
dist = map2_dbl(act, dest, ~distHaversine(.x, .y, r = 3961))) %>%
unnest(data) %>%
group_by(person) %>%
mutate(mindis = dist == min(dist))
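Since the question mentions pmap_dbl() specifically, here is a rough sketch of how that could look, ending in the wide layout with a nearest column. Treat it as one possible approach, not the only one: the dest_lon and dest_lat column names and the long and wide objects are made up for this example, and tidyr::crossing() is used for the cross join, which sorts the rows, so the people come out in alphabetical order.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(geosphere)

location1 <- tibble(person = c("Sally", "Jane", "Lisa"),
                    current_loc = c("Bogota Colombia", "Paris France", "Hong Kong China"),
                    lon = c(-74.072, 2.352, 114.169),
                    lat = c(4.710, 48.857, 22.319))
# Destination coordinates get their own names so they don't clash with lon/lat
location2 <- tibble(destination = c("Atlanta United States", "Rome Italy",
                                    "Bangkok Thailand", "Grand Canyon United States"),
                    dest_lon = c(-84.388, 12.496, 100.501, -112.113),
                    dest_lat = c(33.748, 41.903, 13.756, 36.107))

# Long format: one row per person/destination pair
long <- crossing(location1, location2) %>%
  mutate(dist = pmap_dbl(list(lon, lat, dest_lon, dest_lat),
                         function(lon, lat, dest_lon, dest_lat) {
                           distHaversine(c(lon, lat), c(dest_lon, dest_lat), r = 3961)
                         })) %>%
  group_by(person) %>%
  mutate(nearest = destination[which.min(dist)]) %>%
  ungroup()

# Wide format: one dist_ column per destination, plus nearest
wide <- long %>%
  select(-dest_lon, -dest_lat) %>%
  pivot_wider(names_from = destination, values_from = dist, names_prefix = "dist_")
```

The dist_ columns here carry the full destination name (for example dist_Rome Italy); shorten them with rename() if you prefer the names from the question.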