循环需要长时间运行吗?
for (i in 1:nrow(petrolStations)) {
k<-i+1
if(k<=nrow(petrolStations)) {
for(j in k:nrow(petrolStations)) {
distancesToStation[i,j] <- ,
as.data.frame(a s.numeric(distm(petrolStations[i, c("lon", "lat")],
petrolStations[j, c("lon", "lat")], fun = distHaversine)/1000))}
}}
我将使用自己的示例数据:
set.seed(2)
y <- data.frame(lon = rnorm(10, mean = -114.4069597, sd = 0.0001),
lat = rnorm(10, mean = 43.660648, sd = 0.0002) )
我猜想您执行双循环的理由是,您不会两次计算每个距离。如果通常使用基本dist
函数,则它提供了低三角形的输出,而不是计算上层三角形。下面的方法模仿了此行为。
nr <- nrow(y)
out <- sapply(seq_len(nr), function(i) {
if (i == nr) return(c(rep(NA_real_, i - 1), 0))
c(rep(NA_real_, i - 1), 0,
geosphere::distHaversine(y[i,,drop = FALSE],
y[(i+1):nr,,drop = FALSE]))
})
out
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] 0.000 NA NA NA NA NA NA NA NA NA
# [2,] 15.285 0.000 NA NA NA NA NA NA NA NA
# [3,] 26.943 32.620 0.00 NA NA NA NA NA NA NA
# [4,] 32.500 46.234 26.20 0.00 NA NA NA NA NA NA
# [5,] 31.085 17.949 50.25 63.39 0.00 NA NA NA NA NA
# [6,] 61.315 73.312 44.29 30.08 91.15 0.00 NA NA NA NA
# [7,] 16.503 4.798 29.18 45.20 21.10 71.17 0.00 NA NA NA
# [8,] 10.014 21.336 17.54 25.00 38.90 52.34 20.26 0.000 NA NA
# [9,] 26.722 14.509 31.46 52.13 23.87 75.49 10.71 28.178 0.00 NA
# [10,] 6.114 12.508 23.04 33.73 30.06 61.12 12.05 8.864 21.43 0
任意验证:
geosphere::distHaversine(y[8,], y[2,])
# [1] 21.33617
这比您的代码更快,因为它在矢量化计算上大写:geosphere::distHaversine
可以一次计算多个距离:
- 点之间(如果缺少第二个参数(;
- 在
p1
中的所有点之间,p2
中的相应点(p1
和p2
都具有相同的行数(;或 - 正如我上面所做的那样,一个点在许多方面。
c(rep(NA_real_, i - 1), 0, ...)
是为了确保上层三角形为NA
,对角为0。第一个条件(i==nr
(是一个作弊,以确保我们有一个方形矩阵,最后一列是ALL-NA
和0。
如果您还需要填充的上层三角形:
out[upper.tri(out)] <- t(out)[upper.tri(out)]
out
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] 0.000 15.285 26.94 32.50 31.08 61.31 16.503 10.014 26.72 6.114
# [2,] 15.285 0.000 32.62 46.23 17.95 73.31 4.798 21.336 14.51 12.508
# [3,] 26.943 32.620 0.00 26.20 50.25 44.29 29.178 17.539 31.46 23.037
# [4,] 32.500 46.234 26.20 0.00 63.39 30.08 45.201 24.996 52.13 33.730
# [5,] 31.085 17.949 50.25 63.39 0.00 91.15 21.096 38.903 23.87 30.059
# [6,] 61.315 73.312 44.29 30.08 91.15 0.00 71.166 52.336 75.49 61.116
# [7,] 16.503 4.798 29.18 45.20 21.10 71.17 0.000 20.257 10.71 12.052
# [8,] 10.014 21.336 17.54 25.00 38.90 52.34 20.257 0.000 28.18 8.864
# [9,] 26.722 14.509 31.46 52.13 23.87 75.49 10.706 28.178 0.00 21.435
# [10,] 6.114 12.508 23.04 33.73 30.06 61.12 12.052 8.864 21.43 0.000