R - "Is there a faster alternative for this for loop, where I need to multiply each row once with



This loop takes a long time to run:

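library(geosphere)  # provides distm() and distHaversine()

for (i in 1:nrow(petrolStations)) {
  k <- i + 1
  if (k <= nrow(petrolStations)) {
    for (j in k:nrow(petrolStations)) {
      # distance between stations i and j; distHaversine returns metres, so /1000 gives km
      distancesToStation[i, j] <-
        as.data.frame(as.numeric(distm(petrolStations[i, c("lon", "lat")],
                                       petrolStations[j, c("lon", "lat")],
                                       fun = distHaversine) / 1000))
    }
  }
}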

I'll use my own sample data:

set.seed(2)
y <- data.frame(lon = rnorm(10, mean = -114.4069597, sd = 0.0001),
                lat = rnorm(10, mean = 43.660648, sd = 0.0002) )

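I'm guessing the rationale for your double loop is to avoid computing each distance twice. The base dist function behaves the same way: it returns only the lower triangle and does not compute the upper one. The approach below mimics that behavior.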

nr <- nrow(y)
out <- sapply(seq_len(nr), function(i) {
  if (i == nr) return(c(rep(NA_real_, i - 1), 0))
  c(rep(NA_real_, i - 1), 0,
    geosphere::distHaversine(y[i,,drop = FALSE],
                             y[(i+1):nr,,drop = FALSE]))
})
out
#         [,1]   [,2]  [,3]  [,4]  [,5]  [,6]  [,7]   [,8]  [,9] [,10]
#  [1,]  0.000     NA    NA    NA    NA    NA    NA     NA    NA    NA
#  [2,] 15.285  0.000    NA    NA    NA    NA    NA     NA    NA    NA
#  [3,] 26.943 32.620  0.00    NA    NA    NA    NA     NA    NA    NA
#  [4,] 32.500 46.234 26.20  0.00    NA    NA    NA     NA    NA    NA
#  [5,] 31.085 17.949 50.25 63.39  0.00    NA    NA     NA    NA    NA
#  [6,] 61.315 73.312 44.29 30.08 91.15  0.00    NA     NA    NA    NA
#  [7,] 16.503  4.798 29.18 45.20 21.10 71.17  0.00     NA    NA    NA
#  [8,] 10.014 21.336 17.54 25.00 38.90 52.34 20.26  0.000    NA    NA
#  [9,] 26.722 14.509 31.46 52.13 23.87 75.49 10.71 28.178  0.00    NA
# [10,]  6.114 12.508 23.04 33.73 30.06 61.12 12.05  8.864 21.43     0

An arbitrary spot check:

geosphere::distHaversine(y[8,], y[2,])
# [1] 21.33617

This is faster than your code because it capitalizes on vectorized computation: geosphere::distHaversine can compute several distances in a single call:

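  • between successive points (if the second argument is missing);
  • between each point in p1 and the corresponding point in p2 (p1 and p2 have the same number of rows); or
  • as I did above, between one point and many others.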
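To make the idea of a vectorized distance call concrete, here is a minimal base-R sketch. The helper haversine_m is hypothetical and is not geosphere's implementation; it assumes points are given as (lon, lat) in degrees and uses a sphere of radius 6378137 m. It only illustrates how one call can cover the "one point against many" and "corresponding rows" cases through R's vector recycling.

```r
# Hypothetical helper (not part of geosphere): great-circle distance in metres.
# p1 and p2 are matrices or data frames with lon in column 1 and lat in column 2.
haversine_m <- function(p1, p2, r = 6378137) {
  p1 <- as.matrix(p1)
  p2 <- as.matrix(p2)
  to_rad <- pi / 180
  lon1 <- p1[, 1] * to_rad; lat1 <- p1[, 2] * to_rad
  lon2 <- p2[, 1] * to_rad; lat2 <- p2[, 2] * to_rad
  # R recycles the shorter vectors, so a single point against many rows works as-is
  a <- sin((lat2 - lat1) / 2)^2 +
    cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Made-up points: the origin, one degree north, one degree east
pts <- data.frame(lon = c(0, 0, 1), lat = c(0, 1, 0))

# One point against many: one call replaces an inner loop over j
one_to_many <- haversine_m(pts[1, , drop = FALSE], pts[2:3, , drop = FALSE])

# Corresponding rows: row k of the first argument against row k of the second
pairwise <- haversine_m(pts[1:2, ], pts[2:3, ])
```

Both entries of one_to_many are one degree of arc, roughly 111319.5 m on this sphere. The speed-up in the answer comes from the same effect: the trigonometry runs once over whole vectors instead of once per pair inside an R-level loop.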

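c(rep(NA_real_, i - 1), 0, ...) is there to ensure that the upper triangle is NA and the diagonal is 0. The first conditional (i == nr) is a cheat to ensure we get a square matrix, with the last column being all NA followed by a final 0.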
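The column-building step can be seen in isolation with a small sketch. The function fake_dist below is a hypothetical stand-in for the real distance call and returns made-up numbers; only the padding logic matches the code above. The guard for the last column matters because (i + 1):nr counts downward when i equals nr (for nr = 4 it yields 5, 4), which would index rows that should not be used.

```r
nr <- 4

# Hypothetical stand-in for the distance call: made-up values, not real distances
fake_dist <- function(i, js) abs(i - js) * 10

cols <- sapply(seq_len(nr), function(i) {
  # Last column: there are no rows below the diagonal, so skip the distance call
  if (i == nr) return(c(rep(NA_real_, i - 1), 0))
  # i - 1 NAs above the diagonal, 0 on it, distances to rows i+1..nr below it
  c(rep(NA_real_, i - 1), 0, fake_dist(i, (i + 1):nr))
})
cols
```

Every column has length nr, so sapply simplifies the result to an nr-by-nr matrix with NA above the diagonal.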

If you also need the upper triangle filled in:

out[upper.tri(out)] <- t(out)[upper.tri(out)]
out
#         [,1]   [,2]  [,3]  [,4]  [,5]  [,6]   [,7]   [,8]  [,9]  [,10]
#  [1,]  0.000 15.285 26.94 32.50 31.08 61.31 16.503 10.014 26.72  6.114
#  [2,] 15.285  0.000 32.62 46.23 17.95 73.31  4.798 21.336 14.51 12.508
#  [3,] 26.943 32.620  0.00 26.20 50.25 44.29 29.178 17.539 31.46 23.037
#  [4,] 32.500 46.234 26.20  0.00 63.39 30.08 45.201 24.996 52.13 33.730
#  [5,] 31.085 17.949 50.25 63.39  0.00 91.15 21.096 38.903 23.87 30.059
#  [6,] 61.315 73.312 44.29 30.08 91.15  0.00 71.166 52.336 75.49 61.116
#  [7,] 16.503  4.798 29.18 45.20 21.10 71.17  0.000 20.257 10.71 12.052
#  [8,] 10.014 21.336 17.54 25.00 38.90 52.34 20.257  0.000 28.18  8.864
#  [9,] 26.722 14.509 31.46 52.13 23.87 75.49 10.706 28.178  0.00 21.435
# [10,]  6.114 12.508 23.04 33.73 30.06 61.12 12.052  8.864 21.43  0.000
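The mirroring step is easier to follow on a tiny matrix. This sketch uses made-up values in a 3-by-3 lower-triangular matrix m (a hypothetical name) and applies the same upper.tri assignment, then checks that the result is symmetric.

```r
# Made-up lower-triangular matrix; matrix() fills column by column
m <- matrix(c(0,  1,  2,
              NA, 0,  3,
              NA, NA, 0), nrow = 3)

# t(m) holds the lower-triangle values in the upper positions,
# so each m[i, j] with i < j receives m[j, i]
m[upper.tri(m)] <- t(m)[upper.tri(m)]

isSymmetric(m)
```

After the assignment no NA remains and m[1, 3] equals m[3, 1], which is the property the filled-in distance matrix above relies on.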
