I have a data frame with a column of latitudes and a column of longitudes, as shown below:
test <- data.frame("Latitude" = c(45.14565, 45.14565, 45.14565, 45.14565, 33.2222,
                                  31.22122, 31.22122),
                   "Longitude" = c(-105.6666, -105.6666, -105.6666, -104.3333,
                                   -104.3333, -105.77777, -105.77777))
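I want every value to be carried out to 5 decimal places. In addition, whenever a latitude/longitude pair is identical to the pair above it, I want 0.00001 added to both the latitude and the longitude. So my data would become: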
test_updated <- data.frame("Latitude" = c(45.14565, 45.14566, 45.14567, 45.14565,
                                          33.22220, 31.22122, 31.22123),
                           "Longitude" = c(-105.66660, -105.66661, -105.66662,
                                           -104.33330, -104.33330, -105.77777, -105.77778))
Here is one way to update the Latitude and Longitude columns in test in an attempt to reproduce the OP's expected result:
options(digits = 8) # required to print all significant digits of Longitude
library(data.table)
# Within each group of identical pairs, add 0, 1, 2, ... times 0.00001
setDT(test)[, `:=`(Latitude  = Latitude  + (seq(.N) - 1) * 0.00001,
                   Longitude = Longitude + (seq(.N) - 1) * 0.00001),
            by = .(Latitude, Longitude)]
test
   Latitude  Longitude
1: 45.14565 -105.66660
2: 45.14566 -105.66659
3: 45.14567 -105.66658
4: 45.14565 -104.33330
5: 33.22220 -104.33330
6: 31.22122 -105.77777
7: 31.22123 -105.77776
For comparison, the OP's expected result:
test_updated
  Latitude  Longitude
1 45.14565 -105.66660
2 45.14566 -105.66661
3 45.14567 -105.66662
4 45.14565 -104.33330
5 33.22220 -104.33330
6 31.22122 -105.77777
7 31.22123 -105.77778
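The two differ because the OP asks for 0.00001 to be added to both the latitude and the longitude values, whereas in the OP's expected result 0.00001 has been subtracted from the negative longitude values.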
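To make the discrepancy concrete, here is a minimal sketch of the arithmetic for a single longitude taken from the example data. The variable names (lon, step, added, subtracted) are only illustrative and do not appear in the question.

```r
# One longitude value from the example data (variable names are illustrative)
lon  <- -105.6666
step <- 0.00001

added      <- lon + step  # literally "adding 0.00001" moves the value toward zero
subtracted <- lon - step  # what the OP's expected result shows for negative values

sprintf("%.5f", c(added, subtracted))
# "-105.66659" "-105.66661"
```

So for a negative coordinate, the expected result increases the magnitude rather than the value itself.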
Edit
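To reproduce the expected result, the sign of the values has to be taken into account. Unfortunately, the base R sign() function returns 0 for sign(0), which would cancel the offset for a value of exactly zero. So we use fifelse(x < 0, -1, 1) instead.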
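A small sketch of why sign() is not quite suitable as a multiplier here. It uses base ifelse() instead of data.table's fifelse() purely to stay dependency-free; the vector x and the names base_sign and direction are made up for illustration.

```r
# Illustrative values: one negative, one exactly zero, one positive
x <- c(-105.6666, 0, 45.14565)

base_sign <- sign(x)               # -1  0  1 : the zero would wipe out any offset
direction <- ifelse(x < 0, -1, 1)  # -1  1  1 : zero is treated as non-negative

base_sign * 0.00001   # the offset for x == 0 is 0
direction * 0.00001   # the offset for x == 0 is 0.00001
```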
In addition, we can borrow Henrik's splendid idea of using the rowid() function to avoid grouping.
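For readers unfamiliar with rowid(), the following is a rough base R sketch of the kind of running counter it provides, assuming a pasted string key is an acceptable way to identify a latitude/longitude pair. The names lat, lon, pair_key and occurrence are illustrative, and this is not how data.table implements it.

```r
# The example coordinates from the question
lat <- c(45.14565, 45.14565, 45.14565, 45.14565, 33.2222, 31.22122, 31.22122)
lon <- c(-105.6666, -105.6666, -105.6666, -104.3333, -104.3333, -105.77777, -105.77777)

# One key per latitude/longitude pair (assumption: string keys are good enough here)
pair_key <- paste(lat, lon)

# Running count per pair: 1 for its first occurrence, 2 for the second, and so on
occurrence <- ave(seq_along(pair_key), pair_key, FUN = seq_along)
occurrence
# 1 2 3 1 1 1 2
```

Subtracting 1 from this counter gives the multiplier for the 0.00001 offset. Note that the counter increases for every repeat of a pair anywhere in the data, not only for repeats in consecutive rows.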
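options(digits = 8) # required to print all significant digits of Longitude
library(data.table)
cols <- c("Latitude", "Longitude")
# rowidv() numbers repeated pairs 1, 2, 3, ...; the offset follows the sign of x
# (the \(x) lambda shorthand requires R >= 4.1)
setDT(test)[, (cols) := lapply(.SD, \(x) x + fifelse(x < 0, -1, 1) *
                                 (rowidv(.SD, cols) - 1) * 0.00001), .SDcols = cols]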
test
   Latitude  Longitude
1: 45.14565 -105.66660
2: 45.14566 -105.66661
3: 45.14567 -105.66662
4: 45.14565 -104.33330
5: 33.22220 -104.33330
6: 31.22122 -105.77777
7: 31.22123 -105.77778
As usual, there is no need for a loop:
library(dplyr)
test_updated = test %>%
  mutate(
    across(c(Latitude, Longitude),
           # lag() yields NA for the first row, so keep x there via missing = x
           function(x) if_else(x == lag(x), x + 0.00001, x, missing = x)
    )
  )
format(round(test_updated, 5), nsmall = 5)
  Latitude  Longitude
1 45.14566 -105.66659
2 45.14566 -105.66659
3 45.14566 -105.66659
4 45.14566 -104.33329
5 33.22221 -104.33329
6 31.22123 -105.77776
7 31.22123 -105.77776
Not sure whether I understood correctly, but maybe something like this?
n <- nrow(test)
test_updated <- data.frame(Latitude = double(n),
                           Longitude = double(n))
add <- 0.00001
test_updated[1, ] <- test[1, ]
for (i in seq_len(n)[-1]) {
  # Is this pair identical to the pair in the row above?
  if (test$Latitude[i - 1] == test$Latitude[i] && test$Longitude[i - 1] == test$Longitude[i]) {
    test_updated$Latitude[i] <- test$Latitude[i] + add
    test_updated$Longitude[i] <- test$Longitude[i] + add
  } else {
    test_updated[i, ] <- test[i, ]
  }
}