r - Replacing a loop with a more efficient solution



So my data looks like this:

test <- structure(list(value = c(0, 781, 1109, 57, 250, 541, 533, 320, 
322, 1033, 291, 2213, 1845, 618, 271, 525, 88, 1354, 217, 820, 
786, 119, 41, 316, 153, 378, 172, 615, 383, 168, 1448, 824, 85, 
224310, 1186, 1488, 244, 368, 133, 488, 118, 4505, 1411, 649, 
690, 548, 226, 393, 1042, 92, 521, 212, 1015, 380, 2944, 54376, 
1396, 429, 2725, 171, 1874, 87, 547, 488, 140, 169, 237, 1749, 
1144, 156, 843, 116, 313, 601, 679, 464, 1092, 178, 28, 57, 550, 
498, 64, 48143, 352, 4100, 232, 1936, 189, 940, 180, 1051, 2917, 
2397, 229, 802, 540, 297, 505, 1649), count = c(1L, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 2L, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, 3L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 4L, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
)), row.names = c(NA, -100L), class = c("tbl_df", "tbl", "data.frame"
))

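The value column holds some random values, and the count column is mostly filled with NA. What I need in the end is for every NA in count to take the last non-NA value above it. So the first rows should all have count == 1, and as soon as count changes to 2, the following rows should have count == 2. So far I have been using a loop: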

for (i in 1:length(test$value)) {
  # Copy the previous row's count into any NA cell
  if (isTRUE(is.na(test$count[i]))) {
    test$count[i] <- test$count[i-1]
  }
}

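However, this takes forever! Can anyone think of a more efficient way to get the same result as the loop? That would help me a lot. Thanks in advance!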

You can do this with fill from the tidyr package:

tidyr::fill(test, count)
#> # A tibble: 100 x 2
#>    value count
#>    <dbl> <int>
#>  1     0     1
#>  2   781     1
#>  3  1109     1
#>  4    57     1
#>  5   250     1
#>  6   541     1
#>  7   533     1
#>  8   320     1
#>  9   322     1
#> 10  1033     1
#> # ... with 90 more rows

You can also use na.locf() from zoo:

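library(zoo)
# Carry the last non-NA count forward; na.rm = FALSE keeps any leading NA,
# so the result always has the same length as the column
test$count <- na.locf(test$count, na.rm = FALSE)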

Output:

# A tibble: 100 x 2
   value count
   <dbl> <int>
 1     0     1
 2   781     1
 3  1109     1
 4    57     1
 5   250     1
 6   541     1
 7   533     1
 8   320     1
 9   322     1
10  1033     1
# ... with 90 more rows

We can also use

library(zoo)
transform(test, count = na.locf0(count))
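One detail worth checking when choosing between the two zoo functions: as far as I know, na.locf() drops leading NAs by default (na.rm = TRUE), while na.locf0() always returns a vector of the same length. The sketch below uses a made-up vector x with a leading NA (not the data from the question) to show the difference.

```r
library(zoo)

# Made-up example vector with a leading NA (not from the question)
x <- c(NA, 1L, NA, 2L, NA)

length(na.locf(x))                 # leading NA is dropped, so shorter than x
length(na.locf(x, na.rm = FALSE))  # same length as x
na.locf0(x)                        # same length as x, leading NA kept
```

For the data in the question this makes no difference, since count starts with 1, but it matters when assigning the result back into a column.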

Or use nafill from data.table for an efficient version

library(data.table)
setDT(test)[, count:= nafill(count, type = 'locf')]

Output:

test
#      value count
#  1:      0     1
#  2:    781     1
#  3:   1109     1
#  4:     57     1
#  5:    250     1
#  6:    541     1
#  7:    533     1
#  8:    320     1
#  9:    322     1
# 10:   1033     1
# 11:    291     1
# 12:   2213     1
# 13:   1845     1
# 14:    618     1
# ..
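If you would rather not add a package, the same "last observation carried forward" idea can be sketched in base R with cumsum() and indexing. The helper name locf_base is hypothetical (it is not from any package), and this is only a minimal sketch for plain atomic vectors.

```r
# Hypothetical helper (not from any package): last observation carried forward
locf_base <- function(x) {
  seen <- !is.na(x)
  idx <- cumsum(seen)        # 0 before the first non-NA value, then 1, 2, ...
  c(NA, x[seen])[idx + 1L]   # slot 1 is NA, so leading NAs stay NA
}

locf_base(c(1L, NA, NA, 2L, NA))
```

Applied to the question's data this would be test$count <- locf_base(test$count). It is vectorised, so it avoids the row-by-row assignment that makes the loop slow.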
