For example, it currently looks like this:
Sample | Col1 | Col2 | Col3 | Col4 | Col5
---|---|---|---|---|---
A | 1 | NA | 2 | 1 | 3
B | 1 | 2 | NA | 1 | 5
C | 0 | 1 | 5 | NA | 3
df1 <- df
df1[is.na(df1)] <- -Inf  # -Inf can never raise a running maximum
df1[-1] <- matrixStats::rowCummaxs(as.matrix(df1[-1])) * NA^is.na(df[-1])  # NA^TRUE is NA, NA^FALSE is 1
df1
Sample Col1 Col2 Col3 Col4 Col5
1 A 1 NA 2 2 3
2 B 1 2 NA 2 5
3 C 0 1 5 NA 5
Or even:
df1 <- df
df1[is.na(df1)] <- -Inf
df1[-1] <- matrixStats::rowCummaxs(as.matrix(df1[-1]))
is.na(df1) <- is.na(df)
df1
Sample Col1 Col2 Col3 Col4 Col5
1 A 1 NA 2 2 3
2 B 1 2 NA 2 5
3 C 0 1 5 NA 5
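To make the two `matrixStats` variants above easier to follow, here is a minimal base R sketch of the same idea on a single row (row A of the example data). The variable names `x`, `y`, `running`, `masked`, and `restored` are only illustrative, and `cummax` stands in for `matrixStats::rowCummaxs`, which does the same thing for every row of a matrix.

```r
# One row of the example data (row A), including its missing value.
x <- c(1, NA, 2, 1, 3)

# Step 1: replace NA by -Inf, which can never raise a running maximum.
y <- x
y[is.na(y)] <- -Inf
running <- cummax(y)          # 1 1 2 2 3

# Step 2a: put the NAs back by multiplying with a mask.
# NA^TRUE is NA and NA^FALSE is 1, so only the originally missing cells change.
masked <- running * NA^is.na(x)

# Step 2b: the same restoration with the `is.na<-` replacement function.
restored <- running
is.na(restored) <- is.na(x)

print(masked)    # 1 NA 2 2 3
print(restored)  # 1 NA 2 2 3
```

Both restoration steps rely on the untouched original (`x` here, `df` in the answer) to remember where the missing values were.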
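We can use `cummax` from base R: loop over the rows (`MARGIN = 1`) of the numeric columns (`df[-1]`) with `apply`, replace the non-NA elements with the cumulative maximum of those values, and assign the result back. Because `apply` returns each processed row as a column, the result is transposed with `t()` before the assignment.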
df[-1] <- t(apply(df[-1], 1, FUN = function(x) {
  i1 <- !is.na(x)          # positions of the non-missing values
  x[i1] <- cummax(x[i1])   # running maximum over those positions only
  x
}))
Output:
> df
Sample Col1 Col2 Col3 Col4 Col5
1 A 1 NA 2 2 3
2 B 1 2 NA 2 5
3 C 0 1 5 NA 5
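The reason for subsetting to the non-NA elements first can be seen on a single row. The sketch below uses row A of the example data; the names `x`, `naive`, and `keep` are only illustrative.

```r
# Row A of the example data.
x <- c(1, NA, 2, 1, 3)

# Applied directly, cummax() returns NA from the first missing value onwards.
naive <- cummax(x)
print(naive)  # 1 NA NA NA NA

# Restricting cummax() to the non-missing positions leaves the NA in place
# and carries the running maximum across it.
keep <- !is.na(x)
x[keep] <- cummax(x[keep])
print(x)      # 1 NA 2 2 3
```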
Data
df <- structure(list(Sample = c("A", "B", "C"),
  Col1 = c(1L, 1L, 0L), Col2 = c(NA, 2L, 1L), Col3 = c(2L, NA, 5L),
  Col4 = c(1L, 1L, NA), Col5 = c(3L, 5L, 3L)),
  class = "data.frame", row.names = c(NA, -3L))