Replace NA values with the next value in a column in R



I'm trying to mutate a column in a data frame using lag() as part of the calculation, without ending up with NA values. Here is an example:

df <- data.frame("Score" = as.numeric(c("20", "10", "15", "30", "15", "10")),
"Time" = c("1", "2", "1", "2", "1", "2"),
"Team" = c("A", "A", "B", "B", "C", "C"))

Next, I create a new column called Diff that holds the difference in score within each team:

library(dplyr)

df <- df %>% 
  group_by(Team) %>% 
  mutate(Diff = Score - lag(Score))

My problem is that this approach produces NA values, since the first row of each team has no previous score:

Score Time  Team   Diff
20     1     A        NA
10     2     A       -10
15     1     B        NA
30     2     B        15
15     1     C        NA
10     2     C        -5

This is the result I want to end up with:

Score Time  Team   Diff
20     1     A       -10
10     2     A       -10
15     1     B        15
30     2     B        15
15     1     C        -5
10     2     C        -5

I then tried using case_when() to replace each NA with the next value, but that didn't work either:

df %>% 
group_by(Team) %>% 
mutate(Diff = Score - lag(Score)) %>% 
mutate(Diff = case_when(
NA ~ lead(Diff)
))

So, how can I replace the NA values with the next Diff value?

Thanks a lot!

Just use fill() afterward:

library(tidyverse)
df <- data.frame("Score" = as.numeric(c("20", "10", "15", "30", "15", "10")),
"Time" = c("1", "2", "1", "2", "1", "2"),
"Team" = c("A", "A", "B", "B", "C", "C"))
df <- df %>% 
group_by(Team) %>% 
mutate(Diff = Score - lag(Score)) %>% 
fill(Diff, .direction = 'up')
df
# output
#   Score Time  Team   Diff
#   <dbl> <chr> <chr> <dbl>
#1    20 1     A       -10
#2    10 2     A       -10
#3    15 1     B        15
#4    30 2     B        15
#5    15 1     C        -5
#6    10 2     C        -5
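
As a possible alternative that keeps the case_when() idea from the question, here is a minimal sketch. The attempt in the question most likely fails because a bare NA is used as the condition, and a condition that is NA is never treated as TRUE; testing with is.na() instead and adding a fallback branch should give the intended result. coalesce() is a shorter way to express the same thing. The names result_cw and result_co are only illustrative, and the sketch assumes each team has at least two rows, so that a next value exists.

```r
library(dplyr)

df <- data.frame(
  Score = c(20, 10, 15, 30, 15, 10),
  Time  = c("1", "2", "1", "2", "1", "2"),
  Team  = c("A", "A", "B", "B", "C", "C")
)

# Variant 1: case_when() with an explicit is.na() test and a fallback branch
result_cw <- df %>%
  group_by(Team) %>%
  mutate(
    Diff = Score - lag(Score),
    Diff = case_when(
      is.na(Diff) ~ lead(Diff),  # first row of a team: take the next Diff
      TRUE        ~ Diff         # every other row: keep the computed Diff
    )
  ) %>%
  ungroup()

# Variant 2: coalesce() returns the first non-missing value at each position
result_co <- df %>%
  group_by(Team) %>%
  mutate(
    Diff = Score - lag(Score),
    Diff = coalesce(Diff, lead(Diff))
  ) %>%
  ungroup()
```

Both variants only look one row ahead, so they cover the single leading NA that lag() creates per group. fill() with .direction = 'up' is the more general choice if several consecutive NA values can occur.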
