I want to compute the mean (M), SD, Min, and Max of a given variable in my data. The sample data is:
id-visit  x
1-01      5
1-01      1
1-01      6
1-04      NA
1-04      NA
1-04      1
The sample code is:
library(dplyr)
df1 <- df %>%
  # backticks are needed because the column name contains a hyphen
  group_by(`id-visit`) %>%
  mutate(x_M = mean(x, na.rm = T), x_MIN = min(x, na.rm = T),
         x_MAX = max(x, na.rm = T), x_SD = sd(x, na.rm = T))
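When I run this code, I get the messages shown below. I believe this happens because I am asking R to return the mean for person 1-04, who has only one value; the rest are missing.
Is there a way to tell R to assign that person a mean of 1, a max of 1, a min of 1, and an SD of 0?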
Warning messages:
1: Problem with `mutate()` input `x_MIN`.
ℹ no non-missing arguments to min; returning Inf
ℹ Input `x_MIN` is `min(x, na.rm = T)`.
ℹ The error occurred in group 499: subject_id = "1-04".
2: In min(x, na.rm = T) :
no non-missing arguments to min; returning Inf
3: Problem with `mutate()` input `x_MAX`.
ℹ no non-missing arguments to max; returning -Inf
ℹ Input `x_MAX` is `max(x, na.rm = T)`.
ℹ The error occurred in group 499: subject_id = "1-04".
4: In max(x, na.rm = T) :
no non-missing arguments to max; returning -Inf
These are warning messages, not errors. You can replace the NA values with 1 or 0, depending on the column.
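library(dplyr)
df %>%
  group_by(id.visit) %>%
  mutate(x_M = mean(x, na.rm = TRUE), x_MIN = min(x, na.rm = TRUE),
         x_MAX = max(x, na.rm = TRUE), x_SD = sd(x, na.rm = TRUE),
         # fill missing mean/min/max with 1 (lambda form avoids passing extra args through across())
         across(x_M:x_MAX, ~ tidyr::replace_na(.x, 1)),
         # sd() of a single value is NA; set it to 0
         x_SD = replace(x_SD, is.na(x_SD), 0))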
#   id.visit     x   x_M x_MIN x_MAX  x_SD
#   <chr>    <int> <dbl> <dbl> <dbl> <dbl>
#1  1-01         5     4     1     6  2.65
#2  1-01         1     4     1     6  2.65
#3  1-01         6     4     1     6  2.65
#4  1-04        NA     1     1     1  0
#5  1-04        NA     1     1     1  0
#6  1-04         1     1     1     1  0
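
One caveat: the warning text ("no non-missing arguments to min; returning Inf") is what `min()` reports when a group has no non-missing values at all, and replacing NA afterwards does not touch the resulting `Inf` / `-Inf`. If that matters, one option is to guard the summary functions themselves. The following is only a minimal sketch: `stat_or()` and `sd0()` are hypothetical helpers (not part of dplyr or tidyr), the group "1-07" is an assumed all-missing group added for illustration, and returning `NA` for such a group is an assumption you may want to change.

```r
library(dplyr)

# Hypothetical helper: apply `f` to the non-missing values of `x`,
# or return `fallback` when there are none (so min()/max() never see
# an empty vector and never warn).
stat_or <- function(x, f, fallback = NA_real_) {
  x <- x[!is.na(x)]
  if (length(x) == 0) fallback else f(x)
}

# Hypothetical helper: SD that is 0 for a single observation.
sd0 <- function(x) if (length(x) < 2) 0 else sd(x)

# Sample data from the question plus an assumed all-NA group ("1-07").
df <- data.frame(
  id.visit = c("1-01", "1-01", "1-01", "1-04", "1-04", "1-04", "1-07", "1-07"),
  x        = c(5, 1, 6, NA, NA, 1, NA, NA)
)

res <- df %>%
  group_by(id.visit) %>%
  mutate(
    x_M   = stat_or(x, mean),
    x_MIN = stat_or(x, min),
    x_MAX = stat_or(x, max),
    x_SD  = stat_or(x, sd0)
  ) %>%
  ungroup()
```

With this approach, group 1-04 gets mean, min, and max equal to its single value and an SD of 0, while the all-missing group stays `NA` instead of silently becoming `Inf`, `-Inf`, or an arbitrary constant. Pass a different `fallback` if you would rather fill those groups with a fixed value.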