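Error message when computing descriptive statistics

I want to compute the mean (M), standard deviation (SD), minimum, and maximum of a given variable in my data. The sample data are: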

id-visit  x
1-01      5
1-01      1
1-01      6
1-04      NA
1-04      NA
1-04      1

The sample code is:

library(dplyr)

df1 <- df %>%
  group_by(`id-visit`) %>%  # backticks are needed because the column name contains a hyphen
  mutate(x_M = mean(x, na.rm = TRUE), x_MIN = min(x, na.rm = TRUE),
         x_MAX = max(x, na.rm = TRUE), x_SD = sd(x, na.rm = TRUE))

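When I run this code, I get the error message shown below. I believe this happens because I am asking R to return the mean for person 1-04, who has only one value; the rest are missing.

Is there a way to tell R to assign that person a mean of 1, a maximum of 1, a minimum of 1, and an SD of 0?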

Warning messages:
1: Problem with `mutate()` input `x_MIN`.
ℹ no non-missing arguments to min; returning Inf
ℹ Input `x_MIN` is `min(x, na.rm = T)`.
ℹ The error occurred in group 499: subject_id = "1-04". 
2: In min(x, na.rm = T) :
no non-missing arguments to min; returning Inf
3: Problem with `mutate()` input `x_MAX`.
ℹ no non-missing arguments to max; returning -Inf
ℹ Input `x_MAX` is `max(x, na.rm = T)`.
ℹ The error occurred in group 499: subject_id = "1-04". 
4: In max(x, na.rm = T) :
no non-missing arguments to max; returning -Inf
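
The messages can be reproduced in base R without dplyr. The sketch below compares what each summary function returns for a vector that is entirely `NA` with what it returns for a vector holding a single observed value. The vector names are illustrative only and do not come from the question. It suggests that the warning is raised for a group with no non-missing `x` at all, whereas a group with one observed value only ends up with an `NA` standard deviation.

```r
all_missing <- c(NA_real_, NA_real_)  # a group with no observed values
one_value   <- c(NA, NA, 1)           # like visit 1-04 in the sample data

# With na.rm = TRUE nothing is left, so min() and max() warn
# and return Inf and -Inf respectively.
suppressWarnings(min(all_missing, na.rm = TRUE))  # Inf
suppressWarnings(max(all_missing, na.rm = TRUE))  # -Inf
mean(all_missing, na.rm = TRUE)                   # NaN
sd(all_missing, na.rm = TRUE)                     # NA

# A single observed value raises no warning; only sd() is undefined.
min(one_value, na.rm = TRUE)   # 1
mean(one_value, na.rm = TRUE)  # 1
sd(one_value, na.rm = TRUE)    # NA
```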

These are warnings, not errors, so the code still runs. You can replace the NA values with 1 or 0, depending on the column.

library(dplyr)
df %>%
  group_by(id.visit) %>%
  mutate(x_M = mean(x, na.rm = TRUE), x_MIN = min(x, na.rm = TRUE),
         x_MAX = max(x, na.rm = TRUE), x_SD = sd(x, na.rm = TRUE),
         # replace missing mean/min/max with 1 and a missing SD with 0
         across(x_M:x_MAX, ~ tidyr::replace_na(.x, 1)),
         x_SD = replace(x_SD, is.na(x_SD), 0))
#  id.visit     x   x_M x_MIN x_MAX  x_SD
#  <chr>    <int> <dbl> <dbl> <dbl> <dbl>
#1 1-01         5     4     1     6  2.65
#2 1-01         1     4     1     6  2.65
#3 1-01         6     4     1     6  2.65
#4 1-04        NA     1     1     1  0   
#5 1-04        NA     1     1     1  0   
#6 1-04         1     1     1     1  0   
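
An alternative is to choose the fallback value before calling the summary function, so that `min()` and `max()` never see an empty vector and no warning is raised. This is only a minimal sketch. The helper `safe_stat()` and the column name `id_visit` are assumptions made for illustration and are not part of the original code. The fallback values 1 and 0 are the ones requested in the question.

```r
library(dplyr)

# Sample data from the question; `id_visit` is an assumed syntactic column name.
df <- data.frame(
  id_visit = c("1-01", "1-01", "1-01", "1-04", "1-04", "1-04"),
  x = c(5, 1, 6, NA, NA, 1)
)

# Hypothetical helper: apply `f` to the non-missing values, and return
# `fallback` when there are none or when the result is NA
# (for example, the SD of a single value).
safe_stat <- function(x, f, fallback) {
  x <- x[!is.na(x)]
  if (length(x) == 0) return(fallback)
  out <- f(x)
  if (is.na(out)) fallback else out
}

df1 <- df %>%
  group_by(id_visit) %>%
  mutate(
    x_M   = safe_stat(x, mean, 1),
    x_MIN = safe_stat(x, min, 1),
    x_MAX = safe_stat(x, max, 1),
    x_SD  = safe_stat(x, sd, 0)
  ) %>%
  ungroup()
```

Whether 1 is a sensible fallback for a group with no observed values at all depends on the analysis; passing `NA_real_` as the fallback keeps such groups missing while still avoiding the warnings.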
